Real-time multimedia streams such as audio and video are now integral data types in modern programming environments. Although a great deal of research has investigated effective and efficient programming support for manipulating such streams, and although the design of digital media "middleware" is fairly well understood, no widely available or commonly accepted programming model exists within the research community. We believe this lack of common practice impedes our collective progress because it prevents disparate research groups from easily leveraging each other's work. In this paper, we propose a solution to this problem that combines the best features of three existing multimedia toolkits (Berkeley's Continuous Media Toolkit, MIT's VuSystem, and the LBL/UCB MBone tools) into a fine-grained, extensible, and high-performance toolkit. We describe the convergence of these three toolkits into a common programming infrastructure and argue that the availability and acceptance of our middleware could facilitate and accelerate breakthroughs in multimedia networking.