Analysis and synthesis of textured motion: Particles and waves

Cited by: 24
Authors
Wang, YZ
Zhu, SC
Affiliations
[1] Univ Calif Los Angeles, Dept Comp Sci, Los Angeles, CA 90095 USA
[2] Univ Calif Los Angeles, Dept Stat, Los Angeles, CA 90095 USA
Funding
National Science Foundation (USA)
Keywords
textured motion; generative model; texton; statistical learning; object tracking; stochastic gradient;
DOI
10.1109/TPAMI.2004.76
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Natural scenes contain a wide range of textured motion phenomena which are characterized by the movement of a large number of particle and wave elements, such as falling snow, wavy water, and dancing grass. In this paper, we present a generative model for representing these motion patterns and study a Markov chain Monte Carlo algorithm for inferring the generative representation from observed video sequences. Our generative model consists of three components. The first is a photometric model which represents an image as a linear superposition of image bases selected from a generic and overcomplete dictionary. The dictionary contains Gabor and LoG bases for point/particle elements and Fourier bases for wave elements. These bases compete to explain the input images and transfer them to a token (base) representation with an O(10^2)-fold dimension reduction. The second component is a geometric model which groups spatially adjacent tokens (bases) and their motion trajectories into a number of moving elements called "motons." A moton is a deformable template in time-space representing a moving element, such as a falling snowflake or a flying bird. The third component is a dynamic model which characterizes the motion of particles, waves, and their interactions. For example, the motion of particle objects floating in a river, such as leaves and balls, should be coupled with the motion of waves. The trajectories of these moving elements are represented by coupled Markov chains. The dynamic model also includes probabilistic representations for the birth/death (source/sink) of the motons. We adopt a stochastic gradient algorithm for learning and inference. Given an input video sequence, the algorithm iterates two steps: 1) computing the motons and their trajectories by a number of reversible Markov chain jumps, and 2) learning the parameters that govern the geometric deformations and motion dynamics.
Novel video sequences are synthesized from the learned models and, by editing the model parameters, we demonstrate the controllability of the generative model.
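The photometric model in the abstract, an image written as a linear superposition of a few bases that compete to explain it, can be illustrated with a small sparse-coding sketch. This is not the authors' method: the paper infers its representation with a Markov chain Monte Carlo algorithm over Gabor, LoG, and Fourier bases, whereas the sketch below swaps in plain greedy matching pursuit over a random unit-norm dictionary. The function name `matching_pursuit`, the dictionary `D`, and all sizes are assumptions chosen for illustration only.

```python
import numpy as np

def matching_pursuit(signal, dictionary, n_atoms):
    """Greedy sparse coding (illustrative stand-in, not the paper's MCMC).

    dictionary: (n, m) array with unit-norm columns, m > n (overcomplete).
    Returns coefficients and the unexplained residual.
    """
    residual = signal.astype(float).copy()
    coeffs = np.zeros(dictionary.shape[1])
    for _ in range(n_atoms):
        # Every base "competes": score each by its correlation with the residual.
        scores = dictionary.T @ residual
        k = int(np.argmax(np.abs(scores)))
        # The winner explains its share of the signal and is subtracted out.
        coeffs[k] += scores[k]
        residual -= scores[k] * dictionary[:, k]
    return coeffs, residual

# Hypothetical toy setup: a 64-dimensional "image" and 256 random bases.
rng = np.random.default_rng(0)
n, m = 64, 256
D = rng.standard_normal((n, m))
D /= np.linalg.norm(D, axis=0)  # normalize columns to unit length

true_coeffs = np.zeros(m)
true_coeffs[[3, 100, 200]] = [2.0, -1.5, 1.0]
x = D @ true_coeffs  # signal built from three bases

coeffs, residual = matching_pursuit(x, D, n_atoms=10)
print("bases used:", np.count_nonzero(coeffs))
print("relative residual:", np.linalg.norm(residual) / np.linalg.norm(x))
```

The signal is summarized by at most ten (index, coefficient) tokens instead of 64 raw values, which is the kind of token representation the abstract refers to, at a far smaller scale than its reported O(10^2)-fold reduction.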
Pages: 1348-1363
Page count: 16