Robust visual tracking by integrating multiple cues based on co-inference learning

Cited by: 112
Authors
Wu, Y
Huang, TS
Affiliations
[1] Northwestern Univ, Dept Elect & Comp Engn, Evanston, IL 60208 USA
[2] Univ Illinois, Beckman Inst, Urbana, IL 61801 USA
Funding
U.S. National Science Foundation
Keywords
visual tracking; sequential Monte Carlo; importance sampling; co-inference; factorized graphical model; variational analysis
DOI
10.1023/B:VISI.0000016147.97880.cd
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Visual tracking can be treated as a parameter estimation problem that infers target states based on image observations from video sequences. A richer target representation may incur better chances of successful tracking in cluttered and dynamic environments, and thus enhance the robustness. Richer representations can be constructed by either specifying a detailed model of a single cue or combining a set of rough models of multiple cues. Both approaches increase the dimensionality of the state space, which results in a dramatic increase of computation. To investigate the integration of rough models from multiple cues and to explore computationally efficient algorithms, this paper formulates the problem of multiple cue integration and tracking in a probabilistic framework based on a factorized graphical model. Structured variational analysis of such a graphical model factorizes different modalities and suggests a co-inference process among these modalities. Based on the importance sampling technique, a sequential Monte Carlo algorithm is proposed to provide an efficient simulation and approximation of the co-inferencing of multiple cues. This algorithm runs in real-time at around 30 Hz. Our extensive experiments show that the proposed algorithm performs robustly in a large variety of tracking scenarios. The approach presented in this paper has the potential to solve other problems including sensor fusion problems.
Pages: 55-71
Page count: 17