Probabilistic recognition of human faces from video

Cited by: 150
Authors
Zhou, SH [1]
Krueger, V
Chellappa, R
Affiliations
[1] Univ Maryland, Ctr Automat Res, Dept Elect & Comp Engn, College Pk, MD 20742 USA
[2] Aalborg Univ, Aalborg, Denmark
Keywords
face recognition; still-to-video; video-to-video; time series state space model; sequential importance sampling; exemplar-based learning
DOI
10.1016/S1077-3142(03)00080-8
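Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 [Pattern recognition and intelligent systems]; 0812 [Computer science and technology]; 0835 [Software engineering]; 1405 [Intelligent science and technology];
Abstract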
Recognition of human faces using a gallery of still or video images and a probe set of videos is systematically investigated using a probabilistic framework. In still-to-video recognition, where the gallery consists of still images, a time series state space model is proposed to fuse temporal information in a probe video, which simultaneously characterizes the kinematics and identity using a motion vector and an identity variable, respectively. The joint posterior distribution of the motion vector and the identity variable is estimated at each time instant and then propagated to the next time instant. Marginalization over the motion vector yields a robust estimate of the posterior distribution of the identity variable. A computationally efficient sequential importance sampling (SIS) algorithm is developed to estimate the posterior distribution. Empirical results demonstrate that, due to the propagation of the identity variable over time, a degeneracy in posterior probability of the identity variable is achieved to give improved recognition. The gallery is generalized to videos in order to realize video-to-video recognition. An exemplar-based learning strategy is adopted to automatically select video representatives from the gallery, serving as mixture centers in an updated likelihood measure. The SIS algorithm is applied to approximate the posterior distribution of the motion vector, the identity variable, and the exemplar index, whose marginal distribution of the identity variable produces the recognition result. The model formulation is very general and it allows a variety of image representations and transformations. Experimental results using images/videos collected at UMD, NIST/USF, and CMU with pose/illumination variations illustrate the effectiveness of this approach for both still-to-video and video-to-video scenarios with appropriate model choices. (C) 2003 Elsevier Inc. All rights reserved.
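The still-to-video procedure summarized in the abstract (propagate the joint posterior of motion and identity, then marginalize over motion) can be illustrated with a small sampling sketch. This is not the authors' implementation: the one-dimensional offset used as "motion", the random-walk motion model, the Gaussian likelihood, the rule that identity is copied unchanged between frames, the resampling at every frame, and all names (`sis_identity_posterior`, `simulate_probe`, `gallery`) are assumptions made for illustration only.

```python
import math
import random


def simulate_probe(gallery, true_id, num_frames=10, seed=1):
    """Toy probe video: each frame is a gallery template plus a drifting offset."""
    rng = random.Random(seed)
    theta = 0.3  # assumed initial offset standing in for the motion vector
    frames = []
    for _ in range(num_frames):
        theta += rng.gauss(0.0, 0.05)
        frames.append([g + theta + rng.gauss(0.0, 0.1) for g in gallery[true_id]])
    return frames


def sis_identity_posterior(frames, gallery, num_particles=500,
                           motion_std=0.05, obs_std=0.2, seed=0):
    """Return, per frame, the identity posterior marginalized over motion."""
    rng = random.Random(seed)
    num_ids = len(gallery)
    # Initial samples: broad prior on the offset, uniform prior on identity.
    particles = [(rng.uniform(-1.0, 1.0), rng.randrange(num_ids))
                 for _ in range(num_particles)]
    history = []
    for z in frames:
        # Propagate: random-walk motion (assumed); identity is copied unchanged.
        particles = [(theta + rng.gauss(0.0, motion_std), n)
                     for theta, n in particles]
        # Weight each sample by an assumed Gaussian likelihood of the frame.
        log_w = []
        for theta, n in particles:
            err = sum((zi - theta - gi) ** 2 for zi, gi in zip(z, gallery[n]))
            log_w.append(-err / (2.0 * obs_std ** 2))
        top = max(log_w)  # subtract the maximum for numerical stability
        w = [math.exp(lw - top) for lw in log_w]
        total = sum(w)
        w = [wi / total for wi in w]
        # Marginalize over motion: add up the weights that share an identity.
        posterior = [0.0] * num_ids
        for (_, n), wi in zip(particles, w):
            posterior[n] += wi
        history.append(posterior)
        # Resample so the samples follow the current joint posterior.
        particles = rng.choices(particles, weights=w, k=num_particles)
    return history


# Three hypothetical gallery templates (stand-ins for still face images).
gallery = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [1.0, 1.0, 0.0]]
frames = simulate_probe(gallery, true_id=2)
history = sis_identity_posterior(frames, gallery)
print(history[-1])
```

Because identity is never perturbed during propagation, resampling tends to concentrate the samples on a single identity as frames accumulate, which loosely mirrors the degeneracy of the identity posterior that the abstract describes.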
Pages: 214-245
Page count: 32