Features versus Context: An Approach for Precise and Detailed Detection and Delineation of Faces and Facial Features

Cited by: 90
Authors
Ding, Liya [1 ]
Martinez, Aleix M. [1 ]
Affiliations
[1] Ohio State Univ, Dept Elect & Comp Engn, Dreese Lab 205, Columbus, OH 43210 USA
Funding
U.S. National Science Foundation; U.S. National Institutes of Health;
Keywords
Face detection; facial feature detection; shape extraction; subclass learning; discriminant analysis; adaptive boosting; face recognition; American Sign Language; nonmanuals; DISCRIMINANT-ANALYSIS; PEDESTRIAN DETECTION; OBJECT DETECTION; CASCADE; IMAGES; MODELS; CLASSIFICATION; RECOGNITION; VIEW;
DOI
10.1109/TPAMI.2010.28
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104; 0812; 0835; 1405;
Abstract
The appearance-based approach to face detection has seen great advances in the last several years. In this approach, we learn the image statistics describing the texture pattern (appearance) of the object class we want to detect, e.g., the face. However, this approach has had limited success in providing an accurate and detailed description of the internal facial features, i.e., eyes, brows, nose, and mouth. In general, this is due to the limited information carried by the learned statistical model. While the face template is relatively rich in texture, facial features (e.g., eyes, nose, and mouth) do not carry enough discriminative information to tell them apart from all possible background images. We resolve this problem by adding the context information of each facial feature in the design of the statistical model. In the proposed approach, the context information defines the image statistics most correlated with the surroundings of each facial component. This means that when we search for a face or facial feature, we look for those locations which most resemble the feature yet are most dissimilar to its context. This dissimilarity with the context features forces the detector to gravitate toward an accurate estimate of the position of the facial feature. Learning to discriminate between feature and context templates is difficult, however, because the context and the texture of the facial features vary widely under changing expression, pose, and illumination, and may even resemble one another. We address this problem with the use of subclass divisions. We derive two algorithms to automatically divide the training samples of each facial feature into a set of subclasses, each representing a distinct construction of the same facial component (e.g., closed versus open eyes) or its context (e.g., different hairstyles). The first algorithm is based on a discriminant analysis formulation. The second algorithm is an extension of the AdaBoost approach.
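The feature-versus-context idea described above can be sketched in a few lines. This is purely illustrative and is not the paper's learned classifier: a normalized cross-correlation against a hypothetical feature template and context template stands in for the learned statistical models, and a candidate window is scored by its resemblance to the feature minus its resemblance to the feature's context.

```python
import numpy as np

def feature_context_score(patch, feature_template, context_template):
    """Score a candidate window: high when the patch resembles the
    facial feature and is dissimilar to that feature's context.

    Illustrative stand-in only: the paper learns discriminative
    models; this sketch uses normalized cross-correlation (NCC).
    """
    def ncc(a, b):
        # normalized cross-correlation of two equally sized patches
        a = (a - a.mean()) / (a.std() + 1e-8)
        b = (b - b.mean()) / (b.std() + 1e-8)
        return float((a * b).mean())

    return ncc(patch, feature_template) - ncc(patch, context_template)
```

Scanning an image with such a score and taking its maximum pulls the detection away from context-like locations (e.g., the brow or cheek region around an eye), which is the gravitation effect the abstract describes.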
We provide extensive experimental results using still images and video sequences for a total of 3,930 images. We show that the results are almost as good as those obtained with manual detection.
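The subclass-division step can likewise be sketched. The paper's two algorithms (one based on discriminant analysis, one an AdaBoost extension) are not reproduced here; a plain k-means partition over vectorized training patches is an assumed stand-in that shows the goal: splitting one heterogeneous class (e.g., open versus closed eyes) into more homogeneous subclasses.

```python
import numpy as np

def divide_into_subclasses(samples, n_subclasses, n_iters=50):
    """Partition vectorized training samples into subclasses with
    plain k-means (an illustrative stand-in, not the paper's
    discriminant-analysis or AdaBoost-based algorithms)."""
    samples = np.asarray(samples, dtype=float)
    # deterministic init: evenly spaced samples as starting centers
    # (an illustrative choice, not the paper's initialization)
    idx = np.linspace(0, len(samples) - 1, n_subclasses).astype(int)
    centers = samples[idx].copy()
    for _ in range(n_iters):
        # assign each sample to its nearest center
        dists = np.linalg.norm(samples[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # move each center to the mean of its members
        for k in range(n_subclasses):
            members = samples[labels == k]
            if len(members):
                centers[k] = members.mean(axis=0)
    return labels, centers
```

Each resulting subclass would then receive its own feature and context models, making the feature-versus-context discrimination task easier than with a single heterogeneous class.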
Pages: 2022-2038
Number of pages: 17
References (57 in total)
[1] [Anonymous], 1998, Statistical Shape Analysis.
[2] [Anonymous], P EUR C COMP VIS WOR.
[3] [Anonymous], P IEEE C COMP VIS PA.
[4] [Anonymous], 1999, 2 INT C AUD VID BAS.
[5] [Anonymous], P IEEE C COMP VIS PA.
[6] Bartlett M.S., Littlewort G., Frank M., Lainscsek C., Fasel I., Movellan J., "Fully automatic facial action recognition in spontaneous behavior," Proc. Seventh Int'l Conf. Automatic Face and Gesture Recognition, 2006, pp. 223+.
[7] Bartlett P.L., J. Machine Learning Research, vol. 8, 2007, p. 2347.
[8] Carneiro G., 2008, P IEEE C COMP VIS PA.
[9] Cootes T.F., Taylor C.J., Cooper D.H., Graham J., "Active shape models - their training and application," Computer Vision and Image Understanding, vol. 61, no. 1, 1995, pp. 38-59.
[10] Cootes T.F., Edwards G.J., Taylor C.J., "Active appearance models," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 23, no. 6, 2001, pp. 681-685.