Discriminative Appearance Models for Pictorial Structures

Cited by: 45

Authors:

Andriluka, Mykhaylo [1]

Roth, Stefan [2]

Schiele, Bernt [1]

Affiliations:

[1] MPI Informat, Stuhlsatzenhausweg 85, D-66123 Saarbrucken, Germany

[2] Tech Univ Darmstadt, Dept Comp Sci, D-64283 Darmstadt, Germany

Source:

INTERNATIONAL JOURNAL OF COMPUTER VISION | 2012 / Vol. 99 / Issue 03

Keywords:

Object detection; People detection; Articulated pose estimation; Pictorial structures; Discriminative models;

DOI:

10.1007/s11263-011-0498-z

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this paper we consider people detection and articulated pose estimation, two closely related and challenging problems in computer vision. Conceptually, both of these problems can be addressed within the pictorial structures framework (Felzenszwalb and Huttenlocher in Int. J. Comput. Vis. 61(1):55-79, 2005; Fischler and Elschlager in IEEE Trans. Comput. C-22(1):67-92, 1973), even though previous approaches have not shown such generality. A principal difficulty for such a general approach is to model the appearance of body parts. The model has to be discriminative enough to enable reliable detection in cluttered scenes and general enough to capture highly variable appearance. Therefore, as the first important component of our approach, we propose a discriminative appearance model based on densely sampled local descriptors and AdaBoost classifiers. Secondly, we interpret the normalized margin of each classifier as likelihood in a generative model and compute marginal posteriors for each part using belief propagation. Thirdly, non-Gaussian relationships between parts are represented as Gaussians in the coordinate system of the joint between the parts. Additionally, in order to cope with shortcomings of tree-based pictorial structures models, we augment our model with additional repulsive factors in order to discourage overcounting of image evidence. We demonstrate that the combination of these components within the pictorial structures framework results in a generic model that yields state-of-the-art performance for several datasets on a variety of tasks: people detection, upper body pose estimation, and full body pose estimation.
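The second component named in the abstract, reading a boosted classifier's normalized margin as a likelihood, can be illustrated with a minimal sketch. It assumes weak classifiers that vote +1 or -1 with positive AdaBoost weights, and it clamps non-positive margins to a small floor so the value can serve as an unnormalized likelihood. The function names, the clamping step, and the floor value are hypothetical choices for illustration and are not taken from this record.

```python
from typing import Sequence


def normalized_margin(alphas: Sequence[float], votes: Sequence[int]) -> float:
    """Weighted sum of weak-classifier votes divided by the total weight.

    With votes in {-1, +1} and positive weights, the result lies in [-1, 1].
    """
    if len(alphas) != len(votes):
        raise ValueError("alphas and votes must have the same length")
    total = sum(alphas)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(a * v for a, v in zip(alphas, votes)) / total


def pseudo_likelihood(alphas: Sequence[float], votes: Sequence[int],
                      floor: float = 1e-4) -> float:
    """Assumed mapping: clamp the normalized margin to a small positive floor."""
    return max(normalized_margin(alphas, votes), floor)


if __name__ == "__main__":
    # Three weak classifiers; the heaviest one and one light one vote +1.
    print(pseudo_likelihood([1.0, 1.0, 2.0], [1, -1, 1]))
```

In a pictorial structures model, a value like this would be evaluated per part at every candidate location, and those per-part maps are what belief propagation would then combine with the pairwise terms between parts.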

Citation

Pages: 259-280

Page count: 22

References: 56 in total

[11] Eichner Marcin, 2009, BMVC

[12] Everingham Mark, 2007, The PASCAL visual object classes challenge 2008 (VOC2008) results

[13] Felzenszwalb P. F., 2008, IEEE C COMP VIS PAT

[14] Felzenszwalb, Pedro F.; Girshick, Ross B.; McAllester, David; Ramanan, Deva. Object Detection with Discriminatively Trained Part-Based Models [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2010, 32 (09): 1627-1645

[15] Felzenszwalb, PF; Huttenlocher, DP. Pictorial structures for object recognition [J]. INTERNATIONAL JOURNAL OF COMPUTER VISION, 2005, 61 (01): 55-79

[16] Ferrari V., 2009, IEEE C COMP VIS PAT

[17] Ferrari, Vittorio; Marin-Jimenez, Manuel; Zisserman, Andrew. Progressive search space reduction for human pose estimation [J]. 2008 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, VOLS 1-12, 2008

[18] Ferrari V, 2009, LECT NOTES COMPUT SC, V5604, P128, DOI 10.1007/978-3-642-03061-1_7

[19] Fischler, MA; Elschlager, RA. Representation and matching of pictorial structures [J]. IEEE TRANSACTIONS ON COMPUTERS, 1973, C-22 (01): 67-92

[20] Freund, Y; Schapire, RE. A decision-theoretic generalization of on-line learning and an application to boosting [J]. JOURNAL OF COMPUTER AND SYSTEM SCIENCES, 1997, 55 (01): 119-139
