Visual Object Detection with Deformable Part Models

Cited by: 22
Authors
Felzenszwalb, Pedro [1, 2]
Girshick, Ross [3]
McAllester, David [4]
Ramanan, Deva [5]
Affiliations
[1] Brown Univ, Sch Engn, Providence, RI 02912 USA
[2] Brown Univ, Dept Comp Sci, Providence, RI 02912 USA
[3] Univ Calif Berkeley, EECS, Berkeley, CA USA
[4] Toyota Technol Inst, Chicago, IL USA
[5] UC Irvine, Dept Comp Sci, Irvine, CA USA
Funding
National Science Foundation (United States)
Keywords
RECOGNITION
DOI
10.1145/2500468.2494532
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Subject classification code
0812
Abstract
We describe a state-of-the-art system for finding objects in cluttered images. Our system is based on deformable models that represent objects using local part templates and geometric constraints on the locations of parts. We reduce object detection to classification with latent variables. The latent variables introduce invariances that make it possible to detect objects with highly variable appearance. We use a generalization of support vector machines to incorporate latent information during training. This has led to a general framework for discriminative training of classifiers with latent variables. Discriminative training benefits from large training datasets. In practice we use an iterative algorithm that alternates between estimating latent values for positive examples and solving a large convex optimization problem. Practical optimization of this large convex problem can be done using active set techniques for adaptive subsampling of the training data.
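The alternation described in the abstract can be illustrated with a minimal sketch. It assumes the usual latent SVM form, in which an example x is scored by the maximum over latent values z of w · Φ(x, z), and the training objective is ½‖w‖² plus C times the sum of hinge losses. This is not the authors' implementation: the function names, the toy feature vectors, and the hyperparameters are all hypothetical, each example is reduced to an explicit list of candidate feature vectors (one per latent value), and plain subgradient descent stands in for the active-set optimizer the abstract mentions.

```python
def dot(w, phi):
    """Inner product of two equal-length lists."""
    return sum(wi * pi for wi, pi in zip(w, phi))


def best_latent(w, candidates):
    """Index of the latent value (candidate feature vector) with the highest score."""
    return max(range(len(candidates)), key=lambda z: dot(w, candidates[z]))


def score(w, candidates):
    """Latent-variable score: maximum of w . phi over all candidates."""
    return dot(w, candidates[best_latent(w, candidates)])


def train_latent_svm(positives, negatives, dim, C=1.0,
                     outer_iters=5, inner_iters=200, lr=0.01):
    """Alternate between fixing latent values for positives and
    minimizing the resulting convex objective (hypothetical sketch)."""
    w = [0.0] * dim
    for _ in range(outer_iters):
        # Step 1: fix the latent value of every positive example
        # to its best-scoring candidate under the current model.
        fixed_pos = [cands[best_latent(w, cands)] for cands in positives]

        # Step 2: with positive latents fixed, the objective
        #   0.5*||w||^2 + C * sum(hinge losses)
        # is convex in w; take subgradient steps on it.
        for _ in range(inner_iters):
            grad = list(w)  # gradient of the regularizer 0.5*||w||^2
            for phi in fixed_pos:
                if dot(w, phi) < 1.0:  # positive inside the margin
                    grad = [g - C * p for g, p in zip(grad, phi)]
            for cands in negatives:
                # Negatives keep the max over latent values,
                # which preserves convexity.
                phi = cands[best_latent(w, cands)]
                if dot(w, phi) > -1.0:  # negative inside the margin
                    grad = [g + C * p for g, p in zip(grad, phi)]
            w = [wi - lr * g for wi, g in zip(w, grad)]
    return w


# Toy data (made up): each example is a list of candidate feature vectors
# [a, b, bias]. Positives contain one candidate with a large first feature.
positives = [
    [[2.0, 0.0, 1.0], [0.0, 1.0, 1.0]],
    [[0.1, 0.9, 1.0], [1.8, 0.2, 1.0]],
]
negatives = [
    [[0.0, 1.0, 1.0], [0.2, 0.8, 1.0]],
    [[0.1, 1.2, 1.0], [0.0, 0.9, 1.0]],
]

w = train_latent_svm(positives, negatives, dim=3)
for cands in positives + negatives:
    print(best_latent(w, cands), round(score(w, cands), 2))
```

In this toy run the second positive starts with its latent value fixed to the first candidate, because all scores tie at zero, and later outer iterations are free to re-assign it. A real detector would search latent part placements over image positions and scales rather than over a short explicit list, and would cache hard negatives instead of scanning every negative on each step.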
Pages: 97-105
Page count: 9