Regionlets for Generic Object Detection

Cited by: 78

Authors:

Wang, Xiaoyu [1]

Yang, Ming [2]

Zhu, Shenghuo [3]

Lin, Yuanqing [1]

Affiliations:

[1] NEC Labs Amer, Dept Media Analyt, Cupertino, CA 95014 USA

[2] Facebook Inc, AI Res, Menlo Pk, CA USA

[3] Alibaba Grp, Hangzhou, Peoples R China

Source:

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE | 2015 / Vol. 37 / No. 10

Keywords:

Object detection; regionlet; boosting; object proposals; selective search; deep convolutional neural network; RECOGNITION; HISTOGRAMS; GRADIENTS;

DOI:

10.1109/TPAMI.2015.2389830

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Generic object detection must cope, with tractable computation, with different degrees of variation caused by viewpoint changes or deformations in distinct object classes. This demands descriptive and flexible object representations that can be evaluated efficiently at many locations. We propose to model an object class with a cascaded boosting classifier that integrates various types of features from competing local regions, each of which may consist of a group of subregions called regionlets. A regionlet is a base feature extraction region defined proportionally to a detection window at an arbitrary resolution (i.e., size and aspect ratio). These regionlets are organized in small groups with stable relative positions, so that they can delineate fine-grained spatial layouts inside objects. Within each group, their features are aggregated into a one-dimensional feature so that the representation tolerates deformations. The most discriminative regionlets for each object class are selected through a boosting learning procedure. Our regionlet approach achieves very competitive performance on popular multi-class detection benchmark datasets with a single method, without any context. It achieves a detection mean average precision of 41.7 percent on the PASCAL VOC 2007 dataset, and 39.7 percent on the VOC 2010 dataset, for 20 object categories. We further develop support pixel integral images to efficiently augment regionlet features with the responses learned by deep convolutional neural networks. Our regionlet-based method won second place in the ImageNet Large Scale Visual Object Recognition Challenge (ILSVRC 2013).
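The two central ideas in the abstract, regionlets defined proportionally to the detection window and per-group aggregation into a single value, can be illustrated with a minimal sketch. Everything named here (`Regionlet`, `to_pixels`, `group_feature`, the toy area extractor) is hypothetical and not taken from the paper, and the use of `max` as the aggregation operator is an assumption, since the abstract does not name the operator.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Tuple

# A box in pixel coordinates: (left, top, right, bottom).
Box = Tuple[float, float, float, float]


@dataclass(frozen=True)
class Regionlet:
    """Hypothetical regionlet: corners stored as fractions of the window."""
    left: float
    top: float
    right: float
    bottom: float

    def to_pixels(self, window: Box) -> Box:
        """Scale the normalized corners to a window of any size or aspect ratio."""
        wl, wt, wr, wb = window
        width, height = wr - wl, wb - wt
        return (wl + self.left * width, wt + self.top * height,
                wl + self.right * width, wt + self.bottom * height)


def group_feature(regionlets: Iterable[Regionlet],
                  window: Box,
                  extract: Callable[[Box], float],
                  pool: Callable[[Iterable[float]], float] = max) -> float:
    """Aggregate one scalar per regionlet into a single value for the group.

    `pool=max` is an assumption made for illustration only.
    """
    return pool(extract(r.to_pixels(window)) for r in regionlets)


def box_area(box: Box) -> float:
    """Toy stand-in for a real feature extractor (e.g. one histogram bin)."""
    left, top, right, bottom = box
    return (right - left) * (bottom - top)


group = [Regionlet(0.25, 0.25, 0.5, 0.5), Regionlet(0.5, 0.25, 1.0, 0.5)]
window = (0.0, 0.0, 100.0, 200.0)
feature = group_feature(group, window, box_area)
```

Because the coordinates are stored as fractions, the same group can be evaluated on a window of any resolution without redefining it, and pooling over the group yields one value even if the discriminative evidence shifts from one regionlet to a neighboring one.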


Pages: 2071-2084

Page count: 14

References: 46 in total
