Learning Multilayer Channel Features for Pedestrian Detection

Cited by: 95

Authors:

Cao, Jiale [1]

Pang, Yanwei [1]

Li, Xuelong [2]

Affiliations:

[1] Tianjin Univ, Sch Elect & Informat Engn, Tianjin 300072, Peoples R China

[2] Chinese Acad Sci, Xian Inst Opt & Precis Mech, Ctr OPT IMagery Anal & Learning OPTIMAL, State Key Lab Transient Opt & Photon, Xian 710119, Peoples R China

Source:

IEEE TRANSACTIONS ON IMAGE PROCESSING | 2017 / Volume 26 / Issue 07

Funding:

National Natural Science Foundation of China;

Keywords:

Pedestrian detection; multi-layer channel features (MCF); HOG plus LUV; CNN; NMS; GRADIENTS; DEEP;

DOI:

10.1109/TIP.2017.2694224

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Pedestrian detection based on the combination of a convolutional neural network (CNN) and traditional handcrafted features (i.e., HOG+LUV) has achieved great success. In general, HOG+LUV features are used to generate candidate proposals, and a CNN then classifies these proposals. Despite this success, there is still room for improvement. For example, the CNN classifies the proposals using the fully connected layer features, while the proposal scores and the features in the inner layers of the CNN are ignored. In this paper, we propose a unifying framework called multi-layer channel features (MCF) to overcome this drawback. It first integrates HOG+LUV with each layer of the CNN into multi-layer image channels. Based on the multi-layer image channels, a multi-stage cascade AdaBoost is then learned. The weak classifiers in each stage of the multi-stage cascade are learned from the image channels of the corresponding layer. Experiments are conducted on the Caltech, INRIA, ETH, TUD-Brussels, and KITTI data sets. With more abundant features, MCF achieves the state of the art on the Caltech pedestrian data set (i.e., 10.40% miss rate). Using new and accurate annotations, MCF achieves a 7.98% miss rate. As many non-pedestrian detection windows can be quickly rejected by the first few stages, it accelerates detection speed by 1.43 times. By eliminating the highly overlapped detection windows with lower scores after the first stage, it is 4.07 times faster with negligible performance loss.
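
The early-rejection behaviour described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names `cascade_score` and `stump`, the thresholds, the weights, and the toy feature vectors are all hypothetical, and simple decision stumps stand in for the learned AdaBoost weak classifiers. It only shows the general idea that each stage reads the features of its own layer, adds to a running score, and rejects a window as soon as the score falls below that stage's threshold.

```python
from typing import Callable, List, Sequence, Tuple

# A weak classifier maps one layer's feature vector to a real-valued vote.
WeakClassifier = Callable[[Sequence[float]], float]
# A stage holds the weak classifiers for one layer plus a rejection threshold.
Stage = Tuple[List[WeakClassifier], float]


def stump(feature_index: int, threshold: float, weight: float) -> WeakClassifier:
    """Hypothetical decision stump: votes +weight or -weight on one feature."""
    def classify(features: Sequence[float]) -> float:
        return weight if features[feature_index] > threshold else -weight
    return classify


def cascade_score(
    layer_features: Sequence[Sequence[float]],
    stages: Sequence[Stage],
) -> Tuple[bool, float, int]:
    """Score one window; return (accepted, score, stages_evaluated).

    layer_features[i] is the feature vector taken from layer i, which is
    the only input that stage i looks at.
    """
    score = 0.0
    for index, (weak_classifiers, reject_threshold) in enumerate(stages):
        features = layer_features[index]
        for weak in weak_classifiers:
            score += weak(features)
        # Early rejection: later (more expensive) layers are never evaluated.
        if score < reject_threshold:
            return False, score, index + 1
    return True, score, len(stages)


# Toy three-stage cascade (all values are made up for illustration).
stages = [
    ([stump(0, 0.5, 1.0)], -0.5),   # stage 1, e.g. handcrafted channels
    ([stump(1, 0.2, 0.75)], 0.0),   # stage 2, e.g. an early CNN layer
    ([stump(0, 0.7, 1.5)], 1.0),    # stage 3, e.g. a deeper CNN layer
]

background = [[0.1, 0.0], [0.9, 0.9], [0.9, 0.9]]
pedestrian = [[0.9, 0.0], [0.0, 0.6], [0.8, 0.0]]

rejected = cascade_score(background, stages)
accepted = cascade_score(pedestrian, stages)
print(rejected)
print(accepted)
```

In this toy setup the background window is discarded after the first stage, so the features of the later layers are never touched, which is the source of the speed-up the abstract attributes to early rejection.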


Pages: 3210-3220

Page count: 11
