Soft-NMS - Improving Object Detection With One Line of Code

Cited by: 1322

Authors:

Bodla, Navaneeth [1]

Singh, Bharat [1]

Chellappa, Rama [1]

Davis, Larry S. [1]

Affiliations:

[1] Univ Maryland, Ctr Automat Res, College Pk, MD 20742 USA

Source:

2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV) | 2017

Keywords:

SCALE; EDGE;

DOI:

10.1109/ICCV.2017.593

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Non-maximum suppression is an integral part of the object detection pipeline. First, it sorts all detection boxes on the basis of their scores. The detection box M with the maximum score is selected and all other detection boxes with a significant overlap (using a predefined threshold) with M are suppressed. This process is recursively applied on the remaining boxes. As per the design of the algorithm, if an object lies within the predefined overlap threshold, it leads to a miss. To this end, we propose Soft-NMS, an algorithm which decays the detection scores of all other objects as a continuous function of their overlap with M. Hence, no object is eliminated in this process. Soft-NMS obtains consistent improvements for the COCO-style mAP metric on standard datasets like PASCAL VOC 2007 (1.7% for both R-FCN and Faster-RCNN) and MS-COCO (1.3% for R-FCN and 1.1% for Faster-RCNN) by just changing the NMS algorithm without any additional hyper-parameters. Using Deformable-RFCN, Soft-NMS improves the state of the art in object detection from 39.8% to 40.9% with a single model. Further, the computational complexity of Soft-NMS is the same as traditional NMS and hence it can be efficiently implemented. Since Soft-NMS does not require any extra training and is simple to implement, it can be easily integrated into any object detection pipeline. Code for Soft-NMS is publicly available on GitHub at http://bit.ly/2nJLNMu.
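The contrast the abstract draws between classic NMS and Soft-NMS can be illustrated with a minimal Python sketch. It is not the authors' released code. The abstract only says that scores decay as "a continuous function" of overlap, so the Gaussian decay and its `sigma` parameter are assumptions chosen for illustration, as are the function names `iou`, `hard_nms`, and `soft_nms` and the `(x1, y1, x2, y2)` box layout.

```python
import math


def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def hard_nms(boxes, scores, overlap_thresh=0.5):
    """Classic NMS: boxes overlapping the current maximum M are discarded."""
    remaining = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []
    while remaining:
        m = remaining.pop(0)  # box M with the highest remaining score
        kept.append((m, scores[m]))
        # Suppress every box whose overlap with M exceeds the threshold.
        remaining = [i for i in remaining
                     if iou(boxes[m], boxes[i]) <= overlap_thresh]
    return kept


def soft_nms(boxes, scores, sigma=0.5):
    """Soft-NMS sketch: overlapping boxes keep a decayed score instead of
    being removed. The Gaussian decay exp(-iou^2 / sigma) is an assumed
    choice of continuous decay function."""
    scores = list(scores)  # work on a copy; scores are rescored in place
    remaining = list(range(len(boxes)))
    kept = []
    while remaining:
        m = max(remaining, key=lambda i: scores[i])  # current maximum M
        remaining.remove(m)
        kept.append((m, scores[m]))
        for i in remaining:
            overlap = iou(boxes[m], boxes[i])
            scores[i] *= math.exp(-(overlap * overlap) / sigma)
    return kept
```

Both functions return `(index, score)` pairs in selection order. With two heavily overlapping boxes and one distant box, `hard_nms` drops the lower-scored overlapping box entirely, whereas `soft_nms` returns all three boxes and only lowers the score of the overlapping one, leaving it to a later score threshold to decide what counts as a detection.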

Citation

Pages: 5562-5570

Page count: 9
