Soft-NMS - Improving Object Detection With One Line of Code

Cited by: 1322
Authors
Bodla, Navaneeth [1]
Singh, Bharat [1]
Chellappa, Rama [1]
Davis, Larry S. [1]
Affiliations
[1] Univ Maryland, Ctr Automat Res, College Pk, MD 20742 USA
Source
2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV) | 2017
Keywords
SCALE; EDGE;
DOI
10.1109/ICCV.2017.593
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Non-maximum suppression is an integral part of the object detection pipeline. First, it sorts all detection boxes on the basis of their scores. The detection box M with the maximum score is selected and all other detection boxes with a significant overlap (using a predefined threshold) with M are suppressed. This process is recursively applied on the remaining boxes. As per the design of the algorithm, if an object lies within the predefined overlap threshold, it leads to a miss. To this end, we propose Soft-NMS, an algorithm which decays the detection scores of all other objects as a continuous function of their overlap with M. Hence, no object is eliminated in this process. Soft-NMS obtains consistent improvements for the COCO-style mAP metric on standard datasets like PASCAL VOC 2007 (1.7% for both R-FCN and Faster-RCNN) and MS-COCO (1.3% for R-FCN and 1.1% for Faster-RCNN) by just changing the NMS algorithm without any additional hyper-parameters. Using Deformable-RFCN, Soft-NMS improves the state of the art in object detection from 39.8% to 40.9% with a single model. Further, the computational complexity of Soft-NMS is the same as traditional NMS and hence it can be efficiently implemented. Since Soft-NMS does not require any extra training and is simple to implement, it can be easily integrated into any object detection pipeline. Code for Soft-NMS is publicly available on GitHub at http://bit.ly/2nJLNMu.
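The procedure described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration and not the authors' released code: the abstract only says that scores decay as "a continuous function" of the overlap with M, so the Gaussian decay exp(-IoU^2 / sigma), the default `sigma=0.5`, the `score_threshold` cutoff, and the names `soft_nms` and `iou_one_to_many` are all assumptions made for this example.

```python
import numpy as np


def iou_one_to_many(box, boxes):
    """IoU of one box against an (N, 4) array; boxes are [x1, y1, x2, y2].

    Assumes every box has positive area.
    """
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Return (indices, decayed scores) in the order boxes are selected.

    sigma and score_threshold are illustrative assumptions, not values
    taken from the abstract.
    """
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float).copy()
    remaining = np.arange(len(scores))
    keep, kept_scores = [], []
    while remaining.size > 0:
        # Select the box M with the current maximum score.
        best = np.argmax(scores[remaining])
        m = remaining[best]
        keep.append(int(m))
        kept_scores.append(float(scores[m]))
        remaining = np.delete(remaining, best)
        if remaining.size == 0:
            break
        # Decay the other scores as a continuous function of overlap with M,
        # instead of discarding boxes above a hard overlap threshold.
        overlaps = iou_one_to_many(boxes[m], boxes[remaining])
        scores[remaining] *= np.exp(-(overlaps ** 2) / sigma)
        # Drop only boxes whose score has become negligible.
        remaining = remaining[scores[remaining] > score_threshold]
    return keep, kept_scores
```

For example, with boxes `[0, 0, 10, 10]`, `[1, 1, 11, 11]`, and `[50, 50, 60, 60]` scored 0.9, 0.8, and 0.7, the first two overlap with an IoU of about 0.68. Traditional NMS with a 0.5 threshold would discard the second box, whereas this sketch keeps it with its score reduced to roughly 0.32, so it is ranked after the non-overlapping third box.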
Pages: 5562-5570
Page count: 9