AP-Loss for Accurate One-Stage Object Detection

Cited by: 54

Authors:

Chen, Kean [1]

Lin, Weiyao [1]

Li, Jianguo [2]

See, John [3]

Wang, Ji [4]

Zou, Junni [1]

Affiliations:

[1] Shanghai Jiao Tong Univ, Sch Elect Informat & Elect Engn, Shanghai 200240, Peoples R China

[2] Intel Labs, Beijing 100080, Peoples R China

[3] Multimedia Univ, Fac Comp & Informat, Cyberjaya 63100, Selangor, Malaysia

[4] Tencent YouTu Lab, Shanghai 200233, Peoples R China

Source:

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE | 2021 / Vol. 43 / No. 11

Funding:

National Natural Science Foundation of China;

Keywords:

Detectors; Task analysis; Measurement; Optimization; Object detection; Training; Proposals; Computer vision; object detection; machine learning; ranking loss;

DOI:

10.1109/TPAMI.2020.2991457

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

One-stage object detectors are trained by optimizing classification-loss and localization-loss simultaneously, with the former suffering much from extreme foreground-background class imbalance issue due to the large number of anchors. This paper alleviates this issue by proposing a novel framework to replace the classification task in one-stage detectors with a ranking task, and adopting the average-precision loss (AP-loss) for the ranking problem. Due to its non-differentiability and non-convexity, the AP-loss cannot be optimized directly. For this purpose, we develop a novel optimization algorithm, which seamlessly combines the error-driven update scheme in perceptron learning and backpropagation algorithm in deep networks. We provide in-depth analyses on the good convergence property and computational complexity of the proposed algorithm, both theoretically and empirically. Experimental results demonstrate notable improvement in addressing the imbalance issue in object detection over existing AP-based optimization algorithms. An improved state-of-the-art performance is achieved in one-stage detectors based on AP-loss over detectors using classification-losses on various standard benchmarks. The proposed framework is also highly versatile in accommodating different network architectures. Code is available at https://github.com/cccorn/AP-loss.
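To make the ranking formulation in the abstract concrete, the following is a minimal sketch of an AP-based loss, assuming the usual definition of average precision over a single ranked list (loss = 1 - AP). It is not the authors' implementation, which is in the repository linked above; the function name `ap_loss` and its arguments are hypothetical and chosen only for illustration. The sketch computes the loss value only and does not cover the error-driven optimization algorithm the paper proposes.

```python
def ap_loss(scores, labels):
    """Return 1 - AP for one list of anchor scores.

    scores: predicted scores, one per anchor.
    labels: 1 for foreground (positive), 0 for background (negative).
    Hypothetical helper for illustration; not taken from the paper's code.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    if not positives:
        # Assumption: with no positives there is nothing to rank.
        return 0.0

    precision_sum = 0.0
    for s_i in positives:
        # Rank of this positive among all anchors (ties count against it).
        rank_all = sum(1 for s in scores if s >= s_i)
        # Rank of this positive among the positives only.
        rank_pos = sum(1 for s in positives if s >= s_i)
        precision_sum += rank_pos / rank_all

    return 1.0 - precision_sum / len(positives)


# One background anchor is ranked above the second foreground anchor.
scores = [0.9, 0.8, 0.7, 0.6]
labels = [1, 0, 1, 0]
base = ap_loss(scores, labels)

# Adding many low-scored background anchors leaves the value unchanged.
padded = ap_loss(scores + [0.1] * 1000, labels + [0] * 1000)
```

In this sketch, background anchors that score below every foreground anchor do not affect the result, which is one way to see why a ranking objective is less sensitive to foreground-background imbalance than a per-anchor classification loss.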


Pages: 3782-3798

Page count: 17

