AP-Loss for Accurate One-Stage Object Detection

Cited by: 54
Authors
Chen, Kean [1]
Lin, Weiyao [1]
Li, Jianguo [2]
See, John [3]
Wang, Ji [4]
Zou, Junni [1]
Affiliations
[1] Shanghai Jiao Tong Univ, Sch Elect Informat & Elect Engn, Shanghai 200240, Peoples R China
[2] Intel Labs, Beijing 100080, Peoples R China
[3] Multimedia Univ, Fac Comp & Informat, Cyberjaya 63100, Selangor, Malaysia
[4] Tencent YouTu Lab, Shanghai 200233, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Detectors; Task analysis; Measurement; Optimization; Object detection; Training; Proposals; Computer vision; object detection; machine learning; ranking loss;
DOI
10.1109/TPAMI.2020.2991457
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
One-stage object detectors are trained by optimizing a classification loss and a localization loss simultaneously, with the former suffering heavily from the extreme foreground-background class imbalance caused by the large number of anchors. This paper alleviates the issue by proposing a novel framework that replaces the classification task in one-stage detectors with a ranking task and adopts the average-precision loss (AP-loss) for the ranking problem. Because the AP-loss is non-differentiable and non-convex, it cannot be optimized directly. We therefore develop a novel optimization algorithm that seamlessly combines the error-driven update scheme of perceptron learning with the backpropagation algorithm of deep networks. We provide in-depth analyses of the good convergence property and computational complexity of the proposed algorithm, both theoretically and empirically. Experimental results demonstrate notable improvement in addressing the imbalance issue in object detection over existing AP-based optimization algorithms. One-stage detectors based on the AP-loss achieve improved state-of-the-art performance over detectors using classification losses on various standard benchmarks. The proposed framework is also highly versatile in accommodating different network architectures. Code is available at https://github.com/cccorn/AP-loss.
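As a rough illustration of the ranking view described in the abstract, the sketch below computes 1 - AP for one list of anchor scores. It is not taken from the authors' repository: the function name `ap_loss`, the strict tie-breaking, and the zero loss returned when there are no positives are all assumptions made for this example, and it only evaluates the loss, without the error-driven optimization the paper proposes.

```python
from typing import Sequence


def ap_loss(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Return 1 - AP for one list of anchor scores (hypothetical helper).

    labels[i] is 1 for a foreground anchor and 0 for a background anchor.
    Ties are resolved with a strict '>' comparison, an assumption of this sketch.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")

    positives = [i for i, y in enumerate(labels) if y == 1]
    if not positives:
        return 0.0  # assumption: nothing to rank when there are no positives

    precision_sum = 0.0
    for i in positives:
        # Rank of anchor i among all anchors (1 = highest score).
        rank_all = 1 + sum(
            1 for j, s in enumerate(scores) if j != i and s > scores[i]
        )
        # Rank of anchor i among the positive anchors only.
        rank_pos = 1 + sum(
            1 for j in positives if j != i and scores[j] > scores[i]
        )
        # Precision at the position of anchor i.
        precision_sum += rank_pos / rank_all

    return 1.0 - precision_sum / len(positives)


if __name__ == "__main__":
    scores = [0.9, 0.8, 0.7, 0.6]
    labels = [1, 0, 1, 0]
    print(ap_loss(scores, labels))
    # Appending many background anchors that score below every positive
    # leaves the value unchanged.
    print(ap_loss(scores + [0.1] * 1000, labels + [0] * 1000))
```

In this sketch, only background anchors ranked above some foreground anchor contribute to the loss, so easy negatives with low scores have no effect however many there are. This is one way to read the abstract's claim that a ranking objective is less sensitive to foreground-background imbalance than a per-anchor classification loss.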
Pages: 3782-3798
Number of pages: 17