Focal and efficient IOU loss for accurate bounding box regression

Cited by: 822
Authors
Zhang, Yi-Fan [1, 2, 3]
Ren, Weiqiang [4]
Zhang, Zhang [1, 2, 3]
Jia, Zhen [1, 2]
Wang, Liang [1, 2, 3]
Tan, Tieniu [1, 2, 3]
Affiliations
[1] Chinese Acad Sci CASIA, Inst Automat, CRIPAC, Beijing, Peoples R China
[2] Chinese Acad Sci CASIA, Inst Automat, NLPR, Beijing, Peoples R China
[3] Univ Chinese Acad Sci, Beijing, Peoples R China
[4] Horizon Robot, Beijing, Peoples R China
Keywords
Object detection; Loss function design; Hard sample mining
DOI
10.1016/j.neucom.2022.07.042
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In object detection, bounding box regression (BBR) is a crucial step that determines the object localization performance. However, we find that most previous loss functions for BBR have two main drawbacks: (i) Both ℓn-norm and IOU-based loss functions are inefficient to depict the objective of BBR, which leads to slow convergence and inaccurate regression results. (ii) Most of the loss functions ignore the imbalance problem in BBR that the large number of anchor boxes which have small overlaps with the target boxes contribute most to the optimization of BBR. To mitigate the adverse effects caused thereby, we perform thorough studies to exploit the potential of BBR losses in this paper. Firstly, an Efficient Intersection over Union (EIOU) loss is proposed, which explicitly measures the discrepancies of three geometric factors in BBR, i.e., the overlap area, the central point and the side length. After that, we state the Effective Example Mining (EEM) problem and propose a regression version of focal loss to make the regression process focus on high-quality anchor boxes. Finally, the above two parts are combined to obtain a new loss function, namely Focal-EIOU loss. Extensive experiments on both synthetic and real datasets are performed. Notable superiorities on both the convergence speed and the localization accuracy can be achieved over other BBR losses. © 2022 Elsevier B.V. All rights reserved.
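The abstract names three geometric factors (overlap area, central point, side length) and a focal reweighting, but gives no formula. The sketch below assumes the commonly cited formulation: one minus IOU, plus the squared center distance normalized by the squared diagonal of the smallest enclosing box, plus the squared width and height differences normalized by that enclosing box's squared width and height. The focal variant is assumed to multiply the loss by IOU raised to a power gamma, so that boxes with larger overlap receive more weight. The function name `eiou_loss`, the `(x1, y1, x2, y2)` box layout, and the `eps` stabilizer are illustrative choices and do not come from this record.

```python
def eiou_loss(pred, target, gamma=None, eps=1e-9):
    """Sketch of an EIOU-style loss for two axis-aligned boxes.

    Boxes are (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
    If gamma is given, the loss is reweighted by IOU ** gamma
    (assumed Focal-EIOU form).
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Side lengths of both boxes.
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Factor 1: overlap area, expressed through IOU.
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both inputs.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Factor 2: central point distance, normalized by the enclosing diagonal.
    dx = (px1 + px2) / 2.0 - (tx1 + tx2) / 2.0
    dy = (py1 + py2) / 2.0 - (ty1 + ty2) / 2.0
    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)

    # Factor 3: side length differences, normalized per axis.
    width_term = (pw - tw) ** 2 / (cw * cw + eps)
    height_term = (ph - th) ** 2 / (ch * ch + eps)

    loss = 1.0 - iou + center_term + width_term + height_term
    if gamma is not None:
        loss = (iou ** gamma) * loss
    return loss
```

For example, `eiou_loss((0, 0, 2, 2), (1, 1, 3, 3))` combines an IOU of 1/7 with a center term of 2/18 and zero side length terms, since both boxes have the same width and height. A training implementation would normally operate on batched tensors in a deep learning framework rather than on Python tuples.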
Pages: 146-157
Number of pages: 12
Related papers
32 in total
  • [11] Rich feature hierarchies for accurate object detection and semantic segmentation
    Girshick, Ross
    Donahue, Jeff
    Darrell, Trevor
    Malik, Jitendra
    [J]. 2014 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2014: 580 - 587
  • [12] Han Qiu, 2020, Computer Vision - ECCV 2020. 16th European Conference. Proceedings. Lecture Notes in Computer Science (LNCS 12346), P549, DOI 10.1007/978-3-030-58452-8_32
  • [13] He KM, 2017, IEEE I CONF COMP VIS, P2980, DOI [10.1109/TPAMI.2018.2844175, 10.1109/ICCV.2017.322]
  • [14] Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition
    He, Kaiming
    Zhang, Xiangyu
    Ren, Shaoqing
    Sun, Jian
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2015, 37 (09) : 1904 - 1916
  • [15] Acquisition of Localization Confidence for Accurate Object Detection
    Jiang, Borui
    Luo, Ruixuan
    Mao, Jiayuan
    Xiao, Tete
    Jiang, Yuning
    [J]. COMPUTER VISION - ECCV 2018, PT XIV, 2018, 11218 : 816 - 832
  • [16] A note on the triangle inequality for the Jaccard distance
    Kosub, Sven
    [J]. PATTERN RECOGNITION LETTERS, 2019, 120 : 36 - 38
  • [17] Event-triggered controller via adaptive output-feedback for a class of uncertain nonlinear systems
    Li, Hui
    Liu, Yungang
    Huang, Yaxin
    [J]. INTERNATIONAL JOURNAL OF CONTROL, 2021, 94 (09) : 2575 - 2583
  • [18] Li X, 2020, arXiv, DOI [arXiv:2006.04388, 10.48550/arXiv.2006.04388]
  • [19] Focal Loss for Dense Object Detection
    Lin, Tsung-Yi
    Goyal, Priya
    Girshick, Ross
    He, Kaiming
    Dollar, Piotr
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017: 2999 - 3007
  • [20] Microsoft COCO: Common Objects in Context
    Lin, Tsung-Yi
    Maire, Michael
    Belongie, Serge
    Hays, James
    Perona, Pietro
    Ramanan, Deva
    Dollar, Piotr
    Zitnick, C. Lawrence
    [J]. COMPUTER VISION - ECCV 2014, PT V, 2014, 8693 : 740 - 755