Hierarchical Shot Detector

Cited by: 65
Authors
Cao, Jiale [1]
Pang, Yanwei [1]
Han, Jungong [2]
Li, Xuelong [3]
Affiliations
[1] Tianjin Univ, Tianjin, Peoples R China
[2] Univ Warwick, Coventry, W Midlands, England
[3] Northwestern Polytech Univ, Xi'an, Peoples R China
Source
2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019) | 2019
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation;
DOI
10.1109/ICCV.2019.00980
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The single shot detector simultaneously predicts object categories and regression offsets for the default boxes. Despite its high efficiency, this structure has two design flaws: (1) during inference, the classification result computed for the default box is improperly assigned to the regressed box, and (2) a single regression step is not enough for accurate object detection. To solve the first problem, a novel reg-offset-cls (ROC) module is proposed. It contains three hierarchical steps: box regression, prediction of the feature sampling locations, and classification of the regressed box using the features at the offset locations. To solve the second problem, a hierarchical shot detector (HSD) is proposed, which stacks two ROC modules and one feature enhanced module. The second ROC takes the regressed boxes and the feature sampling locations from the first ROC as its inputs. The feature enhanced module, inserted between the two ROCs, extracts local and non-local context. Experiments on the MS COCO and PASCAL VOC datasets demonstrate the superiority of the proposed HSD. Without bells and whistles, HSD outperforms all one-stage methods at real-time speed.
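The three ROC steps named in the abstract (regress, re-sample, classify) can be illustrated with a small sketch. This is not the authors' implementation. The box parameterization (center-size boxes with SSD-style offsets), the nearest-neighbour sampling at the regressed box center, the linear classifier, and all function names (`decode_box`, `sample_feature`, `classify`, `roc_step`) are assumptions made for illustration only; in the actual model the offsets and sampling locations are predicted by the network.

```python
import math

def decode_box(box, offset):
    """Apply (dx, dy, dw, dh) offsets to a (cx, cy, w, h) box (assumed SSD-style encoding)."""
    cx, cy, w, h = box
    dx, dy, dw, dh = offset
    return (cx + dx * w, cy + dy * h, w * math.exp(dw), h * math.exp(dh))

def sample_feature(feature_map, box):
    """Nearest-neighbour lookup at the box center; coordinates are normalized to [0, 1]."""
    rows, cols = len(feature_map), len(feature_map[0])
    col = min(max(int(round(box[0] * (cols - 1))), 0), cols - 1)
    row = min(max(int(round(box[1] * (rows - 1))), 0), rows - 1)
    return feature_map[row][col]

def classify(feature, class_weights):
    """Hypothetical linear classifier: one weight vector per class."""
    return [sum(f * w for f, w in zip(feature, weights)) for weights in class_weights]

def roc_step(feature_map, box, offset, class_weights):
    """One reg-offset-cls pass: regress the box, sample features there, then classify."""
    regressed = decode_box(box, offset)
    feature = sample_feature(feature_map, regressed)
    return regressed, classify(feature, class_weights)

# Toy 5x5 feature map whose two channels store the row and column index.
feature_map = [[[float(r), float(c)] for c in range(5)] for r in range(5)]
identity_weights = [[1.0, 0.0], [0.0, 1.0]]

default_box = (0.5, 0.5, 0.2, 0.2)
# First pass: the scores come from the regressed location, not the default box.
box1, scores1 = roc_step(feature_map, default_box, (1.0, 0.0, 0.0, 0.0), identity_weights)
# Second pass: starts from the first pass's regressed box, mirroring the stacking idea.
box2, scores2 = roc_step(feature_map, box1, (0.0, 0.0, 0.0, 0.0), identity_weights)
```

The point of the sketch is the ordering: classification reads features at the location of the regressed box, so the score and the box it is attached to refer to the same region. The feature enhanced module between the two passes is omitted here.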
Pages: 9704-9713
Page count: 10