Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

Cited by: 45256

Authors:

Ren, Shaoqing [1]

He, Kaiming [2]

Girshick, Ross [3]

Sun, Jian [2]

Affiliations:

[1] Univ Sci & Technol China, Hefei 230026, Anhui, Peoples R China

[2] Microsoft Res, Visual Comp Grp, Beijing 100080, Peoples R China

[3] Facebook AI Res, Seattle, WA 98109 USA

Source:

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE | 2017 / Vol. 39 / Issue 06

Keywords:

Object detection; region proposal; convolutional neural network;

DOI:

10.1109/TPAMI.2016.2577031

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

State-of-the-art object detection networks depend on region proposal algorithms to hypothesize object locations. Advances like SPPnet [1] and Fast R-CNN [2] have reduced the running time of these detection networks, exposing region proposal computation as a bottleneck. In this work, we introduce a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals. An RPN is a fully convolutional network that simultaneously predicts object bounds and objectness scores at each position. The RPN is trained end-to-end to generate high-quality region proposals, which are used by Fast R-CNN for detection. We further merge RPN and Fast R-CNN into a single network by sharing their convolutional features. Using the recently popular terminology of neural networks with 'attention' mechanisms, the RPN component tells the unified network where to look. For the very deep VGG-16 model [3], our detection system has a frame rate of 5 fps (including all steps) on a GPU, while achieving state-of-the-art object detection accuracy on PASCAL VOC 2007, 2012, and MS COCO datasets with only 300 proposals per image. In ILSVRC and COCO 2015 competitions, Faster R-CNN and RPN are the foundations of the 1st-place winning entries in several tracks. Code has been made publicly available.
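
As a rough illustration of how per-position object bounds and objectness scores might be reduced to a fixed budget such as the 300 proposals per image mentioned in the abstract, the sketch below ranks candidate boxes by score and applies greedy non-maximum suppression. The abstract does not describe the selection procedure, so the function names, the `(x1, y1, x2, y2)` box format, and the 0.7 IoU threshold are assumptions made for illustration only.

```python
from typing import List, Tuple

# Assumed box format: (x1, y1, x2, y2) with x2 > x1 and y2 > y1.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def select_proposals(
    boxes: List[Box],
    scores: List[float],
    top_n: int = 300,        # proposal budget (300 is the figure quoted in the abstract)
    nms_iou: float = 0.7,    # assumed overlap threshold, not taken from the abstract
) -> List[int]:
    """Return indices of kept boxes, highest objectness score first.

    Greedy NMS: visit boxes in descending score order and keep a box only
    if it does not overlap an already kept box by more than nms_iou.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= nms_iou for j in keep):
            keep.append(i)
            if len(keep) == top_n:
                break
    return keep


if __name__ == "__main__":
    candidate_boxes = [(0, 0, 10, 10), (0, 0, 10, 9), (20, 20, 30, 30)]
    objectness = [0.9, 0.8, 0.7]
    print(select_proposals(candidate_boxes, objectness))
```

In this toy input the second box overlaps the first with an IoU of 0.9, so it is suppressed, while the third box is disjoint from both and survives. A real detector would run this step on thousands of candidates per image, typically with a vectorized or GPU implementation rather than the quadratic loop shown here.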

Citation

Pages: 1137-1149

Page count: 13

40 references in total

[21] Dai JF, 2015, PROC CVPR IEEE, P3992, DOI 10.1109/CVPR.2015.7299025
[22] Scalable Object Detection using Deep Neural Networks
Erhan, Dumitru
Szegedy, Christian
Toshev, Alexander
Anguelov, Dragomir
[J]. 2014 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2014: 2155 - 2162
[23] Everingham M., Int. J. Comput. Vis., V88, P303, DOI 10.1007/s11263-009-0275-4
[24] Object Detection with Discriminatively Trained Part-Based Models
Felzenszwalb, Pedro F.
Girshick, Ross B.
McAllester, David
Ramanan, Deva
[J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2010, 32 (09) : 1627 - 1645
[25] Fast R-CNN
Girshick, Ross
[J]. 2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015: 1440 - 1448
[26] Deep Residual Learning for Image Recognition
He, Kaiming
Zhang, Xiangyu
Ren, Shaoqing
Sun, Jian
[J]. 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016: 770 - 778
[27] He KM, 2014, LECT NOTES COMPUT SC, V8691, P346, DOI [arXiv:1406.4729, 10.1007/978-3-319-10578-9_23]
[28] Hoiem D, 2012, LECT NOTES COMPUT SC, V7574, P340, DOI 10.1007/978-3-642-33712-3_25
[29] What Makes for Effective Detection Proposals?
Hosang, Jan
Benenson, Rodrigo
Dollar, Piotr
Schiele, Bernt
[J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2016, 38 (04) : 814 - 830
[30] Caffe: Convolutional Architecture for Fast Feature Embedding
Jia, Yangqing
Shelhamer, Evan
Donahue, Jeff
Karayev, Sergey
Long, Jonathan
Girshick, Ross
Guadarrama, Sergio
Darrell, Trevor
[J]. PROCEEDINGS OF THE 2014 ACM CONFERENCE ON MULTIMEDIA (MM'14), 2014: 675 - 678
