Region-Based Convolutional Networks for Accurate Object Detection and Segmentation

Cited by: 1956

Authors:

Girshick, Ross [1]

Donahue, Jeff [2]

Darrell, Trevor [2]

Malik, Jitendra [2]

Affiliations:

[1] Microsoft Res, Redmond, WA 98052 USA

[2] Univ Calif Berkeley, Dept Elect Engn & Comp Sci, Berkeley, CA 94720 USA

Source:

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE | 2016 / Vol. 38 / No. 01

Funding:

National Science Foundation (USA);

Keywords:

Object recognition; detection; semantic segmentation; convolutional networks; deep learning; transfer learning; REPRESENTATION; HISTOGRAMS; GRADIENTS; FEATURES; SCENE;

DOI:

10.1109/TPAMI.2015.2437384

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Object detection performance, as measured on the canonical PASCAL VOC Challenge datasets, plateaued in the final years of the competition. The best-performing methods were complex ensemble systems that typically combined multiple low-level image features with high-level context. In this paper, we propose a simple and scalable detection algorithm that improves mean average precision (mAP) by more than 50 percent relative to the previous best result on VOC 2012, achieving a mAP of 62.4 percent. Our approach combines two ideas: (1) one can apply high-capacity convolutional networks (CNNs) to bottom-up region proposals in order to localize and segment objects and (2) when labeled training data are scarce, supervised pre-training for an auxiliary task, followed by domain-specific fine-tuning, boosts performance significantly. Since we combine region proposals with CNNs, we call the resulting model an R-CNN or Region-based Convolutional Network. Source code for the complete system is available at http://www.cs.berkeley.edu/~rbg/rcnn.
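The first idea in the abstract, scoring bottom-up region proposals with a network, can be illustrated with a toy sketch. It is not the authors' implementation: `propose_regions` returns random boxes in place of a real proposal method, `extract_features` is a fixed linear map with a ReLU in place of a CNN, and the 16 x 16 warp size, the 32-dimensional feature, and the three classes are all arbitrary assumptions chosen to keep the example small.

```python
import numpy as np

def propose_regions(image, num_boxes=5, seed=0):
    # Hypothetical stand-in for a bottom-up proposal method:
    # returns random boxes as rows of (x0, y0, x1, y1).
    rng = np.random.default_rng(seed)
    h, w = image.shape[:2]
    x0 = rng.integers(0, w - 8, size=num_boxes)
    y0 = rng.integers(0, h - 8, size=num_boxes)
    x1 = np.minimum(x0 + rng.integers(8, 17, size=num_boxes), w)
    y1 = np.minimum(y0 + rng.integers(8, 17, size=num_boxes), h)
    return np.stack([x0, y0, x1, y1], axis=1)

def warp(crop, size=16):
    # Nearest-neighbor resize of a crop to a fixed size x size input.
    rows = np.arange(size) * crop.shape[0] // size
    cols = np.arange(size) * crop.shape[1] // size
    return crop[rows][:, cols]

def extract_features(patch, projection):
    # Stand-in for a CNN forward pass: a fixed linear map plus ReLU.
    return np.maximum(projection @ patch.ravel(), 0.0)

def detect(image, projection, class_weights):
    # Score every proposed region independently with per-class linear scorers.
    boxes = propose_regions(image)
    scores = []
    for x0, y0, x1, y1 in boxes:
        patch = warp(image[y0:y1, x0:x1])
        scores.append(class_weights @ extract_features(patch, projection))
    return boxes, np.array(scores)

rng = np.random.default_rng(1)
image = rng.random((64, 96, 3))                       # toy RGB image
projection = rng.standard_normal((32, 16 * 16 * 3))   # assumed "network" weights
class_weights = rng.standard_normal((3, 32))          # assumed 3 object classes
boxes, scores = detect(image, projection, class_weights)
print(boxes.shape, scores.shape)
```

Each proposal is cropped, warped to a fixed input size, and scored on its own, which is why the proposal generator and the feature extractor can be swapped independently in a design of this shape.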


Pages: 142-158

Page count: 17
