Convolutional Random Walk Networks for Semantic Image Segmentation

Cited by: 94

Authors:

Bertasius, Gedas ^{[1]}

Torresani, Lorenzo ^{[2]}

Yu, Stella X. ^{[3]}

Shi, Jianbo ^{[1]}

Affiliations:

[1] Univ Penn, Philadelphia, PA 19104 USA

[2] Dartmouth Coll, Hanover, NH 03755 USA

[3] UC Berkeley ICSI, Berkeley, CA USA

Source:

30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017) | 2017

Keywords:

DOI:

10.1109/CVPR.2017.650

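Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Discipline classification code:

140502 [Artificial Intelligence];

Abstract: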

Most current semantic segmentation methods rely on fully convolutional networks (FCNs). However, their use of large receptive fields and many pooling layers causes low spatial resolution inside the deep layers. This leads to predictions with poor localization around the boundaries. Prior work has attempted to address this issue by post-processing predictions with CRFs or MRFs. But such models often fail to capture semantic relationships between objects, which causes spatially disjoint predictions. To overcome these problems, recent methods have integrated CRFs or MRFs into an FCN framework. The downside of these new models is that they have much higher complexity than traditional FCNs, which renders training and testing more challenging. In this work we introduce a simple yet effective Convolutional Random Walk Network (RWN) that addresses the issues of poor boundary localization and spatially fragmented predictions with very little increase in model complexity. Our proposed RWN jointly optimizes the objectives of pixelwise affinity and semantic segmentation. It combines these two objectives via a novel random walk layer that enforces consistent spatial grouping in the deep layers of the network. Our RWN is implemented using standard convolution and matrix multiplication. This allows easy integration into existing FCN frameworks and enables end-to-end training of the whole network via standard back-propagation. Our implementation of RWN requires just 131 additional parameters compared to traditional FCNs, and yet it consistently produces an improvement over FCNs on semantic segmentation and scene labeling.
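The abstract says only that the random walk layer combines pixelwise affinities with segmentation predictions through matrix multiplication. As a rough illustration of that idea, and not the authors' implementation, the sketch below row-normalizes a small affinity matrix into a transition matrix and multiplies it with per-pixel class scores. The function names, the row normalization, the `steps` parameter, and the toy 4-pixel data are all assumptions made for this example.

```python
# Minimal random-walk smoothing sketch (assumed formulation, not the paper's code).
# affinity: n x n non-negative pairwise pixel similarities (hypothetical toy values).
# scores:   n x c per-pixel class scores, as a segmentation branch might produce.


def row_normalize(affinity):
    """Scale each row to sum to 1, giving a random-walk transition matrix."""
    normalized = []
    for row in affinity:
        total = sum(row)
        normalized.append([value / total for value in row])
    return normalized


def matmul(a, b):
    """Plain dense matrix product of two list-of-lists matrices."""
    inner = len(b)
    cols = len(b[0])
    return [
        [sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(len(a))
    ]


def random_walk(affinity, scores, steps=1):
    """Diffuse class scores over the affinity graph for a given number of steps."""
    transition = row_normalize(affinity)
    for _ in range(steps):
        scores = matmul(transition, scores)
    return scores


# Toy data: pixels 0-2 are mutually similar, pixel 3 is isolated.
affinity = [
    [1.0, 0.9, 0.9, 0.0],
    [0.9, 1.0, 0.9, 0.0],
    [0.9, 0.9, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
# Pixel 1 disagrees with its similar neighbours (it favours class 1).
scores = [
    [0.9, 0.1],
    [0.4, 0.6],
    [0.8, 0.2],
    [0.1, 0.9],
]

smoothed = random_walk(affinity, scores, steps=1)
for index, row in enumerate(smoothed):
    print(index, [round(value, 3) for value in row])
```

In this toy case, pixel 1 starts out favouring class 1 but favours class 0 after one step, matching its similar neighbours, while the isolated pixel 3 keeps its original scores. This is the kind of spatially consistent grouping the abstract describes, here under the simplifying assumptions stated above.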

Pages: 6137-6145

Page count: 9

References (26 entries in total):

[1] [Anonymous], 2001, 8 IEEE INT C COMPUTE, DOI 10.1109/ICCV.2001.937655

[2] [Anonymous], ABS151107386 CORR

[3] [Anonymous], ABS150401013 CORR

[4] [Anonymous], COMP VIS ICCV 2015 I

[5] Multiscale Combinatorial Grouping [J]. Arbelaez, Pablo; Pont-Tuset, Jordi; Barron, Jonathan T.; Marques, Ferran; Malik, Jitendra. 2014 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2014: 328-335

[6] Badrinarayanan V., 2015, SEGNET DEEP CONVOLUT, DOI 10.1109/TPAMI.2016.2644615

[7] Bahmani B, 2010, PROC VLDB ENDOW, V4, P173

[8] Semantic Segmentation with Boundary Neural Fields [J]. Bertasius, Gedas; Shi, Jianbo; Torresani, Lorenzo. 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016: 3602-3610

[9] Ce Liu, 2016, NONPARAMETRIC SCENE, P207

[10] DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs [J]. Chen, Liang-Chieh; Papandreou, George; Kokkinos, Iasonas; Murphy, Kevin; Yuille, Alan L. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2018, 40 (04): 834-848
