Not All Pixels Are Equal: Difficulty-Aware Semantic Segmentation via Deep Layer Cascade

Cited by: 150

Authors:

Li, Xiaoxiao [1]

Liu, Ziwei [1]

Luo, Ping [1,2]

Loy, Chen Change [1,2]

Tang, Xiaoou [1,2]

Affiliations:

[1] Chinese Univ Hong Kong, Dept Informat Engn, Hong Kong, Hong Kong, Peoples R China

[2] Chinese Acad Sci, Shenzhen Inst Adv Technol, Shenzhen Key Lab Comp Vis & Pat Rec, Beijing, Peoples R China

Source:

30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017) | 2017

Funding:

National Natural Science Foundation of China

DOI:

10.1109/CVPR.2017.684

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

We propose a novel deep layer cascade (LC) method to improve the accuracy and speed of semantic segmentation. Unlike the conventional model cascade (MC), which is composed of multiple independent models, LC treats a single deep model as a cascade of several sub-models. Earlier sub-models are trained to handle easy and confident regions, and they progressively feed harder regions forward to the next sub-model for processing. Convolutions are calculated only on these harder regions to reduce computation. The proposed method has several advantages. First, LC classifies most of the easy regions in the shallow stage and lets the deeper stages focus on a few hard regions. Such adaptive, 'difficulty-aware' learning improves segmentation performance. Second, LC accelerates both training and testing of the deep network thanks to early decisions in the shallow stage. Third, in comparison to MC, LC is an end-to-end trainable framework, allowing joint learning of all sub-models. We evaluate our method on the PASCAL VOC and Cityscapes datasets, achieving state-of-the-art performance and fast speed.
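The early-exit routing described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the function name `layer_cascade_predict`, the confidence `threshold`, and the idea of modelling each sub-model as a per-pixel classifier over fixed features are all assumptions made for illustration. A real layer cascade would pass intermediate convolutional features between stages and restrict convolution to the remaining regions.

```python
import numpy as np


def softmax(logits, axis=-1):
    """Numerically stable softmax."""
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def layer_cascade_predict(features, stages, threshold=0.95):
    """Toy difficulty-aware cascade (hypothetical helper, not from the paper).

    features:  (H, W, D) array of per-pixel features.
    stages:    list of callables mapping (N, D) features to (N, C) logits;
               each one stands in for a sub-model of the cascade.
    threshold: assumed confidence cut-off for deciding a pixel early.

    Returns (labels, decided_at), both of shape (H, W).
    """
    h, w, _ = features.shape
    flat = features.reshape(h * w, -1)
    labels = np.full(h * w, -1, dtype=np.int64)
    decided_at = np.full(h * w, -1, dtype=np.int64)
    active = np.arange(h * w)  # indices of pixels still undecided

    for i, stage in enumerate(stages):
        if active.size == 0:
            break
        # Only the still-active (harder) pixels are processed by this stage.
        probs = softmax(stage(flat[active]))
        pred = probs.argmax(axis=1)
        conf = probs.max(axis=1)
        is_last = i == len(stages) - 1
        # The last stage must decide everything that reaches it.
        done = np.ones_like(conf, dtype=bool) if is_last else conf >= threshold
        labels[active[done]] = pred[done]
        decided_at[active[done]] = i
        active = active[~done]  # forward only the unconfident pixels

    return labels.reshape(h, w), decided_at.reshape(h, w)


# Usage: two stand-in stages; the second one is "sharper" than the first.
feats = np.array([[[5.0, 0.0], [0.0, 5.0]],
                  [[0.1, 0.0], [0.0, 0.1]]])
labels, stage_idx = layer_cascade_predict(
    feats, [lambda x: x, lambda x: 100.0 * x], threshold=0.95
)
print(labels)     # per-pixel class labels
print(stage_idx)  # index of the stage that decided each pixel
```

In this sketch the top-row pixels are confident enough to be labelled by the first stage, while the bottom-row pixels fall below the assumed threshold and are forwarded to the second stage, which mirrors the "easy regions first, hard regions later" idea at a very small scale.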

Pages: 6459-6468

Page count: 10
