Multi-Label Image Recognition with Graph Convolutional Networks

Cited by: 850

Authors:

Chen, Zhao-Min ^{[1,2]}

Wei, Xiu-Shen ^{[2]}

Wang, Peng ^{[3]}

Guo, Yanwen ^{[1,4,5]}

Affiliations:

[1] Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing, Peoples R China

[2] Megvii Technol, Megvii Res Nanjing, Beijing, Peoples R China

[3] Univ Adelaide, Sch Comp Sci, Adelaide, SA, Australia

[4] Sci & Technol Informat Syst Engn Lab, Changsha, Peoples R China

[5] 28th Res Inst China Elect Technol Grp Corp, Nanjing 210007, Peoples R China

Source:

2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019) | 2019

Funding:

National Natural Science Foundation of China; National Key Research and Development Program of China;

Keywords:

DOI:

10.1109/CVPR.2019.00532

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The task of multi-label image recognition is to predict the set of object labels present in an image. As objects normally co-occur in an image, it is desirable to model label dependencies to improve recognition performance. To capture and exploit these dependencies, we propose a multi-label classification model based on a Graph Convolutional Network (GCN). The model builds a directed graph over the object labels, where each node (label) is represented by the word embedding of that label, and the GCN is learned to map this label graph into a set of inter-dependent object classifiers. These classifiers are applied to the image descriptors extracted by another sub-net, enabling the whole network to be end-to-end trainable. Furthermore, we propose a novel re-weighting scheme to create an effective label correlation matrix that guides information propagation among the nodes in the GCN. Experiments on two multi-label image recognition datasets show that our approach clearly outperforms existing state-of-the-art methods. In addition, visualization analyses reveal that the classifiers learned by our model maintain a meaningful semantic topology.
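The pipeline described in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the function names, the threshold `tau`, the re-weighting value `p`, the two-layer depth, the LeakyReLU activation, the symmetric adjacency normalisation, and all dimensions are assumptions chosen for illustration, and random vectors stand in for both the label word embeddings and the image descriptor that a CNN sub-net would produce.

```python
import numpy as np

def reweighted_correlation(cooccur, label_counts, tau=0.4, p=0.2):
    """Build a directed label correlation matrix from co-occurrence counts.

    cooccur[i, j]   : number of training images containing labels i and j
    label_counts[i] : number of training images containing label i
    tau, p          : illustrative threshold and re-weighting values (assumed)
    """
    # Conditional probability P(label j | label i); generally not symmetric.
    cond = cooccur / np.maximum(label_counts[:, None], 1)
    np.fill_diagonal(cond, 0.0)
    # Binarize to suppress rare, noisy co-occurrences.
    binary = (cond >= tau).astype(float)
    # Re-weight: neighbours share total weight p, the node itself keeps 1 - p.
    neighbour_sum = binary.sum(axis=1, keepdims=True)
    adj = p * binary / np.maximum(neighbour_sum, 1.0)
    adj += (1.0 - p) * np.eye(len(label_counts))
    return adj

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

def gcn_classifiers(adj, embeddings, w1, w2):
    """Map label embeddings to one classifier vector per label (two GCN layers)."""
    # Normalise as D^-1/2 A D^-1/2, a common GCN choice (assumed here).
    d_inv_sqrt = 1.0 / np.sqrt(adj.sum(axis=1))
    norm_adj = adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    hidden = leaky_relu(norm_adj @ embeddings @ w1)
    return norm_adj @ hidden @ w2  # shape: (num_labels, feat_dim)

# Toy data: 4 labels, made-up co-occurrence counts (diagonal = label frequency).
rng = np.random.default_rng(0)
num_labels, embed_dim, hidden_dim, feat_dim = 4, 8, 16, 32
cooccur = np.array([[50, 30,  2,  0],
                    [30, 40,  1,  0],
                    [ 2,  1, 20, 12],
                    [ 0,  0, 12, 25]], dtype=float)
label_counts = np.diag(cooccur).copy()

adj = reweighted_correlation(cooccur, label_counts)
embeddings = rng.normal(size=(num_labels, embed_dim))   # stand-in for word embeddings
w1 = rng.normal(scale=0.1, size=(embed_dim, hidden_dim))
w2 = rng.normal(scale=0.1, size=(hidden_dim, feat_dim))

classifiers = gcn_classifiers(adj, embeddings, w1, w2)
image_feature = rng.normal(size=feat_dim)               # stand-in for the CNN descriptor
scores = classifiers @ image_feature                    # one logit per label
probs = 1.0 / (1.0 + np.exp(-scores))                   # independent per-label sigmoid
```

Because the classifiers are produced by the graph layers rather than learned independently, a gradient from the multi-label loss on `scores` would flow back through both the GCN weights and the image sub-net, which is what makes the setup trainable end to end.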

Citation:

Pages: 5172-5181

Page count: 10

