Multi-Label Image Recognition with Graph Convolutional Networks

Cited by: 850
Authors
Chen, Zhao-Min [1, 2]
Wei, Xiu-Shen [2]
Wang, Peng [3]
Guo, Yanwen [1, 4, 5]
Affiliations
[1] Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing, Peoples R China
[2] Megvii Technol, Megvii Res Nanjing, Beijing, Peoples R China
[3] Univ Adelaide, Sch Comp Sci, Adelaide, SA, Australia
[4] Sci & Technol Informat Syst Engn Lab, Changsha, Peoples R China
[5] 28th Res Inst China Elect Technol Grp Corp, Nanjing 210007, Peoples R China
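Source
2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019) | 2019
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China
Keywords
DOI
10.1109/CVPR.2019.00532
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract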
The task of multi-label image recognition is to predict the set of object labels that are present in an image. As objects normally co-occur in an image, it is desirable to model the label dependencies to improve recognition performance. To capture and explore these dependencies, we propose a multi-label classification model based on a Graph Convolutional Network (GCN). The model builds a directed graph over the object labels, where each node (label) is represented by the word embedding of that label, and the GCN is learned to map this label graph into a set of inter-dependent object classifiers. These classifiers are applied to the image descriptors extracted by another sub-net, enabling the whole network to be end-to-end trainable. Furthermore, we propose a novel re-weighted scheme to create an effective label correlation matrix to guide information propagation among the nodes in the GCN. Experiments on two multi-label image recognition datasets show that our approach clearly outperforms other existing state-of-the-art methods. In addition, visualization analyses reveal that the classifiers learned by our model maintain meaningful semantic topology.
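The pipeline the abstract describes (label embeddings passed through a GCN to produce per-label classifiers, which then score image descriptors) can be illustrated with a minimal NumPy sketch. Everything concrete here is an assumption made for illustration: the function names, the two-layer depth, the dimensions, the random weights, and the toy adjacency matrix. In particular, plain row normalization stands in for the re-weighted correlation matrix, whose construction the abstract does not specify, and random vectors stand in for the descriptors from the image sub-net.

```python
import numpy as np


def normalize_rows(adj):
    """Row-normalize an adjacency matrix so each node averages over its neighbours."""
    row_sums = adj.sum(axis=1, keepdims=True)
    return adj / np.maximum(row_sums, 1e-12)


def gcn_layer(adj_norm, node_feats, weight, activate=True):
    """One graph-convolution step: propagate along edges, then apply a linear map."""
    out = adj_norm @ node_feats @ weight
    return np.maximum(out, 0.0) if activate else out  # ReLU on hidden layers only


def label_classifiers(adj, label_embeddings, w1, w2):
    """Map label embeddings to one classifier vector per label (two GCN layers)."""
    adj_norm = normalize_rows(adj)
    hidden = gcn_layer(adj_norm, label_embeddings, w1)
    return gcn_layer(adj_norm, hidden, w2, activate=False)


rng = np.random.default_rng(0)
num_labels, embed_dim, hidden_dim, feat_dim = 4, 8, 16, 32

# Hypothetical directed label graph: adj[i, j] > 0 means label i
# receives information from label j. Self-loops keep each label's own embedding.
adj = np.eye(num_labels)
adj[0, 1] = 1.0
adj[2, 3] = 1.0

# Stand-ins for word embeddings and learned GCN weights (random, untrained).
label_embeddings = rng.normal(size=(num_labels, embed_dim))
w1 = rng.normal(size=(embed_dim, hidden_dim)) * 0.1
w2 = rng.normal(size=(hidden_dim, feat_dim)) * 0.1

# One classifier per label, living in the image-descriptor space.
classifiers = label_classifiers(adj, label_embeddings, w1, w2)

# Stand-in for descriptors of two images produced by the image sub-net.
image_features = rng.normal(size=(2, feat_dim))

# Score every label for every image, then squash to independent probabilities.
scores = image_features @ classifiers.T
probs = 1.0 / (1.0 + np.exp(-scores))
print(probs.shape)
```

Because the classifiers are computed from the label graph rather than stored as free parameters, a gradient on any label's score flows back through the shared GCN weights, which is what would make the classifiers inter-dependent in a trained version of this sketch.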
Pages: 5172-5181
Number of pages: 10