Multi-Label Image Recognition with Graph Convolutional Networks

Cited by: 850
Authors
Chen, Zhao-Min [1, 2]
Wei, Xiu-Shen [2 ]
Wang, Peng [3 ]
Guo, Yanwen [1, 4, 5]
Affiliations
[1] Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing, Peoples R China
[2] Megvii Technol, Megvii Res Nanjing, Beijing, Peoples R China
[3] Univ Adelaide, Sch Comp Sci, Adelaide, SA, Australia
[4] Sci & Technol Informat Syst Engn Lab, Changsha, Peoples R China
[5] 28th Res Inst China Elect Technol Grp Corp, Nanjing 210007, Peoples R China
Source
2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019) | 2019
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China
DOI
10.1109/CVPR.2019.00532
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The task of multi-label image recognition is to predict the set of object labels that are present in an image. As objects normally co-occur in an image, it is desirable to model the label dependencies to improve recognition performance. To capture and explore these dependencies, we propose a multi-label classification model based on a Graph Convolutional Network (GCN). The model builds a directed graph over the object labels, where each node (label) is represented by the word embedding of that label, and the GCN is learned to map this label graph into a set of inter-dependent object classifiers. These classifiers are applied to the image descriptors extracted by another sub-net, enabling the whole network to be end-to-end trainable. Furthermore, we propose a novel re-weighting scheme to create an effective label correlation matrix to guide information propagation among the nodes in the GCN. Experiments on two multi-label image recognition datasets show that our approach clearly outperforms existing state-of-the-art methods. In addition, visualization analyses reveal that the classifiers learned by our model maintain a meaningful semantic topology.
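The pipeline the abstract describes (a label graph built from co-occurrence, a GCN that turns label embeddings into classifiers, and those classifiers applied to image descriptors) can be sketched roughly as follows. This is only an illustrative NumPy sketch, not the authors' implementation: the function names, the threshold `tau`, the re-weighting factor `p`, the LeakyReLU slope, the two-layer depth, and the use of the unnormalized correlation matrix for propagation are all assumptions, since the abstract gives none of these details.

```python
import numpy as np

def reweighted_correlation(labels, tau=0.4, p=0.2):
    """Build a directed label correlation matrix from a binary (N, C) label matrix.

    tau (binarization threshold) and p (neighbor weight) are hypothetical values.
    """
    labels = labels.astype(float)
    cooc = labels.T @ labels                 # cooc[i, j]: images containing both i and j
    counts = np.diag(cooc).copy()            # counts[i]: images containing label i
    np.fill_diagonal(cooc, 0.0)
    cond = cooc / np.maximum(counts[:, None], 1.0)   # estimate of P(label j | label i)
    binary = (cond >= tau).astype(float)     # keep only strong dependencies
    row_sum = binary.sum(axis=1, keepdims=True)
    adj = p * binary / np.maximum(row_sum, 1.0)      # neighbors share a total weight of p
    adj += (1.0 - p) * np.eye(labels.shape[1])       # each node keeps weight 1 - p
    return adj

def gcn_classifiers(adj, embeddings, w1, w2):
    """Two graph-convolution layers mapping (C, d) label embeddings to (C, D) classifiers."""
    hidden = adj @ embeddings @ w1
    hidden = np.where(hidden > 0, hidden, 0.2 * hidden)  # LeakyReLU, assumed slope 0.2
    return adj @ hidden @ w2

def predict_scores(image_features, classifiers):
    """Apply the label classifiers to (N, D) image descriptors, giving (N, C) scores."""
    return image_features @ classifiers.T

# Toy data: 50 images, 4 labels, 8-dim label embeddings, 32-dim image descriptors.
rng = np.random.default_rng(0)
labels = (rng.random((50, 4)) < 0.4).astype(int)
adj = reweighted_correlation(labels)
embeddings = rng.normal(size=(4, 8))
w1 = rng.normal(size=(8, 16))
w2 = rng.normal(size=(16, 32))
classifiers = gcn_classifiers(adj, embeddings, w1, w2)
features = rng.normal(size=(3, 32))
scores = predict_scores(features, classifiers)
print(scores.shape)  # (3, 4)
```

In a trainable version, `w1` and `w2` would be learned jointly with the image sub-net under a multi-label loss; here they are random so that only the data flow is shown.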
Pages: 5172-5181
Number of pages: 10
Related papers
36 in total
[31]   Multi-label Image Recognition by Recurrently Discovering Attentional Regions [J].
Wang, Zhouxia ;
Chen, Tianshui ;
Li, Guanbin ;
Xu, Ruijia ;
Lin, Liang .
2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, :464-472
[32]   Wei X.-S., 2019, RPC: A Large-Scale Retail Product Checkout Dataset
[33]   HCP: A Flexible CNN Framework for Multi-Label Image Classification [J].
Wei, Yunchao ;
Xia, Wei ;
Lin, Min ;
Huang, Junshi ;
Ni, Bingbing ;
Dong, Jian ;
Zhao, Yao ;
Yan, Shuicheng .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2016, 38 (09) :1901-1907
[34]   Aggregated Residual Transformations for Deep Neural Networks [J].
Xie, Saining ;
Girshick, Ross ;
Dollar, Piotr ;
Tu, Zhuowen ;
He, Kaiming .
30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, :5987-5995
[35]   ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices [J].
Zhang, Xiangyu ;
Zhou, Xinyu ;
Lin, Mengxiao ;
Sun, Jian .
2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, :6848-6856
[36]   Learning Spatial Regularization with Image-level Supervisions for Multi-label Image Classification [J].
Zhu, Feng ;
Li, Hongsheng ;
Ouyang, Wanli ;
Yu, Nenghai ;
Wang, Xiaogang .
30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, :2027-2036