Zero-Shot Visual Recognition using Semantics-Preserving Adversarial Embedding Networks

Cited by: 206
Authors
Chen, Long [1]
Zhang, Hanwang [2, 3]
Xiao, Jun [1]
Liu, Wei [3]
Chang, Shih-Fu [4]
Affiliations
[1] Zhejiang Univ, Hangzhou, Zhejiang, Peoples R China
[2] Nanyang Technol Univ, Singapore, Singapore
[3] Tencent AI Lab, Bellevue, WA USA
[4] Columbia Univ, New York, NY 10027 USA
Source
2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2018
Funding
National Natural Science Foundation of China; Zhejiang Provincial Natural Science Foundation
DOI
10.1109/CVPR.2018.00115
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We propose a novel framework called Semantics-Preserving Adversarial Embedding Network (SP-AEN) for zero-shot visual recognition (ZSL), where test images and their classes are both unseen during training. SP-AEN aims to tackle an inherent problem, semantic loss, in the prevailing family of embedding-based ZSL, where some semantics are discarded during training if they are non-discriminative for training classes but could become critical for recognizing test classes. Specifically, SP-AEN prevents the semantic loss by introducing an independent visual-to-semantic space embedder which disentangles the semantic space into two subspaces for the two arguably conflicting objectives: classification and reconstruction. Through adversarial learning of the two subspaces, SP-AEN can transfer the semantics from the reconstructive subspace to the discriminative one, accomplishing improved zero-shot recognition of unseen classes. Compared with prior work, SP-AEN can not only improve classification but also generate photo-realistic images, demonstrating the effectiveness of semantic preservation. On four popular benchmarks (CUB, AWA, SUN, and aPY), SP-AEN considerably outperforms other state-of-the-art methods by absolute performance differences of 12.2%, 9.3%, 4.0%, and 3.6%, respectively, in terms of harmonic mean values [62].
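The abstract reports its gains as harmonic mean values. As a minimal sketch, assuming the common generalized zero-shot convention in which the harmonic mean combines the accuracy on seen classes with the accuracy on unseen classes (the record itself does not spell out the formula), the metric could be computed as below. The function name and the input numbers are hypothetical and are not results from the paper.

```python
def harmonic_mean(acc_seen: float, acc_unseen: float) -> float:
    """Harmonic mean of two accuracies given in the same unit (e.g. percent).

    Assumption: H = 2 * s * u / (s + u), the standard harmonic mean of
    seen-class accuracy s and unseen-class accuracy u.
    """
    total = acc_seen + acc_unseen
    if total == 0:
        # Both accuracies are zero; define the mean as zero to avoid dividing by zero.
        return 0.0
    return 2.0 * acc_seen * acc_unseen / total


# Hypothetical accuracies for illustration only.
h = harmonic_mean(60.0, 30.0)
print(h)
```

Because the harmonic mean is dominated by the smaller of its two inputs, a method scores well only if it performs reasonably on both seen and unseen classes.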
Pages: 1043-1052
Number of pages: 10
Related papers
69 entries in total
[1] Akata Z., 2013, P IEEE C COMP VIS PA
[2] Akata Z., 2015, CVPR
[3] [Anonymous], 2017, ICCV
[4] [Anonymous], 2016, ECCV
[5] [Anonymous], TPAMI
[6] [Anonymous], 2017, COMPUTER VISION PATT
[7] [Anonymous], 2015, ICML
[8] [Anonymous], 2010, ECCV
[9] [Anonymous], ICLRW
[10] [Anonymous], 2013, P NIPS