Compact Bilinear Pooling

Cited by: 587
Authors
Gao, Yang [1]
Beijbom, Oscar [1]
Zhang, Ning [2]
Darrell, Trevor [1]
Affiliations
[1] Univ Calif Berkeley, EECS, Berkeley, CA 94720 USA
[2] Snapchat Inc, Los Angeles, CA USA
Source
2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2016
Funding
National Science Foundation (USA)
DOI
10.1109/CVPR.2016.41
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Bilinear models have been shown to achieve impressive performance on a wide range of visual tasks, such as semantic segmentation, fine-grained recognition, and face recognition. However, bilinear features are high-dimensional, typically on the order of hundreds of thousands to a few million dimensions, which makes them impractical for subsequent analysis. We propose two compact bilinear representations with the same discriminative power as the full bilinear representation but with only a few thousand dimensions. Our compact representations allow back-propagation of classification errors, enabling end-to-end optimization of the visual recognition system. The compact bilinear representations are derived through a novel kernelized analysis of bilinear pooling, which provides insights into the discriminative power of bilinear pooling and a platform for further research in compact pooling methods. Experiments illustrate the utility of the proposed representations for image classification and few-shot learning across several datasets.
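To make the dimensionality argument in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. The abstract does not name the two compact representations, so this example assumes one common way to approximate a second-order (bilinear) feature: a Tensor Sketch built from two count sketches combined by FFT-based circular convolution. The function names, the descriptor size `c = 64`, and the output size `d = 32` are hypothetical choices made for illustration only.

```python
import numpy as np


def full_bilinear(X):
    """Full bilinear pooling: sum of outer products of local descriptors.

    X has shape (n_locations, c); the result has c * c dimensions.
    """
    return (X.T @ X).ravel()


def make_sketch_params(c, d, rng):
    """Draw two independent (hash, sign) pairs for the count sketches (assumed setup)."""
    h1 = rng.integers(0, d, size=c)
    h2 = rng.integers(0, d, size=c)
    s1 = rng.choice([-1.0, 1.0], size=c)
    s2 = rng.choice([-1.0, 1.0], size=c)
    return (h1, s1), (h2, s2)


def count_sketch(X, h, s, d):
    """Project each row of X to d dims: bucket h[j] accumulates s[j] * X[:, j]."""
    out = np.zeros((X.shape[0], d))
    for j in range(X.shape[1]):
        out[:, h[j]] += s[j] * X[:, j]
    return out


def compact_bilinear(X, params, d):
    """Tensor Sketch of each descriptor's outer product, sum-pooled over locations."""
    (h1, s1), (h2, s2) = params
    f1 = np.fft.rfft(count_sketch(X, h1, s1, d), axis=1)
    f2 = np.fft.rfft(count_sketch(X, h2, s2, d), axis=1)
    # A product in the frequency domain is a circular convolution of the two sketches.
    per_location = np.fft.irfft(f1 * f2, n=d, axis=1)
    return per_location.sum(axis=0)


rng = np.random.default_rng(0)
c, d = 64, 32                          # hypothetical sizes for illustration
X = rng.standard_normal((10, c))       # 10 local descriptors with c channels each
params = make_sketch_params(c, d, rng)
print(full_bilinear(X).shape)          # c * c dimensions
print(compact_bilinear(X, params, d).shape)  # d dimensions
```

In this sketch the compact vector is an exact count sketch of the full bilinear feature, so inner products between compact vectors approximate inner products between full bilinear features in expectation, with an error that depends on `d`. Every operation is differentiable with respect to `X`, which is consistent with the abstract's point about back-propagation.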
Pages: 317-326
Page count: 10