Compact Bilinear Pooling

Cited by: 587

Authors:

Gao, Yang [1]

Beijbom, Oscar [1]

Zhang, Ning [2]

Darrell, Trevor [1]

Affiliations:

[1] Univ Calif Berkeley, EECS, Berkeley, CA 94720 USA

[2] Snapchat Inc, Los Angeles, CA USA

Source:

2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2016

Funding:

U.S. National Science Foundation;

Keywords:

DOI:

10.1109/CVPR.2016.41

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Bilinear models have been shown to achieve impressive performance on a wide range of visual tasks, such as semantic segmentation, fine-grained recognition, and face recognition. However, bilinear features are high dimensional, typically on the order of hundreds of thousands to a few million dimensions, which makes them impractical for subsequent analysis. We propose two compact bilinear representations with the same discriminative power as the full bilinear representation but with only a few thousand dimensions. Our compact representations allow back-propagation of classification errors, enabling end-to-end optimization of the visual recognition system. The compact bilinear representations are derived through a novel kernelized analysis of bilinear pooling, which provides insights into the discriminative power of bilinear pooling and a platform for further research in compact pooling methods. Experiments illustrate the utility of the proposed representations for image classification and few-shot learning across several datasets.
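
To make the dimensionality argument concrete: a full bilinear feature is the flattened outer product of a c-dimensional local descriptor with itself, so it has c^2 entries (for an assumed 512-channel feature map, 512 x 512 = 262,144). The abstract does not name the two compact representations; the minimal sketch below illustrates the general idea with a Tensor Sketch style projection (two count sketches combined by circular convolution), whose inner products approximate, in expectation, the second-order polynomial kernel that full bilinear features induce. All function names, the sizes c = 8 and d = 64, and the random hash construction are illustrative assumptions, not the authors' implementation.

```python
import random

def outer_flat(x, y):
    """Flattened outer product: the full bilinear feature, len(x) * len(y) entries."""
    return [xi * yj for xi in x for yj in y]

def count_sketch(x, h, s, d):
    """Count sketch: add s[i] * x[i] into bucket h[i] of a d-dimensional vector."""
    out = [0.0] * d
    for xi, hi, si in zip(x, h, s):
        out[hi] += si * xi
    return out

def circular_conv(a, b):
    """Direct circular convolution; an FFT-based version would be O(d log d)."""
    d = len(a)
    out = [0.0] * d
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[(i + j) % d] += ai * bj
    return out

def make_hashes(c, d, rng):
    """Random bucket indices and random signs (hypothetical helper)."""
    h = [rng.randrange(d) for _ in range(c)]
    s = [rng.choice((-1.0, 1.0)) for _ in range(c)]
    return h, s

def tensor_sketch(x, params, d):
    """Compact d-dimensional stand-in for the c*c outer product of x with itself."""
    (h1, s1), (h2, s2) = params
    return circular_conv(count_sketch(x, h1, s1, d), count_sketch(x, h2, s2, d))

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

rng = random.Random(0)
c, d = 8, 64  # toy sizes, chosen only for illustration
params = (make_hashes(c, d, rng), make_hashes(c, d, rng))
x = [rng.gauss(0.0, 1.0) for _ in range(c)]
y = [rng.gauss(0.0, 1.0) for _ in range(c)]

# Inner product of full bilinear features equals dot(x, y) ** 2 exactly.
full_kernel = dot(outer_flat(x, x), outer_flat(y, y))
# Inner product of the compact features is a randomized estimate of the same value.
approx_kernel = dot(tensor_sketch(x, params, d), tensor_sketch(y, params, d))
print(full_kernel, approx_kernel)
```

For an image, the compact features of all spatial locations would be summed, mirroring sum pooling of the outer products. A larger d lowers the variance of the estimate at the cost of a longer feature vector.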

Pages: 317-326

Page count: 10
