Supervised Learning of Quantizer Codebooks by Information Loss Minimization

Cited by: 138

Authors:

Lazebnik, Svetlana [1]

Raginsky, Maxim [2]

Affiliations:

[1] Univ N Carolina, Dept Comp Sci, Chapel Hill, NC 27599 USA

[2] Duke Univ, Dept Elect & Comp Engn, Durham, NC 27708 USA

Source:

IEEE Transactions on Pattern Analysis and Machine Intelligence | 2009, Vol. 31, No. 7

Funding:

U.S. National Science Foundation

Keywords:

Pattern recognition; information theory; quantization; clustering; computer vision; scene analysis; segmentation; image compression; classification

DOI:

10.1109/TPAMI.2008.138

Chinese Library Classification:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

This paper proposes a technique for jointly quantizing continuous features and the posterior distributions of their class labels based on minimizing empirical information loss, such that the quantizer index of a given feature vector approximates a sufficient statistic for its class label. Informally, the quantized representation retains as much information as possible for classifying the feature vector correctly. We derive an alternating minimization procedure for simultaneously learning codebooks in the Euclidean feature space and in the simplex of posterior class distributions. The resulting quantizer can be used to encode unlabeled points outside the training set and to predict their posterior class distributions, and has an elegant interpretation in terms of lossless source coding. The proposed method is validated on synthetic and real data sets and is applied to two diverse problems: learning discriminative visual vocabularies for bag-of-features image classification and image segmentation.
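
The idea of minimizing information loss by alternating two steps can be illustrated with a much simpler setting than the one the abstract describes. The sketch below is not the authors' procedure: it assumes the class posterior of every training point is already known, ignores the feature-space codebook entirely, and only clusters those posteriors in the probability simplex using KL divergence (a Bregman-style hard clustering). The names `kl`, `info_loss_clustering`, and the toy `posteriors` list are hypothetical and chosen for illustration only.

```python
import math
import random


def kl(p, q, eps=1e-12):
    """KL divergence D(p || q) in nats between two discrete distributions."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)


def info_loss_clustering(posteriors, k, iters=50, seed=0):
    """Alternate KL-nearest assignment and centroid averaging.

    posteriors: list of class-posterior vectors, one per training point
                (assumed given; the paper learns more than this).
    Returns (assignments, codewords, mean KL loss per point).
    """
    rng = random.Random(seed)
    # Initialize codewords with k randomly chosen posteriors.
    codewords = [list(p) for p in rng.sample(posteriors, k)]
    assign = [0] * len(posteriors)
    for _ in range(iters):
        # Step 1: assign each posterior to its KL-nearest codeword.
        new_assign = [min(range(k), key=lambda j: kl(p, codewords[j]))
                      for p in posteriors]
        # Step 2: for a fixed assignment, the arithmetic mean of the
        # members minimizes the total D(member || codeword).
        for j in range(k):
            members = [p for p, a in zip(posteriors, new_assign) if a == j]
            if members:
                codewords[j] = [sum(col) / len(members)
                                for col in zip(*members)]
        if new_assign == assign:
            break  # assignments stopped changing
        assign = new_assign
    loss = sum(kl(p, codewords[a])
               for p, a in zip(posteriors, assign)) / len(posteriors)
    return assign, codewords, loss


# Toy two-class example: three points lean toward class 0, three toward class 1.
posteriors = [[0.9, 0.1], [0.8, 0.2], [0.85, 0.15],
              [0.1, 0.9], [0.2, 0.8], [0.15, 0.85]]
assign, codewords, loss = info_loss_clustering(posteriors, k=2)
print(assign, codewords, loss)
```

In this simplified setting the mean KL divergence between each point's posterior and its codeword plays the role of the empirical information loss; a small value means the cluster index alone says almost as much about the class label as the original posterior did.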

Pages: 1294-1309

Page count: 16

