Multiscale hybrid linear models for lossy image representation

Cited by: 137
Authors
Hong, Wei [1]
Wright, John
Huang, Kun
Ma, Yi
Affiliations
[1] Texas Instruments Inc, DSP Solut Res & Dev Ctr, Dallas, TX 75243 USA
[2] Univ Illinois, Dept Elect & Comp Engn, Urbana, IL 61801 USA
[3] Ohio State Univ, Dept Biomed Informat, Columbus, OH 43210 USA
Funding
National Science Foundation (USA)
Keywords
generalized principal component analysis; hybrid linear model; image representation; wavelets
DOI
10.1109/TIP.2006.882016
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we introduce a simple and efficient representation for natural images. We view an image (in either the spatial domain or the wavelet domain) as a collection of vectors in a high-dimensional space. We then fit a piecewise linear model (i.e., a union of affine subspaces) to the vectors at each downsampling scale. We call this a multiscale hybrid linear model for the image. The model can be effectively estimated via a new algebraic method known as generalized principal component analysis (GPCA). The hybrid and hierarchical structure of this model allows us to effectively extract and exploit multimodal correlations among the imagery data at different scales. It conceptually and computationally remedies limitations of many existing image representation methods that are based on either a fixed linear transformation (e.g., DCT, wavelets), or an adaptive unimodal linear transformation (e.g., PCA), or a multimodal model that uses only cluster means (e.g., VQ). We will justify both quantitatively and experimentally why and how such a simple multiscale hybrid model is able to reduce simultaneously the model complexity and computational cost. Despite a small overhead of the model, our careful and extensive experimental results show that this new model gives more compact representations for a wide variety of natural images under a wide range of signal-to-noise ratios than many existing methods, including wavelets. We also briefly address how the same (hybrid linear) modeling paradigm can be extended to be potentially useful for other applications, such as image segmentation.
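As a rough illustration of the idea in the abstract, the sketch below treats an image as a set of block vectors and fits a union of affine subspaces to them at a single scale. It is not the authors' method: it swaps GPCA for a simple alternating "K-subspaces" style iteration (assign each vector to its nearest subspace, then refit each subspace by PCA), and it omits the multiscale hierarchy and the wavelet-domain variant. All function names and parameters (`image_to_blocks`, `fit_hybrid_linear`, `n_subspaces`, `d`) are hypothetical and chosen only for this example.

```python
import numpy as np

def image_to_blocks(img, b):
    """Stack non-overlapping b x b blocks of a 2-D image as row vectors."""
    h, w = img.shape
    h, w = h - h % b, w - w % b  # crop so the blocks tile the image exactly
    blocks = img[:h, :w].reshape(h // b, b, w // b, b).swapaxes(1, 2)
    return blocks.reshape(-1, b * b)

def fit_affine_subspace(X, d):
    """PCA fit: the mean plus the top-d principal directions (rows of basis)."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:d]

def residuals(X, mean, basis):
    """Squared distance from each row of X to the affine subspace."""
    centered = X - mean
    projected = centered @ basis.T @ basis
    return ((centered - projected) ** 2).sum(axis=1)

def fit_hybrid_linear(X, n_subspaces, d, n_iter=20, seed=0):
    """Alternating fit of a union of d-dimensional affine subspaces.

    Assumption: this K-subspaces style loop stands in for GPCA, which
    the paper uses; it only illustrates the model being fitted.
    """
    rng = np.random.default_rng(seed)
    labels = rng.integers(n_subspaces, size=len(X))  # random initial grouping
    for _ in range(n_iter):
        models = []
        for k in range(n_subspaces):
            members = X[labels == k]
            if len(members) <= d:  # re-seed an empty or degenerate group
                members = X[rng.choice(len(X), size=d + 1, replace=False)]
            models.append(fit_affine_subspace(members, d))
        # Reassign every vector to the subspace that approximates it best.
        errs = np.stack([residuals(X, m, B) for m, B in models], axis=1)
        new_labels = errs.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return models, labels
```

With such a fit, each block would be stored as a subspace label plus `d` coefficients instead of `b * b` pixel values. Unlike GPCA, this iteration depends on its random initialization and may settle in a poor local optimum.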
Pages: 3655-3671
Number of pages: 17