PDF optimized parametric vector quantization of speech line spectral frequencies

Cited by: 79
Authors
Subramaniam, AD [1]
Rao, BD [1]
Affiliations
[1] Univ Calif San Diego, Dept Elect & Comp Engn, San Diego, CA 92122 USA
Source
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING | 2003 / Vol. 11 / No. 02
Keywords
Gaussian mixture models; source coding; speech coding; transform coding
DOI
10.1109/TSA.2003.809192
Chinese Library Classification number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
A computationally efficient, high-quality vector quantization scheme based on a parametric probability density function (PDF) is proposed. In this scheme, the observations are modeled as i.i.d. realizations of a multivariate Gaussian mixture density. The mixture model parameters are efficiently estimated using the Expectation Maximization (EM) algorithm. A low-complexity quantization scheme using transform coding and bit allocation techniques, which allows for easy mapping from observation to quantized value, is developed for both fixed-rate and variable-rate systems. An attractive feature of this method is that source encoding using the resultant codebook involves very few searches, and its computational complexity is minimal and independent of the rate of the system. Furthermore, the proposed scheme is bit scalable and can switch seamlessly between a memoryless quantizer and a quantizer with memory. The usefulness of the approach is demonstrated for speech coding, where Gaussian mixture models are used to model speech line spectral frequencies. The performance of the memoryless quantizer is 1-3 bits better than conventional quantization schemes.
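The bit allocation step mentioned in the abstract can be illustrated with the textbook high-resolution rule for transform coding, in which each decorrelated component receives the average bit budget plus half the base-2 log of its variance relative to the geometric mean. This is only a minimal sketch under stated assumptions: the per-component variances of a single Gaussian mixture component are taken as already known, the function name `allocate_bits` and its parameters are hypothetical, and the allocation actually used in the paper may differ.

```python
import math


def allocate_bits(variances, total_bits):
    """Textbook high-resolution bit allocation (an assumption, not the paper's code).

    b_i = total_bits / d + 0.5 * log2(var_i / geometric_mean(variances))

    The result is real-valued and can be negative for very small variances;
    a practical coder would clamp and round, which is not shown here.
    """
    d = len(variances)
    # Mean of log2 variances equals log2 of the geometric mean.
    log_gm = sum(math.log2(v) for v in variances) / d
    return [total_bits / d + 0.5 * (math.log2(v) - log_gm) for v in variances]


if __name__ == "__main__":
    # Hypothetical variances of four decorrelated components, 8 bits in total.
    print(allocate_bits([16.0, 4.0, 1.0, 0.25], 8))
```

Components with larger variance receive more bits, and the allocations always sum to the total budget, so the rate is fixed regardless of how the variances are distributed.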
Pages: 130-142
Page count: 13