Speech enhancement employing Laplacian-Gaussian mixture

Cited by: 44
Authors
Gazor, S [1 ]
Zhang, W [1 ]
Affiliations
[1] Queens Univ, Dept Elect & Comp Engn, Kingston, ON K7L 3N6, Canada
Source
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING | 2005 / Vol. 13 / No. 5
Keywords
adaptive Karhunen-Loeve transform; adaptive signal detection; adaptive signal processing; colored noise; decorrelated domains; decorrelation; decorrelation transformation; discrete cosine transforms; Gaussian distribution; generalized GD; Karhunen-Loeve transforms; Laplacian distribution; Laplacian-Gaussian Mixture; Laplacian random variables; linear minimum mean squared error estimation; marginal distributions; minimum mean squared error estimation; maximum likelihood estimation; multivariate distribution approximation; non-Gaussian distribution; nonlinear speech enhancement; speech activity detection; speech enhancement; speech probability distribution; speech processing; speech quality evaluation; speech samples distribution; speech signal statistics; time-varying speech components energy;
DOI
10.1109/TSA.2005.851943
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
A new, efficient speech enhancement algorithm (SEA) is developed in this paper. In this low-complexity SEA, a noisy speech signal is first decorrelated and then the clean speech components are estimated from the decorrelated noisy speech samples. The distributions of clean speech and noise signals are assumed to be Laplacian and Gaussian, respectively. The clean speech components are estimated either by maximum likelihood (ML) or minimum-mean-square-error (MMSE) estimators. These estimators require some statistical parameters derived from speech and noise. These parameters are adaptively extracted by the ML approach during the active speech or silence intervals, respectively. In addition, a voice activity detector (VAD) that uses the same statistical model is employed to detect whether the speech is active or not. The simulation results show that our SEA approach performs as well as a recent high efficiency SEA that employs the Wiener filter. The computational complexity of this algorithm is very low compared with existing SEAs with low computational complexity.
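The estimation step described in the abstract can be illustrated for a single decorrelated component. The sketch below is not taken from the paper: it assumes the observation model x = s + n with a zero-mean Laplacian prior p(s) = exp(-|s|/a) / (2a) on the clean component and zero-mean Gaussian noise of standard deviation sigma, and evaluates the posterior mean E[s | x] in closed form. The function names and the parameterization (a, sigma) are hypothetical, and the soft-threshold function is the MAP estimate under the same assumed model, included only for comparison; it is not claimed to be the paper's ML estimator.

```python
import math


def laplacian_gaussian_mmse(x, a, sigma):
    """Posterior mean E[s | x] for x = s + n (assumed model, see lead-in).

    s ~ Laplacian with scale a:  p(s) = exp(-|s| / a) / (2 * a)
    n ~ Gaussian with zero mean and standard deviation sigma
    """
    if a <= 0 or sigma <= 0:
        raise ValueError("a and sigma must be positive")
    shift = sigma * sigma / a           # shift of the posterior mode on each half-line
    scale = math.sqrt(2.0) * sigma
    # Unnormalized posterior mass on s > 0 and on s < 0.
    w_pos = math.exp(-x / a) * math.erfc((shift - x) / scale)
    w_neg = math.exp(x / a) * math.erfc((shift + x) / scale)
    # Weighted average of the two shifted observations.
    return ((x - shift) * w_pos + (x + shift) * w_neg) / (w_pos + w_neg)


def soft_threshold(x, a, sigma):
    """MAP estimate under the same assumed model: shrink |x| by sigma**2 / a."""
    threshold = sigma * sigma / a
    return math.copysign(max(abs(x) - threshold, 0.0), x)


if __name__ == "__main__":
    # Illustrative values only; a and sigma would be estimated adaptively in practice.
    for x in (-2.0, -0.1, 0.0, 0.1, 2.0):
        print(x, laplacian_gaussian_mmse(x, 1.0, 0.5), soft_threshold(x, 1.0, 0.5))
```

Both estimators shrink the noisy component toward zero, and more strongly when the noise variance is large relative to the speech scale a. The closed form above is not numerically hardened: math.exp(x / a) overflows when |x| / a is very large, so a practical version would work in the log domain.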
Pages: 896-904
Page count: 9
Related papers
18 in total
[1]  
BREITHAUPT C, 2003, P INT C AC SPEECH SI, V1, P896
[2]  
CHANG JH, 2004, UNPUB J SIGNAL P APR
[3]  
CHANG JH, 2002, P IEEE SPEECH COD WO
[4]   Estimating cepstrum of speech under the presence of noise using a joint prior of static and dynamic features [J].
Deng, L ;
Droppo, J ;
Acero, A .
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2004, 12 (03) :218-233
[5]   Enhancement of log Mel power spectra of speech using a phase-sensitive model of the acoustic environment and sequential estimation of the corrupting noise [J].
Deng, L ;
Droppo, J ;
Acero, A .
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2004, 12 (02) :133-143
[6]   A SIGNAL SUBSPACE APPROACH FOR SPEECH ENHANCEMENT [J].
EPHRAIM, Y ;
VANTREES, HL .
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1995, 3 (04) :251-266
[7]   SPEECH ENHANCEMENT USING A MINIMUM MEAN-SQUARE ERROR SHORT-TIME SPECTRAL AMPLITUDE ESTIMATOR [J].
EPHRAIM, Y ;
MALAH, D .
IEEE TRANSACTIONS ON ACOUSTICS SPEECH AND SIGNAL PROCESSING, 1984, 32 (06) :1109-1121
[8]   A soft voice activity detector based on a Laplacian-Gaussian model [J].
Gazor, S ;
Zhang, W .
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2003, 11 (05) :498-505
[9]   Speech probability distribution [J].
Gazor, S ;
Zhang, W .
IEEE SIGNAL PROCESSING LETTERS, 2003, 10 (07) :204-207
[10]  
GAZOR S, 2004, P IEEE CAN C EL COMP