SPEECH PRESENCE PROBABILITY ESTIMATION BASED ON TEMPORAL CEPSTRUM SMOOTHING

Cited by: 12
Authors
Gerkmann, Timo [1 ]
Krawczyk, Martin [1 ]
Martin, Rainer [1 ]
Affiliations
[1] Ruhr Univ Bochum, Inst Commun Acoust IKA, D-44780 Bochum, Germany
Source
2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING | 2010
Keywords
Speech presence probability; speech analysis; cepstral analysis; speech enhancement; smoothing methods; ENHANCEMENT;
DOI
10.1109/ICASSP.2010.5495677
Chinese Library Classification (CLC) number
O42 [Acoustics];
Subject classification codes
070206 ; 082403 ;
Abstract
We propose a novel, robust estimator for the probability of speech presence at each time-frequency point in the short-time discrete Fourier domain. While existing estimators perform quite reliably in stationary noise environments, they usually exhibit a large false-alarm rate in nonstationary noise, which results in a great deal of noise leakage when they are applied to a speech enhancement task. The proposed estimator overcomes this problem by temporally smoothing the cepstrum of the a posteriori signal-to-noise ratio (SNR), and yields considerably less noise leakage and low speech distortion in both stationary and nonstationary noise compared to state-of-the-art estimators. Especially in babble noise, this results in large SNR improvements.
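The core idea named in the abstract, temporal smoothing of the cepstrum of the a posteriori SNR, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name, the first-order recursive smoother, the smoothing constant `alpha`, and the spectral floor are all assumptions made for illustration, and the mapping from the smoothed SNR to a speech presence probability is not described in the abstract and is therefore left out.

```python
import numpy as np


def smooth_posteriori_snr_cepstrum(noisy_psd, noise_psd, alpha=0.8, floor=1e-10):
    """Temporally smooth the cepstrum of the a posteriori SNR (illustrative sketch).

    noisy_psd : (frames, bins) one-sided periodogram |Y|^2 of the noisy signal.
    noise_psd : noise power estimate, broadcastable to noisy_psd.
    alpha     : hypothetical smoothing constant in [0, 1); a scalar, or an
                array with one value per cepstral (quefrency) bin.
    """
    # A posteriori SNR: noisy periodogram divided by the noise power estimate.
    snr_post = np.maximum(noisy_psd, floor) / np.maximum(noise_psd, floor)

    # Cepstrum: inverse DFT of the log-spectrum. irfft treats the one-sided
    # real log-spectrum as a symmetric full spectrum, so the result is real.
    cepstrum = np.fft.irfft(np.log(snr_post), axis=1)

    # First-order recursive smoothing along the time (frame) axis.
    alpha = np.broadcast_to(np.asarray(alpha, dtype=float), cepstrum.shape[1:])
    smoothed = np.empty_like(cepstrum)
    smoothed[0] = cepstrum[0]
    for t in range(1, cepstrum.shape[0]):
        smoothed[t] = alpha * smoothed[t - 1] + (1.0 - alpha) * cepstrum[t]

    # Back to the spectral domain; the imaginary part is numerically zero.
    return np.exp(np.fft.rfft(smoothed, axis=1).real)
```

With a scalar `alpha` this is equivalent to recursively smoothing the log-spectrum itself; passing an array lets the smoothing strength vary with quefrency, although any particular choice of values here would be a guess rather than something stated in this record.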
Pages: 4254-4257
Page count: 4