Minimum mean-square error estimation of discrete Fourier coefficients with generalized gamma priors

Cited by: 205
Authors
Erkelens, Jan S. [1]
Hendriks, Richard C.
Heusdens, Richard
Jensen, Jesper
Affiliations
[1] Delft Univ Technol, Informat & Commun Theory Grp, NL-2628 CD Delft, Netherlands
[2] Delft Univ Technol, Dept Mediamat, NL-2628 CD Delft, Netherlands
[3] Oticon A S, DK-2765 Smorum, Denmark
Source
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2007 / Vol. 15 / No. 06
Keywords
discrete Fourier transform (DFT)-based speech enhancement; generalized gamma speech priors; minimum mean-square error (MMSE) estimation; SPEECH ENHANCEMENT; NOISE; SUPPRESSION;
DOI
10.1109/TASL.2007.899233
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
This paper considers techniques for single-channel speech enhancement based on the discrete Fourier transform (DFT). Specifically, we derive minimum mean-square error (MMSE) estimators of speech DFT coefficient magnitudes as well as of complex-valued DFT coefficients based on two classes of generalized gamma distributions, under an additive Gaussian noise assumption. The resulting generalized DFT magnitude estimator has as a special case the existing scheme based on a Rayleigh speech prior, while the complex DFT estimators generalize existing schemes based on Gaussian, Laplacian, and Gamma speech priors. Extensive simulation experiments with speech signals degraded by various additive noise sources verify that significant improvements are possible with the more recent estimators based on super-Gaussian priors. The increase in perceptual evaluation of speech quality (PESQ) over the noisy signals is about 0.5 points for street noise and about 1 point for white noise, nearly independent of input signal-to-noise ratio (SNR). The assumptions made for deriving the complex DFT estimators are less accurate than those for the magnitude estimators, leading to a higher maximum achievable speech quality with the magnitude estimators.
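As a point of orientation for the abstract, the simplest of the existing schemes it mentions, the complex DFT estimator under a Gaussian speech prior and additive Gaussian noise, reduces to the well-known Wiener gain xi / (1 + xi), where xi is the a priori SNR. The sketch below illustrates only that Gaussian special case, not the generalized gamma estimators derived in the paper; the function names and the assumption that the speech and noise variances are already known are hypothetical choices made for illustration.

```python
def wiener_gain(xi: float) -> float:
    """MMSE gain for a Gaussian speech prior in Gaussian noise.

    xi is the a priori SNR (speech variance / noise variance).
    """
    if xi < 0:
        raise ValueError("a priori SNR must be non-negative")
    return xi / (1.0 + xi)


def enhance_coefficient(noisy: complex, speech_var: float, noise_var: float) -> complex:
    """Estimate one clean complex DFT coefficient from its noisy observation.

    Assumes (for illustration) that both variances are known; in practice
    they would have to be estimated from the noisy signal.
    """
    if noise_var <= 0:
        raise ValueError("noise variance must be positive")
    xi = speech_var / noise_var
    # The gain is real-valued, so the noisy phase is left unchanged.
    return wiener_gain(xi) * noisy


if __name__ == "__main__":
    # A priori SNR of 3 (about 4.8 dB) gives a gain of 0.75.
    print(enhance_coefficient(2 + 2j, speech_var=3.0, noise_var=1.0))
```

Super-Gaussian priors such as the Laplacian or Gamma replace this linear gain with a nonlinear function of the noisy coefficient, which is the generalization the abstract refers to.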
Pages: 1741-1752
Page count: 12
Related papers
32 in total
  • [1] Abramowitz M., 1994, Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables
  • [2] Andrianakis I, 2006, P INT C AC SPEECH SI, V3, P1068, DOI 10.1109/ICASSP.2006.1660842
  • [3] [Anonymous], 2005, Speech Enhancement
  • [4] [Anonymous], NOIZEUS NOISY SPEECH
  • [5] BEERENDS JG, 2004, PESQ ASSESSING SPEEC, P862
  • [6] SUPPRESSION OF ACOUSTIC NOISE IN SPEECH USING SPECTRAL SUBTRACTION
    BOLL, SF
    [J]. IEEE TRANSACTIONS ON ACOUSTICS SPEECH AND SIGNAL PROCESSING, 1979, 27 (02): 113 - 120
  • [7] Relaxed statistical model for speech enhancement and a priori SNR estimation
    Cohen, I
    [J]. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2005, 13 (05): 870 - 881
  • [8] DAT TH, 2005, P 30 IEEE INT C AC S, V4, P181
  • [9] Deller JR, 2000, DISCRETE TIME PROCES, DOI 10.1109/9780470544402.CH11
  • [10] SPEECH ENHANCEMENT USING A MINIMUM MEAN-SQUARE ERROR LOG-SPECTRAL AMPLITUDE ESTIMATOR
    EPHRAIM, Y
    MALAH, D
    [J]. IEEE TRANSACTIONS ON ACOUSTICS SPEECH AND SIGNAL PROCESSING, 1985, 33 (02): 443 - 445