Noise tracking using DFT domain subspace decompositions

Cited by: 45
Authors
Hendriks, Richard C. [1]
Jensen, Jesper [2]
Heusdens, Richard [1]
Affiliations
[1] Delft Univ Technol, Dept Mediamat, NL-2628 CD Delft, Netherlands
[2] Oticon AS, DK-2765 Smorum, Denmark
Source
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2008, Vol. 16, No. 3
Keywords
discrete Fourier transform (DFT) domain subspace decompositions; noise tracking; speech enhancement
DOI
10.1109/TASL.2007.914977
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
All discrete Fourier transform (DFT) domain-based speech enhancement gain functions rely on knowledge of the noise power spectral density (PSD). Since the noise PSD is unknown in advance, it must be estimated from the noisy speech signal. Overestimating the noise PSD leads to a loss in speech quality, while underestimating it leads to an unnecessarily high level of residual noise. We present a novel approach for noise tracking, which updates the noise PSD for each DFT coefficient in the presence of both speech and noise. This method is based on the eigenvalue decomposition of correlation matrices that are constructed from time series of noisy DFT coefficients. The presented method tracks gradually changing noise types well. In comparison to state-of-the-art noise tracking algorithms, the proposed method reduces the estimation error between the estimated and the true noise PSD. In combination with an enhancement system, the proposed method improves the segmental SNR by several decibels for gradually changing noise types. Listening experiments show that the proposed system is preferred over the state-of-the-art noise tracking algorithm.
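The core idea in the abstract, an eigenvalue decomposition of a correlation matrix built from a time series of noisy DFT coefficients, can be illustrated with a minimal Python sketch. It is not the authors' algorithm. It assumes that, within one frequency bin, the speech contribution occupies a low-dimensional subspace while the noise is white across frames, so the smallest eigenvalues approximate the noise PSD. The function name `noise_psd_from_bin`, the sliding-window construction, and the fixed parameters `dim` and `signal_rank` are all hypothetical choices made for illustration.

```python
import numpy as np


def noise_psd_from_bin(dft_series, dim=8, signal_rank=1):
    """Estimate the noise PSD in one DFT bin from its coefficients over time.

    dft_series  : 1-D complex array, one DFT coefficient per frame (same bin).
    dim         : length of the vectors stacked from consecutive frames (assumed).
    signal_rank : assumed dimension of the speech subspace (fixed here; an
                  assumption of this sketch, not taken from the source).
    """
    x = np.asarray(dft_series, dtype=complex)
    n_vec = x.size - dim + 1
    if n_vec < dim:
        raise ValueError("time series too short for the requested dimension")
    if not 0 <= signal_rank < dim:
        raise ValueError("signal_rank must satisfy 0 <= signal_rank < dim")

    # Build overlapping vectors of `dim` consecutive coefficients, one per row.
    idx = np.arange(n_vec)[:, None] + np.arange(dim)[None, :]
    vectors = x[idx]  # shape (n_vec, dim)

    # Sample correlation matrix R = mean of v v^H over all vectors (Hermitian).
    corr = vectors.T @ vectors.conj() / n_vec

    # eigvalsh returns real eigenvalues in ascending order.
    eigvals = np.linalg.eigvalsh(corr)

    # Average the eigenvalues assigned to the noise-only subspace.
    return float(np.mean(eigvals[: dim - signal_rank]))
```

For a synthetic bin containing a complex exponential plus white complex Gaussian noise, the returned value should lie close to the noise variance, even though the signal is present in every frame. Choosing `signal_rank` automatically would require a model order selection step that this sketch leaves out.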
Pages: 541-553
Number of pages: 13