Time-domain noise reduction based on an orthogonal decomposition for desired signal extraction

Cited by: 16
Authors
Benesty, Jacob [1]
Chen, Jingdong [2]
Huang, Yiteng [3]
Gaensler, Tomas [4]
Affiliations
[1] Univ Quebec, INRS EMT, Montreal, PQ H5A 1K6, Canada
[2] Northwestern Polytech Univ, Xian 710072, Shaanxi, Peoples R China
[3] WeVoice Inc, Bridgewater, NJ 08807 USA
[4] Mh Acoust LLC, Summit, NJ 07901 USA
Keywords
SPECTRAL DENSITY; SUPPRESSION
DOI
10.1121/1.4726071
Chinese Library Classification number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
This paper addresses the problem of noise reduction in the time domain where the clean speech sample at every time instant is estimated by filtering a vector of the noisy speech signal. Such a clean speech estimate consists of both the filtered speech and residual noise (filtered noise) as the noisy vector is the sum of the clean speech and noise vectors. Traditionally, the filtered speech is treated as the desired signal after noise reduction. This paper proposes to decompose the clean speech vector into two orthogonal components: one is correlated and the other is uncorrelated with the current clean speech sample. While the correlated component helps estimate the clean speech, it is shown that the uncorrelated component interferes with the estimation, just as the additive noise. Based on this orthogonal decomposition, the paper presents a way to define the error signal and cost functions and addresses the issue of how to design different optimal noise reduction filters by optimizing these cost functions. Specifically, it discusses how to design the maximum SNR filter, the Wiener filter, the minimum variance distortionless response (MVDR) filter, the tradeoff filter, and the linearly constrained minimum variance (LCMV) filter. It demonstrates that the maximum SNR, Wiener, MVDR, and tradeoff filters are identical up to a scaling factor. It also shows from the orthogonal decomposition that many performance measures can be defined, which seem to be more appropriate than the traditional ones for the evaluation of the noise reduction filters. (C) 2012 Acoustical Society of America. [http://dx.doi.org/10.1121/1.4726071]
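The decomposition and the filter equivalence described in the abstract can be illustrated numerically. The following is a minimal sketch, not the authors' implementation. It assumes that the observation vector is ordered so that its first entry is the current sample, that the clean speech and noise are uncorrelated, and that their covariance matrices `R_x` and `R_v` are known. The function name `decompose_and_design` and all variable names are hypothetical, and the closed forms used for the Wiener and MVDR filters are the standard ones for this signal model rather than expressions quoted from the record above.

```python
import numpy as np

def decompose_and_design(R_x, R_v):
    """Split the clean-speech covariance into correlated and uncorrelated
    parts and build Wiener and MVDR filters from them.

    Assumption: index 0 of every vector corresponds to the current sample x(k).
    """
    sigma_x2 = R_x[0, 0]                  # variance of the current clean sample
    rho = R_x[:, 0] / sigma_x2            # normalized correlation vector, rho[0] == 1
    R_xd = sigma_x2 * np.outer(rho, rho)  # covariance of the component correlated with x(k)
    R_xi = R_x - R_xd                     # covariance of the uncorrelated (interference) component
    R_in = R_xi + R_v                     # interference plus additive noise
    R_y = R_x + R_v                       # covariance of the noisy observation vector

    # Wiener filter: minimizes the mean-square error with respect to x(k).
    h_wiener = sigma_x2 * np.linalg.solve(R_y, rho)

    # MVDR filter: minimizes interference-plus-noise power subject to h^T rho = 1.
    t = np.linalg.solve(R_in, rho)
    h_mvdr = t / (rho @ t)
    return rho, R_xi, h_wiener, h_mvdr

# Illustrative input: first-order autoregressive speech model plus white noise.
L = 4
R_x = np.array([[0.9 ** abs(i - j) for j in range(L)] for i in range(L)])
R_v = 0.5 * np.eye(L)
rho, R_xi, h_wiener, h_mvdr = decompose_and_design(R_x, R_v)
print("Wiener:", h_wiener)
print("MVDR:  ", h_mvdr)
```

Under these assumptions the first column of `R_xi` is zero, which reflects that the interference component is uncorrelated with the current sample, and the two printed filters differ only by a scalar factor, in line with the abstract's statement that the Wiener and MVDR filters are identical up to scaling.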
Pages: 452-464
Page count: 13
Related papers
18 in total
  • [1] Avargel Y, 2010, SPRINGER TOP SIGN PR, V3, P1
  • [2] Benesty J, 2005, SIG COM TEC, P9, DOI 10.1007/3-540-27489-8_2
  • [3] Benesty J, 2009, SPRINGER TOP SIGN PR, V2, P1, DOI 10.1007/978-3-642-00296-0_1
  • [4] SUPPRESSION OF ACOUSTIC NOISE IN SPEECH USING SPECTRAL SUBTRACTION
    BOLL, SF
    [J]. IEEE TRANSACTIONS ON ACOUSTICS SPEECH AND SIGNAL PROCESSING, 1979, 27 (02): 113 - 120
  • [5] HIGH-RESOLUTION FREQUENCY-WAVENUMBER SPECTRUM ANALYSIS
    CAPON, J
    [J]. PROCEEDINGS OF THE IEEE, 1969, 57 (08): 1408 - &
  • [6] Enhanced Itakura measure incorporating masking properties of human auditory system
    Chen, G
    Koh, SN
    Soon, IY
    [J]. SIGNAL PROCESSING, 2003, 83 (07): 1445 - 1456
  • [7] New insights into the noise reduction Wiener filter
    Chen, Jingdong
    Benesty, Jacob
    Huang, Yiteng
    Doclo, Simon
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2006, 14 (04): 1218 - 1234
  • [8] Study of the Noise-Reduction Problem in the Karhunen-Loeve Expansion Domain
    Chen, Jingdong
    Benesty, Jacob
    Huang, Yiteng
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2009, 17 (04): 787 - 802
  • [9] SPEECH ENHANCEMENT USING A MINIMUM MEAN-SQUARE ERROR SHORT-TIME SPECTRAL AMPLITUDE ESTIMATOR
    EPHRAIM, Y
    MALAH, D
    [J]. IEEE TRANSACTIONS ON ACOUSTICS SPEECH AND SIGNAL PROCESSING, 1984, 32 (06): 1109 - 1121
  • [10] ER MH, 1983, IEEE T ACOUST SPEECH, V31, P1378