Time-domain noise reduction based on an orthogonal decomposition for desired signal extraction

Cited by: 16

Authors:

Benesty, Jacob [1]

Chen, Jingdong [2]

Huang, Yiteng [3]

Gaensler, Tomas [4]

Affiliations:

[1] Univ Quebec, INRS EMT, Montreal, PQ H5A 1K6, Canada

[2] Northwestern Polytech Univ, Xian 710072, Shaanxi, Peoples R China

[3] WeVoice Inc, Bridgewater, NJ 08807 USA

[4] Mh Acoust LLC, Summit, NJ 07901 USA

Source:

JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA | 2012 / Vol. 132 / No. 01

Keywords:

SPECTRAL DENSITY; SUPPRESSION;

DOI:

10.1121/1.4726071

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

This paper addresses the problem of noise reduction in the time domain where the clean speech sample at every time instant is estimated by filtering a vector of the noisy speech signal. Such a clean speech estimate consists of both the filtered speech and residual noise (filtered noise) as the noisy vector is the sum of the clean speech and noise vectors. Traditionally, the filtered speech is treated as the desired signal after noise reduction. This paper proposes to decompose the clean speech vector into two orthogonal components: one is correlated and the other is uncorrelated with the current clean speech sample. While the correlated component helps estimate the clean speech, it is shown that the uncorrelated component interferes with the estimation, just as the additive noise. Based on this orthogonal decomposition, the paper presents a way to define the error signal and cost functions and addresses the issue of how to design different optimal noise reduction filters by optimizing these cost functions. Specifically, it discusses how to design the maximum SNR filter, the Wiener filter, the minimum variance distortionless response (MVDR) filter, the tradeoff filter, and the linearly constrained minimum variance (LCMV) filter. It demonstrates that the maximum SNR, Wiener, MVDR, and tradeoff filters are identical up to a scaling factor. It also shows from the orthogonal decomposition that many performance measures can be defined, which seem to be more appropriate than the traditional ones for the evaluation of the noise reduction filters. © 2012 Acoustical Society of America. [http://dx.doi.org/10.1121/1.4726071]
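The decomposition described in the abstract can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: it assumes known covariance matrices for the clean speech and noise vectors, noise that is uncorrelated with the speech, and the textbook forms of the Wiener and MVDR filters. The function name `orthogonal_decomposition_filters` and all variable names are hypothetical, and the covariance values are synthetic. It splits the speech covariance into a part correlated with the current sample and an uncorrelated (interference) part, then checks that the two filters differ only by a scalar.

```python
import numpy as np

def orthogonal_decomposition_filters(R_x, R_v):
    """Sketch: R_x and R_v are assumed L x L covariance matrices of the
    clean speech vector and the noise vector (hypothetical inputs)."""
    sigma_x2 = R_x[0, 0]                 # variance of the current clean sample
    rho = R_x[:, 0] / sigma_x2           # normalized correlation vector, rho[0] == 1
    # Covariance of the speech component uncorrelated with the current sample
    R_interf = R_x - sigma_x2 * np.outer(rho, rho)
    R_in = R_interf + R_v                # interference-plus-noise covariance
    R_y = R_x + R_v                      # noisy-signal covariance
    # Wiener filter in its standard form: sigma_x^2 * R_y^{-1} * rho
    h_wiener = sigma_x2 * np.linalg.solve(R_y, rho)
    # MVDR filter: minimize h^T R_in h subject to h^T rho = 1
    t = np.linalg.solve(R_in, rho)
    h_mvdr = t / (rho @ t)
    return rho, h_wiener, h_mvdr

# Synthetic example: exponentially decaying speech correlation, white noise
L = 4
idx = np.arange(L)
R_x = 0.9 ** np.abs(idx[:, None] - idx[None, :])
R_v = 0.5 * np.eye(L)

rho, h_wiener, h_mvdr = orthogonal_decomposition_filters(R_x, R_v)
scale = h_wiener @ rho                   # scalar relating the two filters
print("MVDR distortionless constraint:", h_mvdr @ rho)
print("Wiener equals scaled MVDR:", np.allclose(h_wiener, scale * h_mvdr))
```

Because the noisy covariance equals the rank-one desired-signal term plus the interference-plus-noise covariance, the matrix inversion lemma makes the two solutions collinear, which is the "identical up to a scaling factor" statement in the abstract for this pair of filters.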


Pages: 452-464

Number of pages: 13

