A minimum distortion noise reduction algorithm with multiple microphones

Cited by: 49
Authors
Chen, Jingdong [1]
Benesty, Jacob [2]
Huang, Yiteng [3]
Affiliations
[1] Bell Labs, Murray Hill, NJ 07974 USA
[2] Univ Quebec, INRS EMT, Montreal, PQ H5A 1K6, Canada
[3] WeVoice Inc, Bridgewater, NJ 08807 USA
Source
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2008 / Vol. 16 / No. 03
Keywords
beamforming; generalized sidelobe canceller (GSC); linearly constrained minimum variance (LCMV); microphone arrays; minimum-mean-square error (MMSE); minimum variance distortionless response (MVDR); noise reduction; speech enhancement;
DOI
10.1109/TASL.2007.914969
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
The problem of noise reduction using multiple microphones has long been an active area of research. Over the past few decades, most efforts have been devoted to beamforming techniques, which aim at recovering the desired source signal from the outputs of an array of microphones. In order to work reasonably well in reverberant environments, this approach often requires such knowledge as the direction of arrival (DOA) or even the room impulse responses, which are difficult to acquire reliably in practice. In addition, beamforming has to compromise its noise reduction performance in order to achieve speech dereverberation at the same time. This paper presents a new multichannel algorithm for noise reduction, which formulates the problem as one of estimating the speech component observed at one microphone using the observations from all the available microphones. This new approach explicitly uses the idea of spatial-temporal prediction and achieves noise reduction in two steps. The first step is to determine a set of inter-sensor optimal spatial-temporal prediction transformations. These transformations are then exploited in the second step to form an optimal noise-reduction filter. In comparison with traditional beamforming techniques, this new method has many appealing properties: it does not require DOA information or any knowledge of either the reverberation condition or the channel impulse responses; the multiple microphones do not have to be arranged into a specific array geometry; it works the same for both the far-field and near-field cases; and, most importantly, it can produce very good and robust noise reduction with minimum speech distortion in practical environments. Furthermore, with this new approach, it is possible to apply postprocessing filtering for additional noise reduction when a specified level of speech distortion is allowed.
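The two-step idea in the abstract can be illustrated with a heavily simplified sketch. It is not the paper's algorithm: it assumes single-tap (scalar) inter-sensor prediction gains instead of spatial-temporal prediction matrices, noise that is uncorrelated across microphones, and a separate noise-only segment for estimating noise statistics. All function names, gain values, and noise levels are hypothetical and chosen only for illustration.

```python
import random


def mean_product(a, b):
    """Sample estimate of E[a*b] for two equal-length sequences."""
    return sum(p * q for p, q in zip(a, b)) / len(a)


def prediction_gains(noisy, noise_only):
    """Step 1 (single-tap simplification): gains w_n such that x_n ~ w_n * x_1.

    The speech components x_n are not observable, so their statistics are
    estimated as (noisy statistics) - (noise statistics), assuming speech
    and noise are uncorrelated.
    """
    ref_y, ref_v = noisy[0], noise_only[0]
    speech_power_ref = mean_product(ref_y, ref_y) - mean_product(ref_v, ref_v)
    gains = []
    for y_n, v_n in zip(noisy, noise_only):
        cross = mean_product(y_n, ref_y) - mean_product(v_n, ref_v)
        gains.append(cross / speech_power_ref)
    return gains


def distortionless_weights(gains, noise_powers):
    """Step 2: minimise output noise power subject to sum_n h_n * w_n = 1.

    With noise assumed uncorrelated across microphones, the closed form is
    h_n = (w_n / s_n) / sum_m (w_m^2 / s_m), where s_n is the noise power.
    """
    scaled = [w / s for w, s in zip(gains, noise_powers)]
    norm = sum(w * g for w, g in zip(gains, scaled))
    return [g / norm for g in scaled]


def simulate(num_samples=20000, seed=0):
    """Toy data: one source seen by three microphones with unknown gains."""
    rng = random.Random(seed)
    true_gains = [1.0, 0.8, -0.6]   # hypothetical; no array geometry implied
    noise_std = [0.5, 0.4, 0.6]     # hypothetical per-microphone noise levels
    speech = [rng.gauss(0, 1) for _ in range(num_samples)]
    noisy = [[g * x + rng.gauss(0, s) for x in speech]
             for g, s in zip(true_gains, noise_std)]
    # Separate noise-only segment, as if taken from speech pauses.
    noise_only = [[rng.gauss(0, s) for _ in range(num_samples)]
                  for s in noise_std]
    return speech, noisy, noise_only


if __name__ == "__main__":
    speech, noisy, noise_only = simulate()
    w = prediction_gains(noisy, noise_only)
    s = [mean_product(v, v) for v in noise_only]
    h = distortionless_weights(w, s)
    residual = sum(a * a * p for a, p in zip(h, s))
    print("estimated gains:", w)
    print("noise power at mic 1:", s[0], "after filtering:", residual)
```

The constraint in the second step keeps the speech component observed at the first microphone unchanged, which is the "minimum distortion" aspect; neither step uses a direction of arrival or an array geometry, only second-order statistics of the microphone signals.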
Pages: 481-493
Number of pages: 13