A SIGNAL SUBSPACE APPROACH FOR SPEECH ENHANCEMENT

Cited by: 646
Authors
EPHRAIM, Y [1]
VANTREES, HL [1]
Affiliations
[1] AT&T BELL LABS,TECH STAFF,MURRAY HILL,NJ 07974
Source
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING | 1995 / Vol. 3 / No. 04
Keywords
DOI
10.1109/89.397090
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
A comprehensive approach for nonparametric speech enhancement is developed. The underlying principle is to decompose the vector space of the noisy signal into a signal-plus-noise subspace and a noise subspace. Enhancement is performed by removing the noise subspace and estimating the clean signal from the remaining signal subspace. The decomposition can theoretically be performed by applying the Karhunen-Loeve transform (KLT) to the noisy signal. Linear estimation of the clean signal is performed using two perceptually meaningful estimation criteria. First, signal distortion is minimized while the residual noise energy is maintained below some given threshold. This criterion results in a Wiener filter with adjustable input noise level. Second, signal distortion is minimized for a fixed spectrum of the residual noise. This criterion enables masking of the residual noise by the speech signal. It results in a filter whose structure is similar to that obtained in the first case, except that now the gain function which modifies the KLT coefficients is solely dependent on the desired spectrum of the residual noise. The popular spectral subtraction speech enhancement approach is shown to be a particular case of the proposed approach. It is proven to be a signal subspace approach which is optimal in an asymptotic (large sample) linear minimum mean square error sense, assuming the signal and noise are stationary. Our listening tests indicate that 14 out of 16 listeners strongly preferred the proposed approach over the spectral subtraction approach.
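The first criterion in the abstract (a Wiener-type gain applied to KLT coefficients, with the noise subspace removed) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes white noise of known variance, zero-mean frames, and a covariance estimated from the noisy frames themselves. The names subspace_enhance, noise_var, and mu are hypothetical; mu stands in for the adjustable input noise level.

```python
import numpy as np

def subspace_enhance(noisy_frames, noise_var, mu=1.0):
    """Sketch of a signal subspace estimator under a white-noise assumption.

    noisy_frames: array of shape (num_frames, K), one noisy vector per row.
    noise_var:    assumed known variance of the white noise (must be > 0).
    mu:           assumed knob scaling the noise level in the Wiener-type gain.
    """
    if noise_var <= 0 or mu < 0:
        raise ValueError("noise_var must be positive and mu non-negative")
    y = np.asarray(noisy_frames, dtype=float)

    # Sample covariance of the noisy vectors (frames assumed zero-mean).
    r_y = y.T @ y / y.shape[0]

    # KLT basis: eigenvectors of the symmetric noisy covariance matrix.
    eigvals, eigvecs = np.linalg.eigh(r_y)

    # With white noise, clean-signal eigenvalues are the noisy ones minus
    # the noise variance; directions with nothing left form the noise subspace.
    sig_eigvals = np.maximum(eigvals - noise_var, 0.0)

    # Wiener-type gain on the signal subspace, exactly zero on the noise subspace.
    denom = sig_eigvals + mu * noise_var
    gains = np.divide(sig_eigvals, denom,
                      out=np.zeros_like(sig_eigvals),
                      where=sig_eigvals > 0)

    # Linear estimator H = U diag(gains) U^T applied to every frame.
    h = (eigvecs * gains) @ eigvecs.T
    return y @ h.T, gains
```

In this sketch, raising mu lowers every gain, which trades more signal distortion for less residual noise; mu = 1 gives the ordinary Wiener gain on the signal subspace.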
Pages: 251-266
Page count: 16
Related papers
44 entries in total
[1] Lim J.S., Speech Enhancement. Englewood Cliffs, (1983)
[2] Lim J.S., Oppenheim A.V., Enhancement and bandwidth compression of noisy speech, Proc. IEEE, 67, 12, pp. 1586-1604, (1979)
[3] Makhoul J., et al., Removal of Noise From Noise-Degraded Speech Signals, (1989)
[4] O'Shaughnessy D., Enhancing speech degraded by additive noise or interfering speakers, IEEE Commun. Mag., pp. 46-52, (1989)
[5] Boll S.F., Speech enhancement in the 1980s: Noise suppression with pattern matching, Advances in Speech Signal Processing (S. Furui and M. M. Sondhi, Eds.), (1992)
[6] Ephraim Y., Statistical model based speech enhancement systems, Proc. IEEE, 80, 10, pp. 1526-1555, (1992)
[7] Tribolet J.M., Crochiere R.E., Frequency domain coding of speech, IEEE Trans. Acoust., ASSP-27, pp. 512-530, (1979)
[8] Flanagan J.L., Parametric coding of speech spectra, J. Acoust. Soc. Amer., 68, 2, pp. 412-419, (1980)
[9] Quatieri T.F., McAulay R.J., Phase coherence in speech reconstruction for enhancement and coding applications, Proc. IEEE Int. Conf. Acoust., pp. 207-210, (1989)
[10] Noise reduction using a soft-decision sine-wave vector quantizer, Proc. IEEE Int. Conf. Acoust., pp. 821-824, (1990)