SPEECH ENHANCEMENT BASED ON PHYSIOLOGICAL AND PSYCHOACOUSTICAL MODELS OF MODULATION PERCEPTION AND BINAURAL INTERACTION

Cited by: 74
Authors
KOLLMEIER, B [1]
KOCH, R [1]
Affiliation
[1] UNIV GOTTINGEN, DRITTES PHYS INST, D-37073 GOTTINGEN, GERMANY
DOI
10.1121/1.408546
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206; 082403;
Abstract
A novel approach for analyzing and filtering speech is described and evaluated which utilizes the "modulation spectrogram," i.e., the two-dimensional representation of modulation frequencies versus center frequency as a function of time. This approach is based on physiological findings of a tonotopical organization of modulation frequencies perpendicular to carrier frequencies as well as psychoacoustical findings of "modulation tuning curves." In addition, an interaction is assumed between the representation of modulation frequencies and the representation of auditory space as described by physiological and psychological models of binaural hearing. A noise-reduction algorithm based on this approach was implemented and tested which enhances or suppresses each combination of modulation frequency and center frequency according to its phase and intensity relation between the two input signals (i.e., both stereo channels of a dummy-head recording). When tested in several situations with interfering speakers and background noise, both in anechoic and reverberant environments, the algorithm provided a small but very robust increase in speech intelligibility which corresponds to approximately 2 dB in signal-to-noise ratio. Possible applications of this algorithm are noise reduction in adverse acoustical situations, digital hearing aids, processing schemes and preprocessing for speech recognition.
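The idea of a representation of modulation frequency versus center frequency can be illustrated with a short sketch. This is not the authors' implementation: it assumes a plain STFT filterbank in place of an auditory filterbank, it computes a single time slice rather than a full spectrogram over time, and it ignores the binaural stage entirely. The function names `band_envelopes` and `modulation_spectrum` and the parameters `frame_len` and `hop` are hypothetical, chosen only for illustration.

```python
import numpy as np


def band_envelopes(x, frame_len=256, hop=64):
    """Short-time magnitude spectrum of x.

    Rows are time frames, columns are center-frequency bins.
    Assumption: an FFT filterbank stands in for an auditory filterbank.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1))


def modulation_spectrum(x, fs, frame_len=256, hop=64):
    """Return (mod_freqs, center_freqs, M).

    M[m, k] is the magnitude of modulation frequency mod_freqs[m]
    in the envelope of the band centered at center_freqs[k].
    """
    env = band_envelopes(x, frame_len, hop)
    # Remove the mean of each band envelope so the DC term does not dominate.
    env = env - env.mean(axis=0, keepdims=True)
    # Second Fourier transform, taken along the time axis of each envelope.
    taper = np.hanning(env.shape[0])[:, None]
    M = np.abs(np.fft.rfft(env * taper, axis=0))
    env_rate = fs / hop  # sampling rate of the envelope signals in Hz
    mod_freqs = np.fft.rfftfreq(env.shape[0], d=1.0 / env_rate)
    center_freqs = np.fft.rfftfreq(frame_len, d=1.0 / fs)
    return mod_freqs, center_freqs, M
```

For a 1 kHz tone that is amplitude-modulated at 8 Hz, the largest entry of `M` should lie near a center frequency of 1 kHz and a modulation frequency of 8 Hz. A noise-reduction scheme of the kind the abstract describes would compute such a representation for both stereo channels and weight each (modulation frequency, center frequency) cell by comparing the two; that weighting rule is not specified in this record and is not sketched here.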
Pages: 1593-1602
Number of pages: 10