On the upper cutoff frequency of the auditory critical-band envelope detectors in the context of speech perception

Times cited: 115
Authors
Ghitza, O [1]
Affiliations
[1] Agere Syst, Media Signal Proc Res, Murray Hill, NJ 07974 USA
[2] Bell Labs, Lucent Technol, Murray Hill, NJ USA
DOI
10.1121/1.1396325
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Studies in neurophysiology and in psychophysics provide evidence for the existence of temporal integration mechanisms in the auditory system. These auditory mechanisms may be viewed as "detectors," parametrized by their cutoff frequencies. There is an interest in quantifying those cutoff frequencies by direct psychophysical measurement, in particular for tasks that are related to speech perception. In this study, the inherent difficulties in synthesizing speech signals with prescribed temporal envelope bandwidth at the output of the listener's cochlea have been identified. In order to circumvent these difficulties, a dichotic synthesis technique is suggested with interleaving critical-band envelopes. This technique is capable of producing signals which generate cochlear temporal envelopes with prescribed bandwidth. Moreover, for unsmoothed envelopes, the synthetic signal is perceptually indistinguishable from the original. With this technique established, psychophysical experiments have been conducted to quantify the upper cutoff frequency of the auditory critical-band envelope detectors at threshold, using high-quality, wideband speech signals (bandwidth of 7 kHz) as test stimuli. These experiments show that in order to preserve speech quality (i.e., for inaudible distortions), the minimum bandwidth of the envelope information for a given auditory channel is considerably smaller than a critical-band bandwidth (roughly one-half of one critical band). Difficulties encountered in using the dichotic synthesis technique to measure the cutoff frequencies relevant to intelligibility of speech signals with fair quality levels (e.g., above MOS level 3) are also discussed. © 2001 Acoustical Society of America.
Pages: 1628-1640
Number of pages: 13