A glimpsing model of speech perception in noise

Cited by: 560
Authors
Cooke, M [1]
Affiliations
[1] Univ Sheffield, Dept Comp Sci, Sheffield S1 4DP, S Yorkshire, England
DOI
10.1121/1.2166600
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Do listeners process noisy speech by taking advantage of "glimpses," that is, spectrotemporal regions in which the target signal is least affected by the background? This study used an automatic speech recognition system, adapted for use with partially specified inputs, to identify consonants in noise. Twelve masking conditions were chosen to create a range of glimpse sizes. Several different glimpsing models were employed, differing in the local signal-to-noise ratio (SNR) used for detection, the minimum glimpse size, and the use of information in the masked regions. Recognition results were compared with behavioral data. A quantitative analysis demonstrated that the proportion of the time-frequency plane glimpsed is a good predictor of intelligibility. Recognition scores in each noise condition confirmed that sufficient information exists in glimpses to support consonant identification. Close fits to listeners' performance were obtained at two local SNR thresholds: one at around 8 dB and another in the range -5 to -2 dB. A transmitted information analysis revealed that cues to voicing are degraded more in the model than in human auditory processing. © 2006 Acoustical Society of America.
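
The glimpse quantities named in the abstract (a local SNR threshold, a minimum glimpse size, and the proportion of the time-frequency plane glimpsed) can be illustrated with a small sketch. This is not the paper's implementation: the function names, the plain nested-list energy representation, the 4-connected definition of a glimpse region, and the toy values are all assumptions made for illustration, and the auditory front end and recognizer used in the study are not modeled here.

```python
import math


def local_snr_db(speech_energy, noise_energy, eps=1e-12):
    """Local SNR in dB for one time-frequency cell (eps avoids log of zero)."""
    return 10.0 * math.log10((speech_energy + eps) / (noise_energy + eps))


def glimpse_mask(speech, noise, threshold_db):
    """Mark cells whose local SNR exceeds threshold_db.

    speech, noise: equally shaped lists of rows, [channel][frame],
    holding non-negative energies (hypothetical representation).
    """
    return [
        [local_snr_db(s, n) > threshold_db for s, n in zip(s_row, n_row)]
        for s_row, n_row in zip(speech, noise)
    ]


def remove_small_glimpses(mask, min_size):
    """Keep only 4-connected regions with at least min_size cells.

    Treating a glimpse as a 4-connected region is an assumption of this sketch.
    """
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    kept = [[False] * cols for _ in range(rows)]
    seen = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Flood fill to collect one connected region.
            region, stack = [], [(r, c)]
            seen[r][c] = True
            while stack:
                i, j = stack.pop()
                region.append((i, j))
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ni, nj = i + di, j + dj
                    if (0 <= ni < rows and 0 <= nj < cols
                            and mask[ni][nj] and not seen[ni][nj]):
                        seen[ni][nj] = True
                        stack.append((ni, nj))
            if len(region) >= min_size:
                for i, j in region:
                    kept[i][j] = True
    return kept


def glimpse_proportion(mask):
    """Fraction of the time-frequency plane marked as glimpsed."""
    total = sum(len(row) for row in mask)
    glimpsed = sum(1 for row in mask for cell in row if cell)
    return glimpsed / total if total else 0.0


# Toy 3-channel x 3-frame example with made-up energies.
speech = [[10.0, 10.0, 0.1], [10.0, 0.1, 0.1], [0.1, 0.1, 10.0]]
noise = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
mask = glimpse_mask(speech, noise, threshold_db=0.0)
filtered = remove_small_glimpses(mask, min_size=2)
```

In this toy case the isolated cell in the last channel is discarded by the minimum-size step, so `glimpse_proportion(filtered)` is smaller than `glimpse_proportion(mask)`. Varying `threshold_db` and `min_size` is one way to picture how the different glimpsing models described above would differ.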
Pages: 1562-1573
Page count: 12