The contribution of changes in F0 and spectral tilt to increased intelligibility of speech produced in noise

Cited by: 133
Authors
Lu, Youyi [1]
Cooke, Martin [2]
Affiliations
[1] Univ Sheffield, Dept Comp Sci, Sheffield S1 4DP, S Yorkshire, England
[2] Univ Basque Country, Fac Letras, Language & Speech Lab, Vitoria, Spain
Keywords
Intelligibility; Noise; Speech production; Spectral tilt; FUNDAMENTAL-FREQUENCY; SPEAKER INTELLIGIBILITY; NORMAL-HEARING; CLEAR SPEECH; RECOGNITION; LISTENERS; ENHANCEMENT; PERCEPTION; ENVIRONMENTS; CHILDREN;
DOI
10.1016/j.specom.2009.07.002
Chinese Library Classification
O42 [Acoustics];
Subject classification codes
070206 ; 082403 ;
Abstract
Talkers modify the way they speak in the presence of noise. As well as increases in voice level and fundamental frequency (F0), a flattening of spectral tilt is observed. The resulting "Lombard speech" is typically more intelligible than speech produced in quiet, even when level differences are removed. What is the cause of the enhanced intelligibility of Lombard speech? The current study explored the relative contributions to intelligibility of changes in mean F0 and spectral tilt. The roles of F0 and spectral tilt were assessed by measuring the intelligibility gain of non-Lombard speech whose mean F0 and spectrum were manipulated, both independently and in concert, to simulate those of natural Lombard speech. In the presence of speech-shaped noise, flattening of spectral tilt contributed greatly to the intelligibility gain of noise-induced speech over speech produced in quiet, while an increase in F0 did not have a significant influence. The perceptual effect of spectrum flattening was attributed to its ability to increase the amount of the speech time-frequency plane "glimpsed" in the presence of noise. However, spectral tilt changes alone could not fully account for the intelligibility of Lombard speech. Other changes observed in Lombard speech, such as durational modifications, may well contribute to intelligibility. (C) 2009 Elsevier B.V. All rights reserved.
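The idea of flattening spectral tilt while removing level differences can be illustrated with a minimal sketch. This is not the authors' manipulation, which the abstract does not specify. It assumes a generic first-order pre-emphasis filter as a stand-in for tilt flattening, followed by rescaling to the original RMS level. The function names, the coefficient `alpha`, and the two-tone test signal are all hypothetical choices made for illustration.

```python
import math

def rms(x):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(v * v for v in x) / len(x))

def pre_emphasis(x, alpha=0.95):
    """First-order high-pass filter y[n] = x[n] - alpha * x[n-1].

    Assumption: used here only as a simple way to boost high
    frequencies relative to low ones (a flatter spectral tilt).
    """
    y = [x[0]]
    for n in range(1, len(x)):
        y.append(x[n] - alpha * x[n - 1])
    return y

def flatten_tilt(x, alpha=0.95):
    """Apply pre-emphasis, then rescale so the output RMS equals the input RMS."""
    y = pre_emphasis(x, alpha)
    gain = rms(x) / rms(y)
    return [gain * v for v in y]

def tone(freq_hz, fs=16000, n=1600):
    """Unit-amplitude sinusoid (hypothetical test signal)."""
    return [math.sin(2 * math.pi * freq_hz * i / fs) for i in range(n)]

# Strong low-frequency component plus a weak high-frequency component,
# loosely mimicking a steeply tilted spectrum.
low, high = tone(200.0), tone(3000.0)
mix = [a + 0.1 * b for a, b in zip(low, high)]
out = flatten_tilt(mix)

# Per-tone gain of the filter: the high tone is attenuated far less than the low one.
low_gain = rms(pre_emphasis(low)) / rms(low)
high_gain = rms(pre_emphasis(high)) / rms(high)
```

Because the output is rescaled to the input RMS, any change in the balance between the two tones reflects the spectral shape only, not overall level, which mirrors the abstract's point that the Lombard benefit persists when level differences are removed.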
Pages: 1253-1262
Number of pages: 10
Related papers
42 in total
[1]   Identification of frequency-shifted vowels [J].
Assmann, Peter F. ;
Nearey, Terrance M. .
JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2008, 124 (05) :3203-3212
[2]   Synthesis fidelity and time-varying spectral change in vowels [J].
Assmann, PF ;
Katz, WF .
JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2005, 117 (02) :886-895
[3]  
ASSMANN PF, 2002, P 7 INT C SPOK LANG, P425
[4]   Modelling speaker intelligibility in noise [J].
Barker, Jon ;
Cooke, Martin .
SPEECH COMMUNICATION, 2007, 49 (05) :402-417
[5]  
Boersma P., 1993, Institute of Phonetic Sciences, University of Amsterdam, Proceedings 17 (1993) 97-110, P97
[6]   A NOTE ON THE ACOUSTIC-PHONETIC CHARACTERISTICS OF INADVERTENTLY CLEAR SPEECH [J].
BOND, ZS ;
MOORE, TJ .
SPEECH COMMUNICATION, 1994, 14 (04) :325-337
[7]   A glimpsing model of speech perception in noise [J].
Cooke, M .
JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2006, 119 (03) :1562-1573
[8]   An audio-visual corpus for speech perception and automatic speech recognition (L) [J].
Cooke, Martin ;
Barker, Jon ;
Cunningham, Stuart ;
Shao, Xu .
JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2006, 120 (05) :2421-2424
[9]   INTELLIGIBILITY OF AVERAGE TALKERS IN TYPICAL LISTENING ENVIRONMENTS [J].
COX, RM ;
ALEXANDER, GC ;
GILMORE, C .
JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1987, 81 (05) :1598-1608
[10]   EFFECTS OF AMBIENT NOISE ON SPEAKER INTELLIGIBILITY FOR WORDS AND PHRASES [J].
DREHER, JJ ;
ONEILL, JJ .
JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1957, 29 (12) :1320-1323