Auditory speech detection in noise enhanced by lipreading

Cited: 131
Authors
Bernstein, LE
Auer, ET
Takayanagi, S
Institutions
[1] House Ear Res Inst, Dept Commun Neurosci, Los Angeles, CA 90057 USA
[2] Natl Sci Fdn, Arlington, VA 22230 USA
Funding
U.S. National Science Foundation;
Keywords
audiovisual speech processing; speech detection in noise; speech in noise; audiovisual speech perception; speech processing; lipreading; speechreading;
DOI
10.1016/j.specom.2004.10.011
Chinese Library Classification
O42 [Acoustics];
Discipline codes
070206; 082403;
Abstract
Audiovisual speech stimuli have been shown to produce a variety of perceptual phenomena. Enhanced detectability of acoustic speech in noise, when the talker can also be seen, is one of those phenomena. This study investigated whether this enhancement effect is specific to visual speech stimuli or can rely on more generic non-speech visual stimulus properties. Speech detection thresholds for an auditory /ba/ stimulus were obtained in a white noise masker. The auditory /ba/ was presented adaptively to obtain its 79.4% detection threshold under five conditions. In Experiment 1, the syllable was presented (1) auditory-only (AO) and (2) as audiovisual speech (AVS), using the original video recording. Three types of synthetic visual stimuli were also paired synchronously with the audio token: (3) a dynamic Lissajous figure (AVL) whose vertical extent was correlated with the acoustic speech envelope; (4) a dynamic rectangle (AVR) whose horizontal extent was correlated with the speech envelope; and (5) a static rectangle (AVSR) whose onset and offset were synchronous with the acoustic speech onset and offset. Ten adults with normal hearing and vision participated. The results, in terms of dB signal-to-noise ratio (SNR), were AVS < (AVL ≈ AVR ≈ AVSR) < AO. That is, the AVS stimulus was significantly easiest to detect, there was no difference among the synthesized visual stimuli, and all audiovisual conditions yielded significantly lower thresholds than AO. To probe the source of the AVS advantage, in Experiment 2 a preliminary mouth gesture was edited out of the video speech token. This manipulation eliminated the AVS advantage for both the original and the edited AVS stimuli, although the general audiovisual detection enhancement persisted. Overall, the results showed enhanced auditory speech detection with visual stimuli but no advantage for a fine-grained correlation between the acoustic and optical speech signals. (C) 2004 Elsevier B.V. All rights reserved.
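The 79.4% detection target mentioned in the abstract is the convergence point of Levitt's 3-down/1-up transformed staircase: the level is made harder only after three consecutive detections and easier after any miss, and the threshold is estimated from the reversal points. A minimal simulation sketch of such a staircase, tracking SNR in dB (the step size, reversal count, and simulated observer below are illustrative assumptions, not the study's actual parameters):

```python
import random

def staircase_3down1up(respond, start_snr=0.0, step_db=2.0, n_reversals=8):
    """3-down/1-up transformed staircase (converges on ~79.4% detection).

    `respond(snr)` returns True when the simulated listener detects the
    stimulus at the given SNR. The threshold estimate is the mean SNR at
    the staircase's reversal points.
    """
    snr = start_snr
    consecutive_correct = 0
    direction = None            # last movement: 'down' or 'up'
    reversals = []
    while len(reversals) < n_reversals:
        if respond(snr):
            consecutive_correct += 1
            if consecutive_correct == 3:      # 3 detections in a row -> harder
                consecutive_correct = 0
                if direction == 'up':         # change of direction = reversal
                    reversals.append(snr)
                direction = 'down'
                snr -= step_db
        else:                                 # any miss -> easier
            consecutive_correct = 0
            if direction == 'down':
                reversals.append(snr)
            direction = 'up'
            snr += step_db
    return sum(reversals) / len(reversals)

# Hypothetical observer: detects reliably above -10 dB SNR, at chance below.
random.seed(1)
def observer(snr):
    p_detect = 1.0 if snr > -10 else 0.5
    return random.random() < p_detect

print(round(staircase_3down1up(observer), 1))
```

With this toy observer the staircase descends from 0 dB and then oscillates around the -10 dB transition, so the reversal mean lands near that level; in the actual experiments the same logic would drive the masker-relative level of the /ba/ token in each of the five conditions.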
Pages: 5-18
Page count: 14