Speech perception at the interface of neurobiology and linguistics

Cited by: 323
Authors
Poeppel, David [1, 2]
Idsardi, William J. [1]
van Wassenhove, Virginie [3]
Affiliations
[1] Univ Maryland, Dept Linguist, College Pk, MD 20742 USA
[2] Univ Maryland, Dept Biol, College Pk, MD 20742 USA
[3] CALTECH, Div Biol, Pasadena, CA 91125 USA
Keywords
multi-time resolution; temporal coding; analysis-by-synthesis; predictive coding; forward model; distinctive features;
DOI
10.1098/rstb.2007.2160
Chinese Library Classification number
Q [Biological sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
Speech perception consists of a set of computations that take continuously varying acoustic waveforms as input and generate, as output, discrete representations that make contact with the lexical representations stored in long-term memory. Because the perceptual objects that are recognized by speech perception enter into subsequent linguistic computation, the format that is used for lexical representation and processing fundamentally constrains the speech perceptual processes. Consequently, theories of speech perception must, at some level, be tightly linked to theories of lexical representation. Minimally, speech perception must yield representations that smoothly and rapidly interface with stored lexical items. Adopting the perspective of Marr, we argue and provide neurobiological and psychophysical evidence for the following research programme. First, at the implementational level, speech perception is a multi-time resolution process, with perceptual analyses occurring concurrently on at least two time scales (approx. 20-80 ms, approx. 150-300 ms), commensurate with (sub)segmental and syllabic analyses, respectively. Second, at the algorithmic level, we suggest that perception proceeds on the basis of internal forward models, or uses an 'analysis-by-synthesis' approach. Third, at the computational level (in the sense of Marr), the theory of lexical representation that we adopt is principally informed by phonological research and assumes that words are represented in the mental lexicon in terms of sequences of discrete segments composed of distinctive features. One important goal of the research programme is to develop linking hypotheses between putative neurobiological primitives (e.g. temporal primitives) and those primitives derived from linguistic inquiry, to arrive ultimately at a biologically sensible and theoretically satisfying model of representation and computation in speech.
Pages: 1071-1086
Page count: 16