ACOUSTIC INVARIANCE IN SPEECH PRODUCTION - EVIDENCE FROM MEASUREMENTS OF THE SPECTRAL CHARACTERISTICS OF STOP CONSONANTS

Cited by: 277
Authors
BLUMSTEIN, SE [1]
STEVENS, KN [1]
Affiliation
[1] MIT, ELECTR RES LAB, CAMBRIDGE, MA 02139
DOI
10.1121/1.383319
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206; 082403;
Abstract
On the basis of theoretical considerations and the results of experiments with synthetic consonant-vowel syllables, it has been hypothesized that the short-time spectrum sampled at the onset of a stop consonant should exhibit gross properties that uniquely specify the consonantal place of articulation independent of the following vowel. The aim of this paper is to test this hypothesis by measuring the spectrum sampled at the onsets and offsets of a large number of consonant-vowel (CV) and vowel-consonant (VC) syllables containing both voiced and voiceless stops produced by several speakers. Templates were devised in an attempt to capture three classes of spectral shapes: diffuse-rising, diffuse-falling, and compact, corresponding to alveolar, labial, and velar consonants, respectively. Spectra were derived from the utterances by sampling at the consonantal release of CV syllables and at the implosion and burst release of VC syllables, and these spectra (smoothed by a linear prediction algorithm) were matched against the templates. It was found that about 85% of the spectra at initial consonant release and at final burst release were correctly classified by the templates, although there was some variability across vowel contexts. The spectra sampled at the implosion were not consistently classified. A preliminary examination of spectra sampled at the release of nasal consonants in CV syllables showed a somewhat lower accuracy of classification by the same templates. Overall, the results support an hypothesis that, in natural speech, the acoustic characteristics of stop consonants, specified in terms of the gross spectral shape sampled at the discontinuity in the acoustic signal, show invariant properties independent of the adjacent vowel or of the voicing characteristics of the consonant.
The implication is that the auditory system is endowed with detectors that are sensitive to these kinds of gross spectral shapes, and that the existence of these detectors helps the infant to organize the sounds of speech into their natural classes. © 1979, American Association of Physics Teachers. All rights reserved.
Pages: 1001-1017
Page count: 17