ANALYSIS, SYNTHESIS, AND PERCEPTION OF VOICE QUALITY VARIATIONS AMONG FEMALE AND MALE TALKERS

Cited by: 996
Authors
KLATT, DH [1]
KLATT, LC [1]
Affiliation
[1] MIT, ELECTR RES LAB, ROOM 36-523, CAMBRIDGE, MA 02139 USA
Keywords
DOI
10.1121/1.398894
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206; 082403;
Abstract
Voice quality variations include a set of voicing sound source modifications ranging from laryngealized to normal to breathy phonation. Analysis of reiterant imitations of two sentences by ten female and six male talkers has shown that the potential acoustic cues to this type of voice quality variation include: (1) increases in the relative amplitude of the fundamental frequency component as open quotient increases; (2) increases in the amount of aspiration noise that replaces higher frequency harmonics as the arytenoids become more separated; (3) increases in lower formant bandwidths; and (4) introduction of extra poles and zeros in the vocal-tract transfer function associated with tracheal coupling. Perceptual validation of the relative importance of these cues for signaling a breathy voice quality has been accomplished using a new voicing source model for synthesis of more natural male and female voices. The new formant synthesizer, KLSYN88, is fully documented here. Results of the perception study indicate that, contrary to previous research which emphasizes the importance of increased amplitude of the fundamental component, aspiration noise is perceptually most important. Without its presence, increases in the fundamental component may induce the sensation of nasality in a high-pitched voice. Further results of the acoustic analysis include the observations that: (1) over the course of a sentence, the acoustic manifestations of breathiness vary considerably, tending to increase for unstressed syllables, in utterance-final syllables, and at the margins of voiceless consonants; (2) on average, females are more breathy than males, but there are very large differences between subjects within each gender; (3) many utterances appear to end in a "breathy-laryngealized" type of vibration; and (4) diplophonic irregularities in the timing of glottal periods occur frequently, especially at the end of an utterance. Diplophonia and other deviations from perfect periodicity may be important aspects of naturalness in synthesis. © 1990, Acoustical Society of America. All rights reserved.
Pages: 820-857
Page count: 38