VOCAL QUALITY FACTORS - ANALYSIS, SYNTHESIS, AND PERCEPTION

Cited by: 337
Authors
CHILDERS, DG [1]
LEE, CK [1]
Affiliations
[1] TATUNG INST TECHNOL, DEPT ELECT ENGN, TAIPEI, TAIWAN
DOI
10.1121/1.402044
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
The purpose of this study was to examine several factors of vocal quality that might be affected by changes in vocal fold vibratory patterns. Four voice types were examined: modal, vocal fry, falsetto, and breathy. Three categories of analysis techniques were developed to extract source-related features from speech and electroglottographic (EGG) signals. Four factors were found to be important for characterizing the glottal excitations for the four voice types: the glottal pulse width, the glottal pulse skewness, the abruptness of glottal closure, and the turbulent noise component. The significance of these factors for voice synthesis was studied and a new voice source model that accounted for certain physiological aspects of vocal fold motion was developed and tested using speech synthesis. Perceptual listening tests were conducted to evaluate the auditory effects of the source model parameters upon synthesized speech. The effects of the spectral slope of the source excitation, the shape of the glottal excitation pulse, and the characteristics of the turbulent noise source were considered. Applications for these research results include synthesis of natural sounding speech, synthesis and modeling of vocal disorders, and the development of speaker independent (or adaptive) speech recognition systems.
Pages: 2394-2410
Page count: 17
Related papers
77 entries in total
[1]  
ALLEN E L, 1973, Folia Phoniatrica, V25, P241
[2]  
ANANTHAPADMANAB.TV, 1984, Q PROGR STATUS REP 2, P1
[3]  
[Anonymous], 1979, SPEECH LANG, DOI 10.1016/B978-0-12-608601-0.50010-3
[4]   SPEECH WAVE-FORM PERTURBATION ANALYSIS - A PERCEPTUAL ACOUSTICAL COMPARISON OF 7 MEASURES [J].
ASKENFELT, AG ;
HAMMARBERG, B .
JOURNAL OF SPEECH AND HEARING RESEARCH, 1986, 29 (01) :50-64
[5]  
Boone DR, 1971, VOICE VOICE THERAPY
[6]  
Childers D., 1983, VOCAL FOLD PHYSL BIO, P202
[7]   ELECTROGLOTTOGRAPHY AND VOCAL FOLD PHYSIOLOGY [J].
CHILDERS, DG ;
HICKS, DM ;
MOORE, GP ;
ESKENAZI, L ;
LALWANI, AL .
JOURNAL OF SPEECH AND HEARING RESEARCH, 1990, 33 (02) :245-254
[8]  
CHILDERS DG, 1985, CRIT REV BIOMED ENG, V12, P131
[9]   A MODEL FOR VOCAL FOLD VIBRATORY MOTION, CONTACT AREA, AND THE ELECTROGLOTTOGRAM [J].
CHILDERS, DG ;
HICKS, DM ;
MOORE, GP ;
ALSAKA, YA .
JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1986, 80 (05) :1309-1320
[10]   QUALITY OF SPEECH PRODUCED BY ANALYSIS-SYNTHESIS [J].
CHILDERS, DG ;
WU, K .
SPEECH COMMUNICATION, 1990, 9 (02) :97-117