NONLINEAR ANALYSIS AND CLASSIFICATION OF SPEECH UNDER STRESSED CONDITIONS

Cited by: 78
Authors
CAIRNS, DA
HANSEN, JHL
Affiliation
Robust Speech Processing Laboratory, Department of Electrical Engineering, Duke University, Durham, North Carolina 27708-0291
DOI
10.1121/1.410601
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
The speech production system is capable of conveying an abundance of information with regard to sentence text, speaker identity, and prosodics, as well as emotion and speaker stress. In an effort to better understand the mechanism of human voice communication, researchers have attempted to determine reliable acoustic indicators of stress using such speech production features as fundamental frequency (F0), intensity, spectral tilt, the distribution of spectral energy, and others. Their findings indicate that more work is necessary before a general solution can be proposed. In this study, we hypothesize that speech consists of a linear and a nonlinear component, and that the nonlinear component changes markedly between normal and stressed speech. To quantify the changes between normal and stressed speech, a classification procedure was developed based on the nonlinear Teager Energy operator. The Teager Energy operator provides an indirect means of evaluating the nonlinear component of speech. The system was tested using VC and CVC utterances from native speakers of English across the following speaking styles: neutral, loud, angry, Lombard effect, and clear. Results of the system evaluation show that loud and angry speech can be differentiated from neutral speech, while clear speech is more difficult to differentiate. Results also show that reliable classification of Lombard effect speech is possible, but system performance varies across speakers. © 1994, Acoustical Society of America. All rights reserved.
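The abstract names the Teager Energy operator but does not give its formula. As a minimal illustration, the sketch below uses the commonly cited discrete-time form, psi[n] = x[n]^2 - x[n-1] * x[n+1], which is an assumption here and not taken from this record. The function name teager_energy and the test tone are hypothetical, and the code shows only the operator itself, not the classification procedure developed in the paper.

```python
import math


def teager_energy(x):
    """Discrete Teager energy for the interior samples of x.

    Assumed form: psi[n] = x[n]^2 - x[n-1] * x[n+1].
    The first and last samples have no two-sided neighbours, so the
    result is two samples shorter than the input.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


# Hypothetical test signal: a pure tone A * cos(omega * n).
# For such a tone the operator is constant: A^2 * sin(omega)^2.
amplitude, omega = 2.0, 0.3
tone = [amplitude * math.cos(omega * n) for n in range(50)]
psi = teager_energy(tone)
expected = amplitude ** 2 * math.sin(omega) ** 2
```

For a single sinusoid the output depends on both amplitude and frequency, which is why the operator is often described as tracking the energy needed to generate the signal and not just its squared amplitude.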
Pages: 3392-3400
Page count: 9