Speech Emotion Recognition Using Canonical Correlation Analysis and Probabilistic Neural Network

Cited by: 7
Authors
Cen, Ling [1 ]
Ser, Wee [2 ]
Yu, Zhu Liang [2 ]
Affiliations
[1] ASTAR, Inst Infocomm Res, 1 Fusionopolis Way, Singapore 138632, Singapore
[2] Nanyang Technol Univ, Ctr Signal Proc, Singapore 639798, Singapore
Source
SEVENTH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS, PROCEEDINGS | 2008
Keywords
DOI
10.1109/ICMLA.2008.85
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper addresses the automatic identification of emotional states from human speech. Although several papers on speech emotion recognition have been published, the features they use are taken or adapted from those used for speech recognition. However, not all features used for speech recognition are equally important for emotion recognition. This paper addresses this issue and proposes a systematic method of feature selection for emotion recognition from speech signals. The idea is to work on a small, well-selected feature set, thereby removing irrelevant information. Specifically, the proposed method uses an idea similar to Canonical Correlation Analysis (CCA) to estimate the linear relationship between the various features and the emotional states. The outcome is a set of features that are most relevant to the emotions. Experiments were conducted on the LDC database with a Probabilistic Neural Network (PNN) as the classifier. The results show that comparable accuracies can be obtained for the emotional states tested using only about 30% of the features considered. This implies that the computational load can also be reduced greatly.
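The pipeline summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the exact selection criterion, so the scoring rule below (squared correlation of each feature with the canonical variates, weighted by the squared canonical correlations) is an assumption, as are the function names `cca_feature_scores` and `pnn_predict`, the regularization constant `reg`, and the kernel width `sigma`. The PNN is written as a Gaussian Parzen-window classifier and assumes every class appears in the training set.

```python
import numpy as np


def cca_feature_scores(X, y, n_classes, reg=1e-6):
    """Score features by their linear relationship with the class labels.

    Assumed criterion (not taken from the paper): squared correlation of each
    feature with the canonical variates, weighted by squared canonical
    correlations between the features and one-hot encoded labels.
    """
    n, d = X.shape
    Xc = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)  # standardized features
    Y = np.eye(n_classes)[y][:, :-1]  # one-hot labels; last column is redundant
    Yc = Y - Y.mean(axis=0)

    # Regularized covariance blocks.
    Sxx = Xc.T @ Xc / n + reg * np.eye(d)
    Syy = Yc.T @ Yc / n + reg * np.eye(Yc.shape[1])
    Sxy = Xc.T @ Yc / n

    # Whiten both blocks with Cholesky factors; the singular values of the
    # whitened cross-covariance are the canonical correlations.
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    K = np.linalg.solve(Lx, Sxy) @ np.linalg.inv(Ly).T
    U, rho, _ = np.linalg.svd(K, full_matrices=False)

    A = np.linalg.solve(Lx.T, U)  # canonical weight vectors for the features
    loadings = Sxx @ A            # correlation of each feature with each variate
    return (loadings ** 2) @ (rho ** 2)


def pnn_predict(X_train, y_train, X_test, n_classes, sigma=1.0):
    """Probabilistic neural network as a Gaussian Parzen-window classifier."""
    # Squared Euclidean distance from every test sample to every training sample.
    d2 = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    k = np.exp(-d2 / (2.0 * sigma ** 2))  # pattern-layer activations
    # Summation layer: average activation per class, then pick the largest.
    scores = np.stack(
        [k[:, y_train == c].mean(axis=1) for c in range(n_classes)], axis=1
    )
    return scores.argmax(axis=1)
```

A typical use would rank the features with `np.argsort(scores)[::-1]`, keep the top fraction (the abstract reports about 30%), and pass only those columns to `pnn_predict`. Both the fraction and `sigma` would need tuning on held-out data.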
Pages: 859 / +
Page count: 2
Related papers
14 in total
[1]  
AMIR N, 2001, CLASSIFYING EMOTIONS
[2]  
[Anonymous], INT J HUMAN COMPUTER, DOI 10.1016/S1071-581(02)00141-6
[3]   Emotion recognition in human-computer interaction [J].
Cowie, R ;
Douglas-Cowie, E ;
Tsapatsoulis, N ;
Votsis, G ;
Kollias, S ;
Fellenz, W ;
Taylor, JG .
IEEE SIGNAL PROCESSING MAGAZINE, 2001, 18 (01) :32-80
[4]  
Dellaert F, 1996, ICSLP 96 - FOURTH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, VOLS 1-4, P1970, DOI 10.1109/ICSLP.1996.608022
[5]   Relations between two sets of variates [J].
Hotelling, H .
BIOMETRIKA, 1936, 28 :321-377
[6]   Toward detecting emotions in spoken dialogs [J].
Lee, CM ;
Narayanan, SS .
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2005, 13 (02) :293-303
[7]  
Lee CM, 2001, ASRU 2001: IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, CONFERENCE PROCEEDINGS, P240, DOI 10.1109/ASRU.2001.1034632
[8]  
Petrushin V. A., 2000, P 6 INT C SPOK LANG
[9]  
RONG J, 2007, IEEE ACIS INT C COMP, V11, P419
[10]  
SER W, 2008, 19 INT C PATT REC IC