Speech emotion recognition approaches in human computer interaction

Cited by: 118
Authors
Ramakrishnan, S. [1 ]
El Emary, Ibrahiem M. M. [2 ]
Affiliations
[1] Dr Mahalingam Coll Eng & Tech, Informat Tech Dep, Pollachi 642003, India
[2] King Abdulaziz Univ, Fac Informat Technol, Jeddah 21413, Saudi Arabia
Keywords
Speech emotion; Human-computer interface; Pitch and emotion recognition; AUTOMATIC RECOGNITION; CLASSIFICATION;
DOI
10.1007/s11235-011-9624-z
Chinese Library Classification (CLC) number
TN [Electronic technology, communication technology];
Subject classification code
0809 ;
Abstract
Speech Emotion Recognition (SER) represents one of the emerging fields in human-computer interaction. Quality of the human-computer interface that mimics human speech emotions relies heavily on the types of features used and also on the classifier employed for recognition. The main purpose of this paper is to present a wide range of features employed for speech emotion recognition and the acoustic characteristics of those features. Also in this paper, we analyze the performance in terms of some important parameters such as: precision, recall, F-measure and recognition rate of the features using two of the commonly used emotional speech databases namely Berlin emotional database and Danish emotional database. Emotional speech recognition is being applied in modern human-computer interfaces and the overview of 10 interesting applications is also presented in this paper to illustrate the importance of this technique.
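The evaluation parameters named in the abstract (precision, recall, F-measure, and recognition rate) can be illustrated with a minimal Python sketch. It is not the authors' code: the function names and the toy labels are hypothetical, and "recognition rate" is assumed here to mean overall accuracy (correct predictions divided by total utterances), which the paper may define differently.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """Return {label: (precision, recall, f_measure)} for each emotion label.

    Hypothetical helper for illustration; not taken from the paper.
    """
    labels = sorted(set(y_true) | set(y_pred))
    # Count how often each (true label, predicted label) pair occurs.
    pairs = Counter(zip(y_true, y_pred))
    metrics = {}
    for label in labels:
        tp = pairs[(label, label)]
        fp = sum(pairs[(t, label)] for t in labels if t != label)
        fn = sum(pairs[(label, p)] for p in labels if p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        # F-measure is the harmonic mean of precision and recall.
        if precision + recall:
            f_measure = 2 * precision * recall / (precision + recall)
        else:
            f_measure = 0.0
        metrics[label] = (precision, recall, f_measure)
    return metrics


def recognition_rate(y_true, y_pred):
    """Assumed definition: fraction of utterances classified correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy labels, invented for illustration only (not data from the paper).
y_true = ["anger", "anger", "joy", "joy", "sad", "sad"]
y_pred = ["anger", "joy", "joy", "joy", "sad", "anger"]

for emotion, (p, r, f) in per_class_metrics(y_true, y_pred).items():
    print(f"{emotion}: precision={p:.2f} recall={r:.2f} F={f:.2f}")
print(f"recognition rate={recognition_rate(y_true, y_pred):.2f}")
```

Per-class scores are reported separately because emotion classes are often confused unevenly, so a single overall rate can hide a class that is rarely recognized.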
Pages: 1467-1478
Page count: 12
Related papers
48 in total
[1]  
[Anonymous], 2010, International Journal of Computer Theory and Engineering
[2]  
[Anonymous], IEEE INT C MULT EXP
[3]  
[Anonymous], 2010, INT J COMPUT APPL
[4]   Analysis of Emotionally Salient Aspects of Fundamental Frequency for Emotion Detection [J].
Busso, Carlos ;
Lee, Sungbok ;
Narayanan, Shrikanth .
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2009, 17 (04) :582-596
[5]   Detection and Interpretation of Opinion Expressions in Spoken Surveys [J].
Camelin, Nathalie ;
Bechet, Frederic ;
Damnati, Geraldine ;
De Mori, Renato .
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2010, 18 (02) :369-381
[6]   A review of speech-based bimodal recognition [J].
Chibelushi, CC ;
Deravi, F ;
Mason, JSD .
IEEE TRANSACTIONS ON MULTIMEDIA, 2002, 4 (01) :23-37
[7]   Emotional speech: Towards a new generation of databases [J].
Douglas-Cowie, E ;
Campbell, N ;
Cowie, R ;
Roach, P .
SPEECH COMMUNICATION, 2003, 40 (1-2) :33-60
[8]  
Elenius K., 2008, INTERSPEECH 9 ANN C
[9]  
Elwakdy M., 2008, INT J CIRCUITS SYSTE, V4, P264
[10]   Emotion Conversion Based on Prosodic Unit Selection [J].
Erro, Daniel ;
Navas, Eva ;
Hernaez, Inma ;
Saratxaga, Ibon .
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2010, 18 (05) :974-983