New approach in quantification of emotional intensity from the speech signal: emotional temperature

Cited by: 66

Authors:

Alonso, Jesus B. ^{[1]}

Cabrera, Josue ^{[1]}

Medina, Manuel ^{[1]}

Travieso, Carlos M. ^{[1]}

Affiliation:

[1] Univ Las Palmas Gran Canaria, Inst Univ Desarrollo Tecnol & Innovac Comunicac I, Las Palmas Gran Canaria 35017, Spain

Source:

EXPERT SYSTEMS WITH APPLICATIONS | 2015 / Volume 42 / Issue 24

Keywords:

Emotional speech recognition; Pattern recognition; Emotional intensity; Emotional temperature; RECOGNITION; SELECTION; FEATURES

DOI:

10.1016/j.eswa.2015.07.062

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification code:

140502 [Artificial intelligence];

Abstract:

Automatic speech emotion recognition has huge potential in fields such as psychology, psychiatry and affective computing. Spontaneous speech is continuous, and emotions are expressed at certain moments of the dialogue, giving rise to emotional turns. Real-time applications must therefore be capable of detecting changes in the speaker's affective state. In this paper, we focus on recognizing activation from speech using a small feature set obtained from a temporal segmentation of the speech signal in different languages: German, English and Polish. The feature set includes two prosodic features and four paralinguistic features related to the pitch and spectral energy balance. This segmentation and feature set are suitable for real-time emotion applications because they allow changes in the emotional state to be detected with very low processing times. The German corpus EMO-DB (Berlin Database of Emotional Speech), the English corpus LDC (Emotional Prosody Speech and Transcripts database) and the Polish Emotional Speech Database are used to train the Support Vector Machine (SVM) classifier and for gender-dependent activation recognition. The results are analyzed separately for each speech emotion in a gender-dependent manner, with accuracies of 94.9%, 88.32% and 90% obtained for the EMO-DB, LDC and Polish databases, respectively. This new approach provides performance comparable to that of other approaches, with lower complexity, for real-time applications, making it an appealing alternative that may assist in the future development of automatic speech emotion recognition systems with continuous tracking. (C) 2015 Elsevier Ltd. All rights reserved.

Citation

Pages: 9554-9564

Number of pages: 11
