Robust emotion recognition in noisy speech via sparse representation

Cited by: 4
Authors
Xiaoming Zhao
Shiqing Zhang
Bicheng Lei
Affiliations
[1] Taizhou University, School of Physics and Electronic Engineering
[2] Taizhou University, Department of Computer Science
Source
Neural Computing and Applications | 2014 / Vol. 24
Keywords
Emotion recognition; Sparse representation; Compressive sensing; Noisy speech
DOI
Not available
Abstract
Emotion recognition in speech signals is currently a very active research topic and has attracted much attention in engineering applications. This paper presents a new approach to robust emotion recognition in speech signals in noisy environments. Using a weighted sparse representation model based on maximum likelihood estimation, an enhanced sparse representation classifier is proposed for robust emotion recognition in noisy speech. The effectiveness and robustness of the proposed method are investigated on clean and noisy emotional speech. The proposed method is compared with six typical classifiers: the linear discriminant classifier, K-nearest neighbor, the C4.5 decision tree, radial basis function neural networks, support vector machines, and the sparse representation classifier. Experimental results on two publicly available emotional speech databases, the Berlin database and the Polish database, demonstrate the promising performance of the proposed method on robust emotion recognition in noisy speech, where it outperforms the other methods compared.
Pages: 1539-1553
Page count: 14