Sparse Autoencoder-based Feature Transfer Learning for Speech Emotion Recognition

Cited by: 270
Authors
Deng, Jun [1]
Zhang, Zixing [1]
Marchi, Erik [1]
Schuller, Bjoern [1]
Affiliations
[1] Tech Univ Munich, MMK, Machine Intelligence & Signal Proc Grp, D-80290 Munich, Germany
Source
2013 HUMAINE ASSOCIATION CONFERENCE ON AFFECTIVE COMPUTING AND INTELLIGENT INTERACTION (ACII) | 2013
Keywords
speech emotion recognition; transfer learning; sparse autoencoder; deep neural networks
DOI
10.1109/ACII.2013.90
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In speech emotion recognition, training and test data used for system development usually tend to fit each other perfectly, but further 'similar' data may be available. Transfer learning helps to exploit such similar data for training despite the inherent dissimilarities in order to boost a recogniser's performance. In this context, this paper presents a sparse autoencoder method for feature transfer learning for speech emotion recognition. In our proposed method, a common emotion-specific mapping rule is learnt from a small set of labelled data in a target domain. Then, newly reconstructed data are obtained by applying this rule on the emotion-specific data in a different domain. The experimental results evaluated on six standard databases show that our approach significantly improves the performance relative to learning each source domain independently.
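The two steps named in the abstract (learn a class-specific mapping from a small labelled target set, then apply it to same-class data from another domain) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the single-hidden-layer architecture, the KL-divergence sparsity penalty, all hyperparameter values, and the function names `train_sparse_autoencoder`, `reconstruct`, and `transfer_features` are assumptions made for illustration. It also assumes features are scaled to [0, 1] and that every class in the source data also occurs in the target data.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_sparse_autoencoder(X, n_hidden=8, rho=0.05, beta=0.1,
                             lam=1e-4, lr=0.5, epochs=500, seed=0):
    """Fit a one-hidden-layer sparse autoencoder with batch gradient descent.

    Hypothetical settings: rho is the target mean activation, beta weights
    the KL sparsity penalty, lam is the L2 weight decay.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.1, (d, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, (n_hidden, d))
    b2 = np.zeros(d)
    for _ in range(epochs):
        H = sigmoid(X @ W1 + b1)          # hidden activations
        Y = sigmoid(H @ W2 + b2)          # reconstruction of X
        rho_hat = np.clip(H.mean(axis=0), 1e-6, 1 - 1e-6)
        # Gradient of the mean squared reconstruction error at the output.
        dY = (Y - X) * Y * (1 - Y) / n
        # Gradient of beta * sum_j KL(rho || rho_hat_j) w.r.t. each H entry.
        sparse_grad = beta * (-rho / rho_hat + (1 - rho) / (1 - rho_hat)) / n
        dH = (dY @ W2.T + sparse_grad) * H * (1 - H)
        W2 -= lr * (H.T @ dY + lam * W2)
        b2 -= lr * dY.sum(axis=0)
        W1 -= lr * (X.T @ dH + lam * W1)
        b1 -= lr * dH.sum(axis=0)
    return W1, b1, W2, b2


def reconstruct(X, params):
    """Pass X through a trained autoencoder and return its reconstruction."""
    W1, b1, W2, b2 = params
    return sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2)


def transfer_features(X_target, y_target, X_source, y_source, **kwargs):
    """Train one autoencoder per class on target data, then reconstruct
    the source data of the same class with it."""
    X_new = np.empty(X_source.shape, dtype=float)
    for label in np.unique(y_source):
        params = train_sparse_autoencoder(X_target[y_target == label], **kwargs)
        mask = y_source == label
        X_new[mask] = reconstruct(X_source[mask], params)
    return X_new


if __name__ == "__main__":
    # Toy data standing in for acoustic features (purely synthetic).
    rng = np.random.default_rng(42)
    X_t, y_t = rng.uniform(0, 1, (30, 6)), np.repeat([0, 1, 2], 10)
    X_s, y_s = rng.uniform(0, 1, (90, 6)), np.repeat([0, 1, 2], 30)
    X_s_new = transfer_features(X_t, y_t, X_s, y_s, epochs=200)
    print(X_s_new.shape)
```

In a setup like this, the reconstructed source features would then be used to train a standard classifier that is evaluated on held-out target data; the choice of classifier is left open here.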
Pages: 511-516
Page count: 6