Being bored? Recognising natural interest by extensive audiovisual integration for real-life application

Cited by: 110
Authors
Schuller, Bjoern [1]
Mueller, Ronald [2]
Eyben, Florian [1]
Gast, Juergen [1]
Hoernler, Benedikt [1]
Woellmer, Martin [1]
Rigoll, Gerhard [1]
Hoethker, Anja [3]
Konosu, Hitoshi [4]
Affiliations
[1] Tech Univ Munich, Inst Human Machine Commun, D-80333 Munich, Germany
[2] Altran Technol, D-80636 Munich, Germany
[3] Toyota Motor Europe, Prod Engn Adv Technol, B-1930 Zaventem, Belgium
[4] Toyota Motor Co Ltd, Toyota, Aichi 4718571, Japan
Keywords
Interest recognition; Affective computing; Audiovisual processing; RECOGNITION; EXPRESSIONS
DOI
10.1016/j.imavis.2009.02.013
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Automatic detection of the level of human interest is of high relevance for many technical applications, such as automatic customer care or tutoring systems. However, the recognition of spontaneous interest in natural conversations independently of the subject remains a challenge. Identification of human affective states relying on single modalities only is often impossible, even for humans, since different modalities contain partially disjunctive cues. Multimodal approaches to human affect recognition are generally shown to boost recognition performance, yet are evaluated in restrictive laboratory settings only. Herein we introduce a fully automatic processing combination of Active-Appearance-Model-based facial expression, vision-based eye-activity estimation, acoustic features, linguistic analysis, non-linguistic vocalisations, and temporal context information in an early feature fusion process. We provide detailed subject-independent results for classification and regression of the Level of Interest using Support-Vector Machines on an audiovisual interest corpus (AVIC) consisting of spontaneous, conversational speech, demonstrating "theoretical" effectiveness of the approach. Further, to evaluate the approach with regard to real-life usability, a user study is conducted for proof of "practical" effectiveness. © 2009 Elsevier B.V. All rights reserved.
Pages: 1760-1774
Number of pages: 15