Multimodal semi-automated affect detection from conversational cues, gross body language, and facial features

Cited by: 170
Authors
D'Mello, Sidney K. [1]
Graesser, Arthur [1]
Affiliations
[1] Univ Memphis, Inst Intelligent Syst, Memphis, TN 38152 USA
Funding
U.S. National Science Foundation
Keywords
Multimodal affect detection; Conversational cues; Gross body language; Facial features; Superadditivity; AutoTutor; Affective computing; Human-computer interaction; INTELLIGENT TUTORING SYSTEMS; AUTOMATIC DETECTION; LEARNERS AFFECT; STUDENT AFFECT; EXPRESSION; AUTOTUTOR; EMOTIONS; RECOGNITION; COMMUNICATION; EXPERIENCE;
DOI
10.1007/s11257-010-9074-4
Chinese Library Classification (CLC)
TP3 [computing technology; computer technology]
Subject classification code
0812
Abstract
We developed and evaluated a multimodal affect detector that combines conversational cues, gross body language, and facial features. The multimodal affect detector uses feature-level fusion to combine the sensory channels and linear discriminant analyses to discriminate between naturally occurring experiences of boredom, engagement/flow, confusion, frustration, delight, and neutral. Training and validation data for the affect detector were collected in a study where 28 learners completed a 32-min tutorial session with AutoTutor, an intelligent tutoring system with conversational dialogue. Classification results supported a channel × judgment type interaction, where the face was the most diagnostic channel for spontaneous affect judgments (i.e., at any time in the tutorial session), while conversational cues were superior for fixed judgments (i.e., every 20 s in the session). The analyses also indicated that the accuracy of the multichannel model (face, dialogue, and posture) was statistically higher than the best single-channel model for the fixed but not spontaneous affect expressions. However, multichannel models reduced the discrepancy (i.e., variance in the precision of the different emotions) of the discriminant models for both judgment types. The results also indicated that the combination of channels yielded superadditive effects for some affective states, but additive, redundant, and inhibitory effects for others. We explore the structure of the multimodal linear discriminant models and discuss the implications of some of our major findings.
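The fusion approach described in the abstract maps onto a short sketch. The following is a minimal illustration of feature-level fusion with linear discriminant analysis on synthetic data; the feature names, dimensions, and the final fusion check are illustrative assumptions, not the authors' implementation.

```python
# A minimal sketch of feature-level fusion with linear discriminant analysis.
# Synthetic data; the feature names and dimensions below are illustrative
# assumptions, not the features used in the study.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 280  # hypothetical number of labeled affect observations

# Hypothetical per-channel feature matrices:
dialogue = rng.normal(size=(n, 12))  # conversational cues (e.g., response time, speech acts)
posture = rng.normal(size=(n, 6))    # gross body language (e.g., seat-pressure statistics)
face = rng.normal(size=(n, 10))      # facial feature measurements (e.g., action units)

# Six classes: boredom, engagement/flow, confusion, frustration, delight, neutral.
labels = rng.integers(0, 6, size=n)

# Feature-level fusion: concatenate the channels into one vector per observation
# and train a single discriminant model on the fused representation.
fused = np.hstack([dialogue, posture, face])

def mean_accuracy(X, y):
    """Cross-validated accuracy of an LDA classifier on one feature set."""
    return cross_val_score(LinearDiscriminantAnalysis(), X, y, cv=5).mean()

single = {name: mean_accuracy(X, labels)
          for name, X in (("dialogue", dialogue), ("posture", posture), ("face", face))}
multi = mean_accuracy(fused, labels)

# A weak fusion benefit: the multichannel model beats the best single channel.
# (Superadditivity, as discussed in the abstract, is a stronger criterion that
# compares the fused gain against the combined single-channel gains.)
print(single)
print(f"fused: {multi:.3f}, beats best single channel: {multi > max(single.values())}")
```

Feature-level (early) fusion, as sketched here, lets the discriminant model exploit cross-channel covariance directly, which is what makes superadditive effects possible; decision-level (late) fusion, by contrast, can at best weight each channel's independent vote.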
Pages: 147-187
Page count: 41