A Survey of Affect Recognition Methods: Audio, Visual, and Spontaneous Expressions

Cited by: 1658
Authors
Zeng, Zhihong [1 ]
Pantic, Maja [2 ,3 ]
Roisman, Glenn I. [4 ]
Huang, Thomas S. [1 ]
Affiliations
[1] Univ Illinois, Beckman Inst, Urbana, IL 61801 USA
[2] Univ London Imperial Coll Sci Technol & Med, Dept Comp, London SW7 2AZ, England
[3] Univ Twente, Fac Elect Engn Math & Comp Sci, Enschede, Netherlands
[4] Univ Illinois, Dept Psychol, Champaign, IL 61820 USA
Funding
US National Science Foundation; European Research Council;
Keywords
Evaluation/methodology; human-centered computing; affective computing; introductory; survey; FACIAL EXPRESSION; EMOTION RECOGNITION; SPEECH; DISCRIMINATION; SEQUENCES; LAUGHTER; FACES;
DOI
10.1109/TPAMI.2008.52
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104; 0812; 0835; 1405;
Abstract
Automated analysis of human affective behavior has attracted increasing attention from researchers in psychology, computer science, linguistics, neuroscience, and related disciplines. However, the existing methods typically handle only deliberately displayed and exaggerated expressions of prototypical emotions, despite the fact that deliberate behavior differs in visual appearance, audio profile, and timing from spontaneously occurring behavior. To address this problem, efforts to develop algorithms that can process naturally occurring human affective behavior have recently emerged. Moreover, an increasing number of efforts are reported toward multimodal fusion for human affect analysis, including audiovisual fusion, linguistic and paralinguistic fusion, and multicue visual fusion based on facial expressions, head movements, and body gestures. This paper introduces and surveys these recent advances. We first discuss human emotion perception from a psychological perspective. Next, we examine available approaches for solving the problem of machine understanding of human affective behavior and discuss important issues like the collection and availability of training and test data. We finally outline some of the scientific and engineering challenges to advancing human affect sensing technology.
Pages: 39-58 (20 pages)