Audio-visual integration in multimodal communication

Cited by: 130

Authors:

Chen, T [1]

Rao, RR [1]

Affiliation:

[1] AT&T Bell Labs, Holmdel, NJ 07733 USA

Source:

PROCEEDINGS OF THE IEEE | 1998 / Vol. 86 / No. 5

Keywords:

image analysis; multimedia communication; speech communication; speech processing; video signal processing

DOI:

10.1109/5.664274

Chinese Library Classification:

TM [Electrical engineering]; TN [Electronics and communication technology]

Discipline classification codes:

0808; 0809

Abstract:

In this paper, we review recent research that examines audio-visual integration in multimodal communication. The topics include bimodality in human speech, human and automated lip reading, facial animation, lip synchronization, joint audio-video coding, and bimodal speaker verification. We also study the enabling technologies for these research topics, including automatic facial-feature tracking and audio-to-visual mapping. Recent progress in audio-visual research shows that joint processing of audio and video provides advantages that are not available when the audio and video are processed independently.

Citation

Pages: 837-852

Page count: 16
