Lipreading from color video

Cited by: 79
Authors
Chiou, GI
Hwang, JN
Affiliation
[1] Information Processing Laboratory, Department of Electrical Engineering, University of Washington, Seattle
Keywords
active contour model; hidden Markov model; Karhunen-Loeve transform; lipreading; snake; visual phoneme
DOI
10.1109/83.605417
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We have designed and implemented a lipreading system that recognizes isolated words using only color video of human lips (without acoustic data). The system performs video recognition using "snakes" (active contour models) to extract visual features in geometric space, the Karhunen-Loeve transform (KLT) to extract principal components in the color eigenspace, and hidden Markov models (HMMs) to recognize the combined visual feature sequences. With the visual information alone, we were able to achieve 94% accuracy for ten isolated words.
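As an illustration of the KLT step mentioned in the abstract, the following is a minimal sketch of extracting principal components from flattened image patches with NumPy. It is not the authors' implementation: the function names `klt_basis` and `klt_project`, the patch size, the number of retained components, and the random stand-in data are all assumptions made for this example.

```python
import numpy as np


def klt_basis(samples, k):
    """Return the sample mean and the top-k KLT basis vectors (hypothetical helper).

    samples: array of shape (n_samples, n_dims), one flattened patch per row.
    """
    mean = samples.mean(axis=0)
    centered = samples - mean
    # Sample covariance matrix of the centered data.
    cov = centered.T @ centered / (len(samples) - 1)
    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:k]  # indices of the k largest eigenvalues
    return mean, eigvecs[:, order], eigvals[order]


def klt_project(samples, mean, basis):
    """Project samples onto the KLT basis to get k-dimensional features."""
    return (samples - mean) @ basis


# Stand-in data (assumption): 200 frames, each an 8x8 RGB patch flattened to 192 values.
rng = np.random.default_rng(0)
frames = rng.normal(size=(200, 192))

mean, basis, eigvals = klt_basis(frames, k=5)
features = klt_project(frames, mean, basis)  # one 5-dimensional feature vector per frame
```

In a pipeline like the one the abstract describes, per-frame vectors such as `features` would be combined with the contour-based geometric features and passed as a sequence to a per-word HMM; how the original system combines them is not specified in this record.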
Pages: 1192-1195
Page count: 4