Frame rate and viseme analysis for multimedia applications to assist speechreading

Cited by: 10
Authors
Williams, JJ [1]
Rutledge, JC
Katsaggelos, AK
Garstecki, DC
Affiliations
[1] Northwestern Univ, Dept Elect & Comp Engn, Evanston, IL 60208 USA
[2] Northwestern Univ, Dept Commun Sci & Disorders, Evanston, IL 60208 USA
Source
JOURNAL OF VLSI SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY | 1998, Vol. 20, No. 1-2
DOI
10.1023/A:1008062122135
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
Current video conference and phone systems do not provide the temporal resolution and motion necessary for speechreading. In this paper, the perceptual boundaries that affect speechreading performance are investigated. An analysis of the relationships between viseme groupings, accuracy of viseme recognition, and presentation frame rate is presented, based on the results of subject testing. Results reveal a minimum frame rate of 10 frames per second (fps) for distinguishing viseme groupings. Confusion analysis results demonstrate the importance of the tongue and teeth oral features for speechreading. These results are critical to the design of speech-assisted video systems to enhance speechreading for individuals with impaired hearing.
Pages: 7-23
Number of pages: 17