THE IMAGE-INPUT-MICROPHONE - A NEW NONACOUSTIC SPEECH-COMMUNICATION SYSTEM BY MEDIA CONVERSION FROM ORAL MOTION IMAGES TO SPEECH

Cited by: 5
Authors
OTANI, K [1]
HASEGAWA, T [1]
Affiliations
[1] SAITAMA UNIV,FAC ENGN,DEPT ELECT & ELECTR ENGN,URAWA,SAITAMA 338,JAPAN
DOI
10.1109/49.363147
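Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809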
Abstract
In this paper, we propose a new speech communication system that converts oral motion images into speech. We call this system "The Image Input Microphone." Because the actual utterance does not need to be input, the system provides high security and is not affected by acoustic noise. It is especially promising as a speaking aid for people whose vocal cords are injured. Since this is a basic investigation of media conversion from image to speech, we focus on vowels and conduct experiments on media conversion of vowels. The vocal-tract transfer function and the source signal for driving this filter are estimated from features of the lips. These features are extracted from oral images in a learning data set, and speech is then synthesized by driving this filter with an appropriate source signal. The performance of the system is evaluated by hearing tests of the synthesized speech. The mean recognition rate for the test data set was 76.8%. We also investigate the effects of practice by iterative listening: the mean recognition rate rises from 69.4% to over 90% after four tests over four days. Consequently, we conclude that the proposed system has potential as a method of nonacoustic communication.
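The synthesis step mentioned in the abstract, a vocal-tract filter driven by a source signal, can be illustrated with a generic source-filter sketch. This is not the authors' implementation: the abstract does not specify the filter form, so the cascade of two-pole resonators, the impulse-train source, and the formant values below are all assumptions chosen for illustration, and the step that estimates filter parameters from lip features is not shown.

```python
import math

def resonator_coeffs(freq_hz, bandwidth_hz, sample_rate):
    """Coefficients (a1, a2) of a two-pole resonator:
    y[n] = x[n] - a1*y[n-1] - a2*y[n-2]."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)  # pole radius < 1, so stable
    theta = 2.0 * math.pi * freq_hz / sample_rate        # pole angle
    return (-2.0 * r * math.cos(theta), r * r)

def impulse_train(num_samples, pitch_hz, sample_rate):
    """Simple voiced source: one unit impulse per pitch period."""
    period = int(round(sample_rate / pitch_hz))
    return [1.0 if n % period == 0 else 0.0 for n in range(num_samples)]

def all_pole_filter(signal, a1, a2):
    """Apply one two-pole resonator to the input signal."""
    out = []
    y1 = y2 = 0.0
    for x in signal:
        y = x - a1 * y1 - a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out

def synthesize_vowel(formants, pitch_hz=120.0, sample_rate=8000, duration_s=0.2):
    """Drive a cascade of resonators (the assumed vocal-tract filter)
    with an impulse-train source and return the output samples."""
    num_samples = int(round(sample_rate * duration_s))
    signal = impulse_train(num_samples, pitch_hz, sample_rate)
    for freq_hz, bandwidth_hz in formants:
        a1, a2 = resonator_coeffs(freq_hz, bandwidth_hz, sample_rate)
        signal = all_pole_filter(signal, a1, a2)
    return signal

# Hypothetical (frequency, bandwidth) pairs in Hz for an /a/-like vowel;
# illustrative placeholders, not values from the paper.
FORMANTS_A = [(700.0, 100.0), (1200.0, 120.0), (2600.0, 150.0)]
samples = synthesize_vowel(FORMANTS_A)
```

In a system of the kind the abstract describes, the filter parameters and the source signal would come from the lip-feature estimation stage rather than from fixed constants as here.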
Pages: 42-48
Number of pages: 7