IMPROVED SPEECH RECOGNITION THROUGH VIDEOTELEPHONY - EXPERIMENTS WITH THE HARD-OF-HEARING

Cited by: 16
Authors
FROWEIN, HW [1]
SMOORENBURG, GF [1]
PYTERS, L [1]
SCHINKEL, D [1]
Affiliations
[1] STATE UNIV UTRECHT HOSP,DEPT EXPTL AUDIOL,3511 GV UTRECHT,NETHERLANDS
DOI
10.1109/49.81956
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Subject classification codes
0808; 0809
Abstract
Three experiments were carried out to assess the potential effectiveness of videotelephony as an adjunctive aid to speech reception by the hard of hearing. The experimental procedure consisted of audio-only and audio-visual presentation of standard prerecorded sentences to hard-of-hearing subjects who subsequently had to repeat as many words as possible. The percentage of correctly reported syllables was taken as the speech reception score. Several independent variables related to picture quality parameters. The most important of these were the temporal resolution (frame rate) and the spatial resolution of the video image. The spatial resolutions were QCIF (180 x 144 pixels) and 1CIF (360 x 288 pixels), and for both these resolutions the image was processed through a 64 kb/s codec. The results showed that the speech readability of a video image improves as the frame rate is increased to 15 Hz and that a further increase in frame rate does not result in a further improvement of speech readability. Audio-visual presentation of 64 kb/s coded images with either QCIF or 1CIF led to significantly better speech reception than audio-only, but the scores were lower than the scores for comparable analog broadband images. Implications of these findings are: 1) 64 kb/s videotelephony is potentially effective in improving speech reception for the hard of hearing; 2) In the design and selection of video codecs for this purpose, a frame rate of 15 Hz is recommended; 3) Both QCIF and 1CIF should be regarded as suitable resolutions.
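As a rough illustration of why a 64 kb/s channel constrains picture quality, the sketch below works out the average bit budget per frame and per pixel for the frame rate and resolutions named in the abstract. It assumes, for simplicity, that the entire 64 kb/s is available for video and that every frame gets an equal share; neither assumption comes from the paper, and the function names are hypothetical.

```python
# Illustrative arithmetic only. The channel rate, frame rate, and
# resolutions are taken from the abstract; the assumption that the whole
# channel carries video, split evenly across frames, is ours.

CHANNEL_BITS_PER_SECOND = 64_000  # 64 kb/s codec rate

# Spatial resolutions as stated in the abstract (width, height in pixels).
RESOLUTIONS = {
    "QCIF": (180, 144),
    "1CIF": (360, 288),
}


def bits_per_frame(frame_rate_hz: float) -> float:
    """Average number of bits available for one frame."""
    return CHANNEL_BITS_PER_SECOND / frame_rate_hz


def bits_per_pixel(width: int, height: int, frame_rate_hz: float) -> float:
    """Average number of bits available for one pixel of one frame."""
    return bits_per_frame(frame_rate_hz) / (width * height)


if __name__ == "__main__":
    frame_rate_hz = 15.0  # frame rate recommended in the abstract
    for name, (width, height) in RESOLUTIONS.items():
        budget = bits_per_pixel(width, height, frame_rate_hz)
        print(f"{name}: {width * height} pixels, {budget:.3f} bits per pixel")
```

Under these assumptions the budget is well below one bit per pixel at either resolution, which is consistent with the abstract's observation that coded images scored lower than comparable analog broadband images.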
Pages: 611-616
Page count: 6
Related papers
9 entries in total
[1] BOSMAN AJ, 1990, THESIS STATE U UTREC
[2] BREEUWER M, 1985, THESIS FREE U AMSTER
[3] FROWEIN HW, 1990, 322 PTT RES DOC
[4] LO T, 1990, 13TH P S HUM FACT TE
[5] Ostberg O., 1989, International Journal of Human-Computer Interaction, V1, P149, DOI 10.1080/10447318909525963
[6] PEARSON, D. VISUAL COMMUNICATION-SYSTEMS FOR THE DEAF [J]. IEEE TRANSACTIONS ON COMMUNICATIONS, 1981, 29 (12): 1986-1992
[7] PLOMPEN R, 1989, THESIS PTT RES NEHER
[8] PYTERS L, 1990, VIDEOTELEPHONY HARD
[9] SPERLING, G. VIDEO TRANSMISSION OF AMERICAN SIGN LANGUAGE AND FINGER SPELLING - PRESENT AND PROJECTED BANDWIDTH REQUIREMENTS [J]. IEEE TRANSACTIONS ON COMMUNICATIONS, 1981, 29 (12): 1993-2002