A video based interface to textual information for the visually impaired

Cited by: 15
Authors
Zandifar, A [1]
Duraiswami, R [1]
Chahine, A [1]
Davis, LS [1]
Affiliation
[1] Univ Maryland, Perceptual Interfaces & Real Lab, College Pk, MD 20742 USA
Source
FOURTH IEEE INTERNATIONAL CONFERENCE ON MULTIMODAL INTERFACES, PROCEEDINGS | 2002
DOI
10.1109/ICMI.2002.1167016
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We describe the development of an interface to textual information for the visually impaired that uses video, image processing, optical-character-recognition (OCR) and text-to-speech (TTS). The video provides a sequence of low resolution images in which text must be detected, rectified and converted into high resolution rectangular blocks that are capable of being analyzed via off-the-shelf OCR. To achieve this, various problems related to feature detection, mosaicing, auto-focus, zoom, and systems integration were solved in the development of the system, and these are described.
Pages: 325-330
Page count: 6