A video based interface to textual information for the visually impaired

Cited by: 15

Authors:

Zandifar, A [1]

Duraiswami, R [1]

Chahine, A [1]

Davis, LS [1]

Affiliation:

[1] Univ Maryland, Perceptual Interfaces & Real Lab, College Pk, MD 20742 USA

Source:

Fourth IEEE International Conference on Multimodal Interfaces, Proceedings | 2002

DOI:

10.1109/ICMI.2002.1167016

Chinese Library Classification:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

We describe the development of an interface to textual information for the visually impaired that uses video, image processing, optical-character-recognition (OCR) and text-to-speech (TTS). The video provides a sequence of low resolution images in which text must be detected, rectified and converted into high resolution rectangular blocks that are capable of being analyzed via off-the-shelf OCR. To achieve this, various problems related to feature detection, mosaicing, auto-focus, zoom, and systems integration were solved in the development of the system, and these are described.

Pages: 325-330

Page count: 6
