Text detection from natural scene images: Towards a system for visually impaired persons

Cited by: 100
Authors
Ezaki, N [1]
Bulacu, M [1]
Schomaker, L [1]
Affiliations
[1] Toba Natl Coll Maritime Technol, Toba, Mie 5178501, Japan
Source
PROCEEDINGS OF THE 17TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOL 2 | 2004
DOI
10.1109/ICPR.2004.1334351
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We propose a system that reads text encountered in natural scenes, with the aim of assisting visually impaired persons. This paper describes the system design and evaluates several character-extraction methods. Automatic text recognition from natural images is receiving growing attention because of potential applications in image retrieval, robotics, and intelligent transport systems. Camera-based document analysis is becoming a real possibility with the increasing resolution and availability of digital cameras. However, in the case of a blind person, finding the text region is the first important problem that must be addressed, because it cannot be assumed that the acquired image contains only characters. Our system first tries to find areas of the image that contain small characters. It then zooms into those areas to capture the higher-resolution images necessary for character recognition. In the present paper, we propose four character-extraction methods based on connected components. We tested the effectiveness of our methods on the ICDAR 2003 Robust Reading Competition data. The performance of the different methods depends on character size. In the data, bigger characters are more prevalent, and the most effective extraction method proves to be the sequence: Sobel edge detection, Otsu binarization, connected-component extraction, and rule-based connected-component filtering.
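The best-performing sequence named in the abstract (Sobel edge detection, Otsu binarization, connected-component extraction, rule-based filtering) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names, the `|gx| + |gy|` magnitude approximation, the use of 8-connectivity, and the filtering rules (`min_area`, `max_aspect`) are all assumptions made for illustration, since the record does not give the paper's actual rules or parameters. The input is assumed to be a grayscale image stored as a list of rows of integers in 0..255.

```python
from collections import deque

def sobel_magnitude(img):
    """3x3 Sobel gradient, approximated as (|gx| + |gy|) // 8 so values stay in 0..255.
    Border pixels are left at 0 (an assumption of this sketch)."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (img[y-1][x+1] + 2 * img[y][x+1] + img[y+1][x+1]
                  - img[y-1][x-1] - 2 * img[y][x-1] - img[y+1][x-1])
            gy = (img[y+1][x-1] + 2 * img[y+1][x] + img[y+1][x+1]
                  - img[y-1][x-1] - 2 * img[y-1][x] - img[y-1][x+1])
            out[y][x] = (abs(gx) + abs(gy)) // 8
    return out

def otsu_threshold(img):
    """Return the threshold t that maximizes the between-class variance (Otsu, 1979).
    Pixels with value > t are treated as foreground by the caller."""
    hist = [0] * 256
    for row in img:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * c for i, c in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t

def connected_components(mask):
    """Label 8-connected foreground regions; return bounding box and area of each."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            x0 = x1 = x
            y0 = y1 = y
            area = 0
            while queue:
                cy, cx = queue.popleft()
                area += 1
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append({"x": x0, "y": y0, "w": x1 - x0 + 1,
                          "h": y1 - y0 + 1, "area": area})
    return boxes

def filter_components(boxes, min_area=8, max_aspect=5.0):
    """Hypothetical rules: drop tiny specks and very elongated regions.
    The thresholds are illustrative, not taken from the paper."""
    kept = []
    for b in boxes:
        aspect = max(b["w"], b["h"]) / min(b["w"], b["h"])
        if b["area"] >= min_area and aspect <= max_aspect:
            kept.append(b)
    return kept

def extract_candidates(img):
    """Run the four stages in order and return candidate character regions."""
    edges = sobel_magnitude(img)
    t = otsu_threshold(edges)
    mask = [[v > t for v in row] for row in edges]
    return filter_components(connected_components(mask))
```

Calling `extract_candidates(img)` returns a list of dictionaries with the bounding box (`x`, `y`, `w`, `h`) and pixel count (`area`) of each surviving region. In practice the rule thresholds would need tuning to the expected character size, which the abstract identifies as the factor that most affects performance.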
Pages: 683-686
Number of pages: 4
Related papers
10 items in total
[1] Doermann D, 2003, PROC INT CONF DOC, P606
[2] EZAKI N, 2003, HUMAN COMPUTER INT 2, V2, P48
[3] GU L, 1997, IEICE JAPAN J, V80, P2696
[4] LIU Y, 1998, IEICE JAPAN J, V81, P641
[5] Lucas SM, 2003, PROC INT CONF DOC, P682
[6] Matsuo K.-I., 2002, Transactions of the Institute of Electrical Engineers of Japan, Part C, V122-C, P232
[7] THRESHOLD SELECTION METHOD FROM GRAY-LEVEL HISTOGRAMS [J]. OTSU, N. IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS, 1979, 9 (01): 62-66
[8] Yamaguchi T, 2003, PROC INT CONF DOC, P359
[9] Yang J., 2001, P 1 INT C HUM LANG T, P1
[10] A video based interface to textual information for the visually impaired [J]. Zandifar, A; Duraiswami, R; Chahine, A; Davis, LS. FOURTH IEEE INTERNATIONAL CONFERENCE ON MULTIMODAL INTERFACES, PROCEEDINGS, 2002: 325-330