Fast handwriting recognition for indexing historical documents

Cited by: 14
Authors
Govindaraju, V [1]
Xue, HH [1]
Affiliations
[1] SUNY Buffalo, Ctr Excellence Document Anal & Recognit, Buffalo, NY 14228 USA
Source
FIRST INTERNATIONAL WORKSHOP ON DOCUMENT IMAGE ANALYSIS FOR LIBRARIES, PROCEEDINGS | 2004
DOI
10.1109/DIAL.2004.1263260
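Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812 [Computer Science and Technology];
Abstract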
Handwriting Recognition (HR) has been successfully used in several applications such as postal address interpretation [1], bank check reading [2], and forms reading [3]. These applications are all characterized by small or fixed lexicons afforded by contextual knowledge. Machine recognition of handwriting in historical documents presents two primary challenges: (i) large lexicons (over 10,000 words), which lead to low recognition accuracy (less than 50%), and (ii) the need for high-speed HR, given the millions of handwritten manuscripts in Digital Library repositories and the fact that speed is usually inversely proportional to lexicon size. This paper addresses the issue of speed when dealing with large lexicons. We present several techniques that improve processing speed, for a gain of up to 7 times in matching time, and describe a method whereby the large lexicon is divided into smaller sets that are processed in parallel. With 4 processors, an 18-fold speedup of the matching phase is achieved.
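The lexicon-splitting idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the record gives no details of their matcher or partitioning scheme. The sketch assumes a round-robin split, uses `difflib.SequenceMatcher` on text strings as a stand-in for a word-image matcher, and the function names `split_lexicon`, `match_sublexicon`, and `recognize` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from difflib import SequenceMatcher
import heapq


def split_lexicon(lexicon, n_parts):
    """Deal words round-robin into n_parts sublexicons of near-equal size."""
    return [lexicon[i::n_parts] for i in range(n_parts)]


def match_sublexicon(query, words, top_n):
    """Score every word in one sublexicon and keep the top_n best.

    SequenceMatcher is only a placeholder for a real handwriting matcher
    (an assumption made for this illustration).
    """
    scored = ((SequenceMatcher(None, query, w).ratio(), w) for w in words)
    return heapq.nlargest(top_n, scored)


def recognize(query, lexicon, n_workers=4, top_n=3):
    """Match each sublexicon in its own worker, then merge the partial results."""
    parts = split_lexicon(lexicon, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial = pool.map(lambda p: match_sublexicon(query, p, top_n), parts)
        merged = [item for part in partial for item in part]
    # The global top_n is always contained in the union of per-part top_n lists.
    return [word for _, word in heapq.nlargest(top_n, merged)]


lexicon = ["buffalo", "library", "manuscript", "recognition",
           "lexicon", "document", "handwriting", "historical"]
best = recognize("handwritng", lexicon, n_workers=4, top_n=1)
```

Threads are used here only to keep the sketch short. For a CPU-bound matcher in Python, separate processes or machines would be needed to obtain a real speedup.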
Pages: 314-320
Page count: 7
Related papers
13 in total
[1]
Impedovo S., 1997, AUTOMATIC BANKCHECK, V28
[2]
KEATON HGP, 1997, P IEEE WORKSH DOC IM
[3]
A lexicon driven approach to handwritten word recognition for real-time applications [J].
Kim, G ;
Govindaraju, V .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 1997, 19 (04) :366-379
[4]
An architecture for handwritten text recognition systems [J].
Kim G. ;
Govindaraju V. ;
Srihari S.N. .
International Journal on Document Analysis and Recognition, 1999, 2 (1) :37-44
[5]
Madhvanath S., 1995, Proceedings of the Third International Conference on Document Analysis and Recognition, P82, DOI 10.1109/ICDAR.1995.598949
[6]
MANMATHA EMR, 1996, DIGITAL LIB 96 1 ACM, P151
[7]
On the influence of vocabulary size and language models in unconstrained handwritten text recognition [J].
Marti, UV ;
Bunke, H .
SIXTH INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION, PROCEEDINGS, 2001, :260-265
[8]
Twenty years of document image analysis in PAMI [J].
Nagy, G .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2000, 22 (01) :38-62
[9]
An off-line cursive handwriting recognition system [J].
Senior, AW ;
Robinson, AJ .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 1998, 20 (03) :309-321
[10]
Srihari SN, 1997, PROC INT CONF DOC, P892, DOI 10.1109/ICDAR.1997.620640