A DATABASE FOR HANDWRITTEN TEXT RECOGNITION RESEARCH

Cited by: 1471
Author
HULL, JJ
Affiliation
[1] Center of Excellence for Document Analysis and Recognition (CEDAR), Department of Computer Science, State University of New York at Buffalo, Buffalo
Keywords
HANDWRITING RECOGNITION; DATABASE; PERFORMANCE ANALYSIS; TESTING
DOI
10.1109/34.291440
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
An image database for handwritten text recognition research is described. Digital images of approximately 5000 city names, 5000 state names, 10000 ZIP Codes, and 50000 alphanumeric characters are included. Each image was scanned from mail in a working post office at 300 pixels/in in 8-bit grayscale on a high-quality flat bed digitizer. The data were unconstrained for the writer, style, and method of preparation. These characteristics help overcome the limitations of earlier databases that contained only isolated characters or were prepared in a laboratory setting under prescribed circumstances. Also, the database is divided into explicit training and testing sets to facilitate the sharing of results among researchers as well as performance comparisons.
Pages: 550-554
Number of pages: 5