Text line segmentation of historical documents: a survey

Cited by: 249
Authors
Likforman-Sulem, Laurence
Zahour, Abderrazak
Taconet, Bruno
Affiliations
[1] Ecole Natl Super Telecommun TSI, GET, F-75013 Paris, France
[2] CNRS, LTCI, F-75013 Paris, France
[3] Univ Havre GED, IUT, F-76610 Le Havre, France
Keywords
segmentation; handwriting; text lines; historical documents; survey
DOI
10.1007/s10032-006-0023-z
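Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 [Pattern recognition and intelligent systems]; 0812 [Computer science and technology]; 0835 [Software engineering]; 1405 [Intelligence science and technology]
Abstract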
A huge number of historical documents in libraries and in various national archives have not been exploited electronically. Although automatic reading of complete pages remains, in most cases, a long-term objective, tasks such as word spotting, text/image alignment, authentication and extraction of specific fields are in use today. For all these tasks, a major step is document segmentation into text lines. Because of the low quality and the complexity of these documents (background noise, artifacts due to aging, interfering lines), automatic text line segmentation remains an open research field. The objective of this paper is to present a survey of existing methods, developed during the last decade and dedicated to documents of historical interest.
Pages: 123-138
Number of pages: 16
Related papers
60 in total
[1]
Off-line Arabic character recognition: The state of the art [J].
Amin, A.
PATTERN RECOGNITION, 1998, 31 (05) :517-530
[2]
Document image analysis for World War II personal records [J].
Antonacopoulos, A.;
Karatzas, D.
FIRST INTERNATIONAL WORKSHOP ON DOCUMENT IMAGE ANALYSIS FOR LIBRARIES, PROCEEDINGS, 2004, :336-341
[3]
ANTONACOPOULOS A, 1994, INT C PATT RECOG, P339, DOI 10.1109/ICPR.1994.576932
[4]
BOUCHE R, 2000, DEBORA DIGITAL ACCES
[5]
BOZZI A, 1995, ERCIM NEWS, V19, P27
[6]
Bruzzone E., 1999, Proceedings of the Fifth International Conference on Document Analysis and Recognition. ICDAR '99 (Cat. No.PR00318), P749, DOI 10.1109/ICDAR.1999.791896
[7]
Calabretto S., 1998, INT J DIGITAL INFORM, V1, P1
[8]
Cohen E., 1991, International Journal of Pattern Recognition and Artificial Intelligence, V5, P221, DOI 10.1142/S0218001491000156
[9]
DEVENTABERT G, 1999, DOCUMENT NUMERIQUE, V3, P57
[10]
DOWNTON A, 2003, P ICDAR 03 ED