Skew detection and text line position determination in digitized documents

Cited by: 68
Authors
Gatos, B
Papamarkos, N
Chamzas, C
Affiliations
[1] DEMOCRITUS UNIV THRACE,DEPT ELECT & COMP ENGN,ELECT CIRCUITS ANAL LAB,GR-67100 XANTHI,GREECE
[2] NATL CTR SCI RES DEMOKRITOS,INST INFORMAT & TELECOMMUN,GR-15310 ATHENS,GREECE
Keywords
skew detection; Hough transform; character recognition; segmentation
DOI
10.1016/S0031-3203(96)00157-4
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper proposes a computationally efficient procedure for skew detection and text line position determination in digitized documents, based on the cross-correlation between the pixels of vertical lines in a document. Determining the skew angle of a document is essential in optical character recognition systems. Because of the text skew, each horizontal text line intersects a predefined set of vertical lines at non-horizontal positions. Using only the pixels on these vertical lines, we construct a correlation matrix and evaluate the skew angle of the document with high accuracy. Using the same matrix, we also compute the positions of the text lines in the document. The proposed method is tested on a variety of mixed-type documents; it gives accurate results and requires only a short computation time. We illustrate the effectiveness of the algorithm by presenting four characteristic examples. © 1997 Pattern Recognition Society. Published by Elsevier Science Ltd.
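The core idea in the abstract, that a skewed text line crosses two vertical scan lines at different heights, can be illustrated with a minimal sketch. This is not the authors' algorithm: the paper builds a correlation matrix over a set of vertical lines, whereas the sketch below correlates only two columns of a binary image and converts the best vertical shift into an angle. The function names `best_shift` and `estimate_skew_degrees`, the list-of-rows image layout, and the `max_shift` search bound are all assumptions made for illustration.

```python
import math

def best_shift(a, b, max_shift):
    """Return the shift s in [-max_shift, max_shift] that maximizes
    sum over y of a[y] * b[y + s] (plain cross-correlation of two columns)."""
    n = len(a)
    best_s, best_score = 0, -1
    for s in range(-max_shift, max_shift + 1):
        # Only rows where both y and y + s fall inside the image contribute.
        score = sum(a[y] * b[y + s] for y in range(max(0, -s), min(n, n - s)))
        if score > best_score:
            best_s, best_score = s, score
    return best_s

def estimate_skew_degrees(img, x1, x2, max_shift):
    """Estimate skew from two vertical scan lines at columns x1 < x2.

    img is a list of rows with 1 for ink and 0 for background (assumed layout).
    Rows grow downward, so a positive angle means text lines drop to the right.
    """
    col1 = [row[x1] for row in img]
    col2 = [row[x2] for row in img]
    s = best_shift(col1, col2, max_shift)
    return math.degrees(math.atan2(s, x2 - x1))
```

In this sketch `max_shift` should stay below the spacing between text lines; otherwise a column can match the neighbouring line and the correlation peak becomes ambiguous. Real scans would also need some smoothing of the columns, since a single pixel column through printed characters is much noisier than the solid bands this sketch implicitly assumes.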
Pages: 1505-1519
Page count: 15