Page segmentation of Chinese newspapers

Cited by: 16
Authors
Xi, J [1]
Hu, JM [1]
Wu, LD [1]
Affiliations
[1] Fudan Univ, Dept Comp Sci, Shanghai 200433, Fudan, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
document layout analysis; page segmentation; run-length smoothing; minimal spanning tree
DOI
10.1016/S0031-3203(01)00248-5
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper describes a new bottom-up method for page segmentation of Chinese document images. Because of some special characteristics of Chinese newspaper documents, many traditional methods developed for English documents fail in segmenting them correctly. Based on run-length smoothing algorithm and minimal spanning tree clustering, the proposed method can resolve the problems of segmenting Chinese documents that differ from English documents. © 2002 Pattern Recognition Society. Published by Elsevier Science Ltd. All rights reserved.
Pages: 2695-2704
Number of pages: 10