A REVIEW OF SEGMENTATION AND CONTEXTUAL ANALYSIS TECHNIQUES FOR TEXT RECOGNITION

Cited by: 33
Authors
ELLIMAN, DG [1]
LANCASTER, IT [1]
Affiliation
[1] PAFEC LTD, NOTTINGHAM NG8 9PE, ENGLAND
Keywords
Contextual processing; Dictionary structure; Levenshtein Distance; N-gram techniques; Markov methods; Text processing; Text recognition; Text segmentation; Viterbi Algorithm
DOI
10.1016/0031-3203(90)90021-C
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents a review of the literature on text-processing techniques. The techniques covered are segmentation of text and contextual recognition, both of which are required when considering text recognition of documents. Several different methods of text segmentation are compared for various image data formats. The problem of correct segmentation of joined and broken characters is also considered. Two techniques for contextual recognition are considered: the Markov-based methods and dictionary look-up methods. The various techniques for storing dictionary information are compared. A discussion of the importance of the choice of the correct context is given, together with guidance on which methods are best suited to which applications. © 1990.
Pages: 337-346
Page count: 10