TECHNIQUES FOR AUTOMATICALLY CORRECTING WORDS IN TEXT

Cited by: 160
Author
KUKICH, K
Affiliation
[1] Bellcore, Morristown, United States
Keywords
CONTEXT-DEPENDENT SPELLING CORRECTION; GRAMMAR CHECKING; NATURAL-LANGUAGE-PROCESSING MODELS; NEURAL NET CLASSIFIERS; N-GRAM ANALYSIS; OPTICAL CHARACTER RECOGNITION (OCR); SPELL CHECKING; SPELLING ERROR DETECTION; SPELLING ERROR PATTERNS; STATISTICAL-LANGUAGE MODELS; WORD RECOGNITION AND CORRECTION;
DOI
10.1145/146370.146380
Chinese Library Classification
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Research aimed at correcting words in text has focused on three progressively more difficult problems: (1) nonword error detection; (2) isolated-word error correction; and (3) context-dependent word correction. In response to the first problem, efficient pattern-matching and n-gram analysis techniques have been developed for detecting strings that do not appear in a given word list. In response to the second problem, a variety of general and application-specific spelling correction techniques have been developed. Some of them were based on detailed studies of spelling error patterns. In response to the third problem, a few experiments using natural-language-processing tools or statistical-language models have been carried out. This article surveys documented findings on spelling error patterns, provides descriptions of various nonword detection and isolated-word error correction techniques, reviews the state of the art of context-dependent word correction techniques, and discusses research issues related to all three areas of automatic error correction in text.
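As an illustration of the first two problems named in the abstract, the following is a minimal sketch, not a method taken from the article itself. It assumes a tiny in-memory word list, character trigrams padded with a `#` boundary mark, and plain Levenshtein distance as the ranking measure for isolated-word correction. All function names here are hypothetical.

```python
def char_ngrams(word, n=3):
    """Return the set of character n-grams of a word, padded with '#' boundary marks."""
    padded = "#" + word.lower() + "#"
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def build_ngram_table(lexicon, n=3):
    """Collect every character n-gram that occurs in the word list."""
    table = set()
    for w in lexicon:
        table |= char_ngrams(w, n)
    return table


def is_nonword(word, lexicon):
    """Nonword detection by lookup: the string is absent from the word list."""
    return word.lower() not in lexicon


def has_unseen_ngram(word, table, n=3):
    """Nonword detection by n-gram analysis: some n-gram never occurs in the word list."""
    return not char_ngrams(word, n) <= table


def edit_distance(a, b):
    """Levenshtein distance (insertions, deletions, substitutions), row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def suggest(word, lexicon, max_distance=2):
    """Isolated-word correction: rank word-list entries by distance to the input."""
    scored = [(edit_distance(word.lower(), w), w) for w in lexicon]
    return [w for d, w in sorted(scored) if d <= max_distance]
```

The sketch does not address the third problem, context-dependent correction, because a real-word error such as "form" typed for "from" passes both detection checks and can only be caught by looking at the surrounding words.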
Pages: 377-439
Number of pages: 63