FAST STRING SEARCHING

被引:105
作者
HUME, A [1 ]
SUNDAY, D [1 ]
机构
[1] JOHNS HOPKINS UNIV,APPL PHYS LAB,LAUREL,MD 20723
关键词
STRING SEARCHING; PATTERN MATCHING; BOYER-MOORE;
D O I
10.1002/spe.4380211105
中图分类号
TP31 [计算机软件];
学科分类号
081202 ; 0835 ;
摘要
Since the Boyer-Moore algorithm was described in 1977, it has been the standard benchmark for the practical string search literature. Yet this yardstick compares badly with current practice. We describe two algorithms that perform 47% fewer comparisons and are about 4.5 times faster across a wide range of architectures and compilers. These new variants are members of a family of algorithms based on the skip loop structure of the preferred, but often neglected, fast form of Boyer-Moore. We present a taxonomy for this family, and describe a toolkit of components that can be used to design an algorithm most appropriate for a given set of requirements.
引用
收藏
页码:1221 / 1248
页数:28
相关论文
共 25 条
[1]   THE BOYER-MOORE-GALIL STRING SEARCHING STRATEGIES REVISITED [J].
APOSTOLICO, A ;
GIANCARLO, R .
SIAM JOURNAL ON COMPUTING, 1986, 15 (01) :98-105
[2]   IMPROVED STRING SEARCHING [J].
BAEZAYATES, RA .
SOFTWARE-PRACTICE & EXPERIENCE, 1989, 19 (03) :257-271
[3]  
Bentley J. L., 1982, WRITING EFFICIENT PR
[4]   THE GENBANK GENETIC SEQUENCE DATA-BANK [J].
BILOFSKY, HS ;
BURKS, C .
NUCLEIC ACIDS RESEARCH, 1988, 16 (05) :1861-1863
[5]  
BOYER RS, 1977, COMMUN ACM, V20, P262
[6]  
COLE R, 1990, 512 NEW YORK U COMP
[7]  
Colussi L., 1990, 31ST FOCS ST LOUIS, VI, P135
[8]   DISTRIBUTION OF MATHEMATICAL SOFTWARE VIA ELECTRONIC MAIL [J].
DONGARRA, JJ ;
GROSSE, E .
COMMUNICATIONS OF THE ACM, 1987, 30 (05) :403-407
[9]  
GIANCARLO R, 1990, COMMUNICATION DEC
[10]  
HAERTEL M, 1989, USENET ARCH COMP SOU, V17