Efficient implementation of lazy suffix trees

被引:43
作者
Giegerich, R
Kurtz, S
Stoye, J
机构
[1] Univ Bielefeld, AG Genominformat, Fac Technol, D-33594 Bielefeld, Germany
[2] Univ Hamburg, Ctr Bioinformat, D-20146 Hamburg, Germany
关键词
string matching; suffix tree; space-efficient implementation; lazy evaluation; CONSTRUCTION; TIME;
D O I
10.1002/spe.535
中图分类号
TP31 [计算机软件];
学科分类号
081202 ; 0835 ;
摘要
We present an efficient implementation of a write-only top-down construction for suffix trees. Our implementation is based on a new, space-efficient representation of suffix trees that requires only 12 bytes per input character in the worst case, and 8.5 bytes per input character on average for a collection of files of different type. We show how to efficiently implement the lazy evaluation of suffix trees such that a subtree is evaluated only when it is traversed for the first time. Our experiments show that for the problem of searching many exact patterns in a fixed input string, the lazy top-down construction is often faster and more space efficient than other methods. Copyright (C) 2003 John Wiley Sons, Ltd.
引用
收藏
页码:1035 / 1049
页数:15
相关论文
共 32 条
[1]  
Allauzen C, 1999, LECT NOTES COMPUT SC, V1725, P295
[2]   EFFICIENT IMPLEMENTATION OF SUFFIX TREES [J].
ANDERSSON, A ;
NILSSON, S .
SOFTWARE-PRACTICE & EXPERIENCE, 1995, 25 (02) :129-141
[3]  
[Anonymous], 1997, ART COMPUTER PROGRAM
[4]   PARALLEL CONSTRUCTION OF A SUFFIX TREE WITH APPLICATIONS [J].
APOSTOLICO, A ;
ILIOPOULOS, C ;
LANDAU, GM ;
SCHIEBER, B ;
VISHKIN, U .
ALGORITHMICA, 1988, 3 (03) :347-365
[5]  
Apostolico A., 1985, Combinatorial Algorithms onWords, P85, DOI DOI 10.1007/978-3-642-82456-26
[6]  
Clark DR, 1996, PROCEEDINGS OF THE SEVENTH ANNUAL ACM-SIAM SYMPOSIUM ON DISCRETE ALGORITHMS, P383
[7]   A time and space efficient data structure for string searching on large texts [J].
Colussi, L ;
DeCol, A .
INFORMATION PROCESSING LETTERS, 1996, 58 (05) :217-222
[8]  
Cormen TH, 2022, Introduction to algorithms, P390, DOI DOI 10.1145/963770.963776
[9]  
Dorohonceanu B, 2000, Proc Int Conf Intell Syst Mol Biol, V8, P128
[10]   Optimal suffix tree construction with large alphabets [J].
Farach, M .
38TH ANNUAL SYMPOSIUM ON FOUNDATIONS OF COMPUTER SCIENCE, PROCEEDINGS, 1997, :137-143