A GENERALIZED SUFFIX TREE AND ITS (UN)EXPECTED ASYMPTOTIC BEHAVIORS

被引:50
作者
SZPANKOWSKI, W
机构
[1] Purdue Univ, West Lafayette, IN
关键词
GENERALIZED SUFFIX TREES; ALGORITHMS ON WORDS; DATA COMPRESSION; HEIGHT; SHORTEST PATH; TYPICAL DEPTH AND DEPTH OF INSERTION; PROBABILISTIC MODELS; MIXING CONDITION; RENYIS ENTROPY;
D O I
10.1137/0222070
中图分类号
TP301 [理论、方法];
学科分类号
081202 ;
摘要
Suffix trees find several applications in computer science and telecommunications, most notably in algorithms on strings, data compressions, and codes. Despite this, very little is known about their typical behaviors. In a probabilistic framework, a family of suffix trees-further called b-suffix trees-built from the first n suffixes of a random word is considered. In this family a noncompact suffix tree (i.e., such that every edge is labeled by a single symbol) is represented by b = 1, and a compact suffix tree (i.e., without unary nodes) is asymptotically equivalent to b --> infinity as n --> infinity. Several parameters of b-suffix trees are studied, namely, the depth of a given suffix, the depth of insertion, the height and the shortest feasible path. Some new results concerning typical (i.e., almost sure) behaviors of these parameters are established. These findings are used to obtain several insights into certain algorithms on words, molecular biology, and universal data compression schemes.
引用
收藏
页码:1176 / 1198
页数:23
相关论文
共 44 条
[1]  
Aho A. V., 1974, DESIGN ANAL COMPUTER
[2]   OPTIMAL OFF-LINE DETECTION OF REPETITIONS IN A STRING [J].
APOSTOLICO, A ;
PREPARATA, FP .
THEORETICAL COMPUTER SCIENCE, 1983, 22 (03) :297-315
[3]   SELF-ALIGNMENTS IN WORDS AND THEIR APPLICATIONS [J].
APOSTOLICO, A ;
SZPANKOWSKI, W .
JOURNAL OF ALGORITHMS, 1992, 13 (03) :446-467
[4]  
Apostolico A, 1985, COMBINATORIAL ALGO F, V12, P85
[5]  
Billingsley P., 1965, ERGODIC THEORY INFOR
[6]   AVERAGE SIZES OF SUFFIX TREES AND DAWGS [J].
BLUMER, A ;
EHRENFEUCHT, A ;
HAUSSLER, D .
DISCRETE APPLIED MATHEMATICS, 1989, 24 (1-3) :37-45
[7]   FAST STRING SEARCHING ALGORITHM [J].
BOYER, RS ;
MOORE, JS .
COMMUNICATIONS OF THE ACM, 1977, 20 (10) :762-772
[8]  
CHANG W, 1990, 1990 P FOCS, P116
[9]   ON THE APPLICATION OF THE BOREL-CANTELLI LEMMA [J].
CHUNG, KL ;
ERDOS, P .
TRANSACTIONS OF THE AMERICAN MATHEMATICAL SOCIETY, 1952, 72 (JAN) :179-186
[10]  
CSISZAR I, 1981, INFORMATION THEORY C