Annotated statistical indices for sequence analysis
被引:5
作者:
Apostolico, A
论文数: 0引用数: 0
h-index: 0
机构:
Purdue Univ, Dept Comp Sci, W Lafayette, IN 47907 USAPurdue Univ, Dept Comp Sci, W Lafayette, IN 47907 USA
Apostolico, A
[1
]
Bock, ME
论文数: 0引用数: 0
h-index: 0
机构:
Purdue Univ, Dept Comp Sci, W Lafayette, IN 47907 USAPurdue Univ, Dept Comp Sci, W Lafayette, IN 47907 USA
Bock, ME
[1
]
Xu, XY
论文数: 0引用数: 0
h-index: 0
机构:
Purdue Univ, Dept Comp Sci, W Lafayette, IN 47907 USAPurdue Univ, Dept Comp Sci, W Lafayette, IN 47907 USA
Xu, XY
[1
]
机构:
[1] Purdue Univ, Dept Comp Sci, W Lafayette, IN 47907 USA
来源:
COMPRESSION AND COMPLEXITY OF SEQUENCES 1997 - PROCEEDINGS
|
1998年
关键词:
design and analysis of algorithms;
combinatorics on strings;
pattern matching;
substring statistics;
suffix tree;
annotated suffix tree;
period of a string;
repetition in a string;
D O I:
10.1109/SEQUEN.1997.666917
中图分类号:
TP [自动化技术、计算机技术];
学科分类号:
0812 ;
摘要:
A statistical index for string x is a digital-search tree or trie that returns, for any query string w and in a number of comparisons bounded by the length of w, the number of occurrences of w in x. Clever algorithms are available that support the construction and weighting of such indices in time and space linear in the length of x. This paper addresses the problem of annotating a statistical index with such parameters as the expected value and variance of the number of occurrences of each substring.