Annotated statistical indices for sequence analysis

被引:5
作者
Apostolico, A [1 ]
Bock, ME [1 ]
Xu, XY [1 ]
机构
[1] Purdue Univ, Dept Comp Sci, W Lafayette, IN 47907 USA
来源
COMPRESSION AND COMPLEXITY OF SEQUENCES 1997 - PROCEEDINGS | 1998年
关键词
design and analysis of algorithms; combinatorics on strings; pattern matching; substring statistics; suffix tree; annotated suffix tree; period of a string; repetition in a string;
D O I
10.1109/SEQUEN.1997.666917
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
A statistical index for string x is a digital-search tree or trie that returns, for any query string w and in a number of comparisons bounded by the length of w, the number of occurrences of w in x. Clever algorithms are available that support the construction and weighting of such indices in time and space linear in the length of x. This paper addresses the problem of annotating a statistical index with such parameters as the expected value and variance of the number of occurrences of each substring.
引用
收藏
页码:215 / 229
页数:15
相关论文
empty
未找到相关数据