Sensitivity analysis and efficient method for identifying optimal spaced seeds

Cited by: 44
Authors
Choi, KP
Zhang, LX
Affiliations
[1] Natl Univ Singapore, Dept Math, Singapore 117543, Singapore
[2] Natl Univ Singapore, Dept Stat & Appl Probabil, Singapore 117543, Singapore
[3] Natl Univ Singapore, Dept Math, Singapore 117543, Singapore
Keywords
sequence comparison; pattern matching; filtration technique; spaced seeds; sensitivity analysis; heuristic algorithm
DOI
10.1016/j.jcss.2003.04.002
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The introduction of the spaced seed idea in the filtration stage of sequence comparison by Ma et al. (Bioinformatics 18 (2002) 440) has greatly increased the sensitivity of homology search without compromising search speed. Finding optimal spaced seeds is of great importance both theoretically and for designing better search tools for sequence comparison. In this paper, we study the computational aspects of calculating the hitting probability of spaced seeds and, based on these results, propose an efficient algorithm for identifying optimal spaced seeds. (C) 2003 Elsevier Inc. All rights reserved.
Pages: 22-40
Page count: 19