Local sequence alignments with monotonic gap penalties

被引:20
作者
Mott, R [1 ]
机构
[1] SmithKline Beecham Pharmaceut, Res & Dev, Harlow CM19 5AW, Essex, England
关键词
D O I
10.1093/bioinformatics/15.6.455
中图分类号
Q5 [生物化学];
学科分类号
071010 ; 081704 ;
摘要
Motivation: Sequence alignments obtained using affine gap penalties are not always biologically correct, because the insertion of long gaps is over-penalised. There is a need for an efficient algorithm which can find local alignments using non-linear gap penalties. Results: A dynamic programming algorithm is described which computes optimal focal sequence alignments for arbitrary, monotonically increasing gap penalties, i.e. where the cost g(k) of inserting a gap of k symbols is such that g(k) greater than or equal to g(k - 1). The running time of the algorithm is dependent on the scoring scheme; if the expected score of art alignment between random, unrelated sequences of lengths m, n is proportional to logmn, then with one exception, the algorithm has expected running time O(mn). Elsewhere, the running time is no greater than O(mn(m + n)). Optimisations are described which appear to reduce the worst-case run-time to O(mn) in many cases. We show how using a non-affine gap penalty cart dramatically increase the probability of detecting a similarity containing a long gap. Availability: The source code is available to academic collaborators under licence.
引用
收藏
页码:455 / 462
页数:8
相关论文
共 26 条
[1]  
Altschul SF, 1998, PROTEINS, V32, P88, DOI 10.1002/(SICI)1097-0134(19980701)32:1<88::AID-PROT10>3.3.CO
[2]  
2-X
[3]  
Altschul SF, 1996, METHOD ENZYMOL, V266, P460
[4]   Gapped BLAST and PSI-BLAST: a new generation of protein database search programs [J].
Altschul, SF ;
Madden, TL ;
Schaffer, AA ;
Zhang, JH ;
Zhang, Z ;
Miller, W ;
Lipman, DJ .
NUCLEIC ACIDS RESEARCH, 1997, 25 (17) :3389-3402
[5]   BASIC LOCAL ALIGNMENT SEARCH TOOL [J].
ALTSCHUL, SF ;
GISH, W ;
MILLER, W ;
MYERS, EW ;
LIPMAN, DJ .
JOURNAL OF MOLECULAR BIOLOGY, 1990, 215 (03) :403-410
[6]   STOCHASTIC SCRABBLE - LARGE DEVIATIONS FOR SEQUENCES WITH SCORES [J].
ARRATIA, R ;
MORRIS, P ;
WATERMAN, MS .
JOURNAL OF APPLIED PROBABILITY, 1988, 25 (01) :106-119
[7]   EMPIRICAL AND STRUCTURAL MODELS FOR INSERTIONS AND DELETIONS IN THE DIVERGENT EVOLUTION OF PROTEINS [J].
BENNER, SA ;
COHEN, MA ;
GONNET, GH .
JOURNAL OF MOLECULAR BIOLOGY, 1993, 229 (04) :1065-1082
[8]   SPEEDING UP DYNAMIC-PROGRAMMING WITH APPLICATIONS TO MOLECULAR-BIOLOGY [J].
GALIL, Z ;
GIANCARLO, R .
THEORETICAL COMPUTER SCIENCE, 1989, 64 (01) :107-118
[9]  
GOTOH O, 1990, B MATH BIOL, V52, P359, DOI 10.1016/S0092-8240(05)80216-2
[10]   AN IMPROVED ALGORITHM FOR MATCHING BIOLOGICAL SEQUENCES [J].
GOTOH, O .
JOURNAL OF MOLECULAR BIOLOGY, 1982, 162 (03) :705-708