Local sequence alignments with monotonic gap penalties

被引:20
作者
Mott, R [1 ]
机构
[1] SmithKline Beecham Pharmaceut, Res & Dev, Harlow CM19 5AW, Essex, England
关键词
D O I
10.1093/bioinformatics/15.6.455
中图分类号
Q5 [生物化学];
学科分类号
071010 ; 081704 ;
摘要
Motivation: Sequence alignments obtained using affine gap penalties are not always biologically correct, because the insertion of long gaps is over-penalised. There is a need for an efficient algorithm which can find local alignments using non-linear gap penalties. Results: A dynamic programming algorithm is described which computes optimal focal sequence alignments for arbitrary, monotonically increasing gap penalties, i.e. where the cost g(k) of inserting a gap of k symbols is such that g(k) greater than or equal to g(k - 1). The running time of the algorithm is dependent on the scoring scheme; if the expected score of art alignment between random, unrelated sequences of lengths m, n is proportional to logmn, then with one exception, the algorithm has expected running time O(mn). Elsewhere, the running time is no greater than O(mn(m + n)). Optimisations are described which appear to reduce the worst-case run-time to O(mn) in many cases. We show how using a non-affine gap penalty cart dramatically increase the probability of detecting a similarity containing a long gap. Availability: The source code is available to academic collaborators under licence.
引用
收藏
页码:455 / 462
页数:8
相关论文
共 26 条
[21]   IDENTIFICATION OF COMMON MOLECULAR SUBSEQUENCES [J].
SMITH, TF ;
WATERMAN, MS .
JOURNAL OF MOLECULAR BIOLOGY, 1981, 147 (01) :195-197
[22]  
Waterman M., 1995, INTRO COMPUTATIONAL
[23]   EFFICIENT SEQUENCE ALIGNMENT ALGORITHMS [J].
WATERMAN, MS .
JOURNAL OF THEORETICAL BIOLOGY, 1984, 108 (03) :333-337
[24]   SEQUENCE COMPARISON SIGNIFICANCE AND POISSON APPROXIMATION [J].
WATERMAN, MS ;
VINGRON, M .
STATISTICAL SCIENCE, 1994, 9 (03) :367-381
[25]  
WATERMAN MS, 1994, P NATL ACAD SCI USA, V91, P4635
[26]  
WATERMAN MS, 1987, P NATL ACAD SCI US, V84, P239