Bayesian adaptive sequence alignment algorithms

Cited by: 78
Authors
Zhu, J
Liu, JS
Lawrence, CE [1]
Affiliations
[1] New York State Dept Hlth, Wadsworth Ctr Labs & Res, Biometr Lab, Albany, NY 12201 USA
[2] Stanford Univ, Dept Stat, Stanford, CA 94305 USA
DOI
10.1093/bioinformatics/14.1.25
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
The selection of a scoring matrix and gap penalty parameters continues to be an important problem in sequence alignment. We describe here an algorithm, the 'Bayes block aligner', which bypasses this requirement. Instead of requiring a fixed set of parameter settings, this algorithm returns the Bayesian posterior probability for the number of gaps and for the scoring matrices in any series of interest. Furthermore, instead of returning the single best alignment for the chosen parameter settings, this algorithm returns the posterior distribution of all alignments considering the full range of gapping and scoring matrices selected, weighing each in proportion to its probability based on the data. We compared the Bayes aligner with the popular Smith-Waterman algorithm with parameter settings from the literature which had been optimized for the identification of structural neighbors, and found that the Bayes aligner correctly identified more structural neighbors. In a detailed examination of the alignment of a pair of kinase and a pair of GTPase sequences, we illustrate the algorithm's potential to identify subsequences that are conserved to different degrees. In addition, this example shows that the Bayes aligner returns an alignment-free assessment of the distance between a pair of sequences. Availability: Software is available at http://www.wadsworth.org/res&res/bioinfo/ Contact: junzhu, lawrence@wadsworth.org, jliu@stat.stanford.edu.
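The idea in the abstract of weighing each scoring matrix by its probability over all alignments, rather than scoring only the single best alignment, can be illustrated with a toy sketch. This is not the Bayes block aligner: the DNA-style pair model, the `p_match` parameterization of a "scoring matrix", the fixed `gap_weight`, and the function names are all assumptions made for illustration only.

```python
def alignment_sum(x, y, p_match, gap_weight=0.05):
    """Sum of weights over ALL global alignments of x and y.

    Toy assumption: a 'scoring matrix' is reduced to one number, p_match,
    the probability that an aligned pair of bases is identical.
    """
    def ratio(a, b):
        # Likelihood ratio of an aligned pair versus two independent
        # uniform bases (4-letter alphabet, background 1/4 each).
        joint = p_match / 4 if a == b else (1 - p_match) / 12
        return joint / (1 / 16)

    n, m = len(x), len(y)
    f = [[0.0] * (m + 1) for _ in range(n + 1)]
    f[0][0] = 1.0  # weight of the empty alignment
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0 and j > 0:
                # Extend every alignment of the prefixes with an aligned pair.
                f[i][j] += ratio(x[i - 1], y[j - 1]) * f[i - 1][j - 1]
            if i > 0:
                # Extend with a gap opposite x[i-1].
                f[i][j] += gap_weight * f[i - 1][j]
            if j > 0:
                # Extend with a gap opposite y[j-1].
                f[i][j] += gap_weight * f[i][j - 1]
    return f[n][m]


def matrix_posterior(x, y, candidates, prior=None):
    """Posterior over candidate matrices: prior times summed alignment weight,
    normalized over the candidates (uniform prior if none is given)."""
    if prior is None:
        prior = {name: 1.0 / len(candidates) for name in candidates}
    weights = {
        name: prior[name] * alignment_sum(x, y, p)
        for name, p in candidates.items()
    }
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


if __name__ == "__main__":
    # Hypothetical candidates: a matrix suited to close sequences and one
    # suited to distant sequences.
    candidates = {"close": 0.9, "distant": 0.4}
    print(matrix_posterior("ACGTACGT", "ACGTACGT", candidates))
```

The recursion replaces the maximum used in best-alignment dynamic programming with a sum, so every alignment contributes to the weight of each candidate matrix. The normalization step is ordinary Bayes' rule over the candidate set.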
Pages: 25-39
Number of pages: 15