Bayesian selection of continuous-time Markov chain evolutionary models

Citations: 659
Authors
Suchard, MA
Weiss, RE
Sinsheimer, JS [1]
Affiliations
[1] Univ Calif Los Angeles, Sch Med, Dept Human Genet, Los Angeles, CA 90095 USA
[2] Univ Calif Los Angeles, Sch Med, Dept Biomath, Los Angeles, CA 90095 USA
[3] Univ Calif Los Angeles, Sch Publ Hlth, Dept Biostat, Los Angeles, CA 90095 USA
Keywords
phylogenetics; Markov chain Monte Carlo; nested hypothesis testing; Bayes factors
DOI
10.1093/oxfordjournals.molbev.a003872
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
We develop a reversible jump Markov chain Monte Carlo approach to estimating the posterior distribution of phylogenies based on aligned DNA/RNA sequences under several hierarchical evolutionary models. Using a proper, yet nontruncated and uninformative prior, we demonstrate the advantages of the Bayesian approach to hypothesis testing and estimation in phylogenetics by comparing different models for the infinitesimal rates of change among nucleotides, for the number of rate classes, and for the relationships among branch lengths. We compare the relative probabilities of these models and the appropriateness of a molecular clock using Bayes factors. Our most general model, first proposed by Tamura and Nei, parameterizes the infinitesimal change probabilities among nucleotides (A, G, C, T/U) into six parameters, consisting of three parameters for the nucleotide stationary distribution, two rate parameters for nucleotide transitions, and another parameter for nucleotide transversions. Nested models include the Hasegawa, Kishino, and Yano model with equal transition rates and the Kimura model with a uniform stationary distribution and equal transition rates. To illustrate our methods, we examine simulated data, 16S rRNA sequences from 15 contemporary eubacteria, halobacteria, eocytes, and eukaryotes, 9 primates, and the entire HIV genome of 11 isolates. We find that the Kimura model is too restrictive, that the Hasegawa, Kishino, and Yano model can be rejected for some data sets, that there is evidence for more than one rate class and a molecular clock among similar taxa, and that a molecular clock can be rejected for more distantly related taxa.
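The nesting of the three substitution models described in the abstract can be illustrated with a minimal sketch. It assumes the common textbook form of the Tamura-Nei rate matrix, in which the rate from nucleotide i to j is a rate multiplier times the stationary frequency of j. The function names, the argument names (`alpha_purine`, `alpha_pyrimidine`, `beta`), and the A, G, C, T ordering are illustrative choices, not taken from the paper, and the paper's own parameterization and scaling may differ.

```python
# Hypothetical helper names; a sketch of the standard TN93 form, not the paper's code.
NUCLEOTIDES = ("A", "G", "C", "T")
PURINES = {"A", "G"}


def tn93_rate_matrix(pi, alpha_purine, alpha_pyrimidine, beta):
    """Build a 4x4 infinitesimal rate matrix (rows/columns ordered A, G, C, T).

    pi               -- dict of stationary frequencies (three free parameters)
    alpha_purine     -- rate multiplier for A<->G transitions
    alpha_pyrimidine -- rate multiplier for C<->T transitions
    beta             -- rate multiplier for all transversions
    """
    if abs(sum(pi[n] for n in NUCLEOTIDES) - 1.0) > 1e-9:
        raise ValueError("stationary frequencies must sum to 1")
    q = []
    for row_index, i in enumerate(NUCLEOTIDES):
        row = []
        for j in NUCLEOTIDES:
            if i == j:
                row.append(0.0)  # placeholder, filled in below
            elif (i in PURINES) != (j in PURINES):
                row.append(beta * pi[j])  # transversion
            elif i in PURINES:
                row.append(alpha_purine * pi[j])  # purine transition
            else:
                row.append(alpha_pyrimidine * pi[j])  # pyrimidine transition
        row[row_index] = -sum(row)  # each row of a rate matrix sums to zero
        q.append(row)
    return q


def hky85_rate_matrix(pi, alpha, beta):
    """Nested model: both transition rates are constrained to be equal."""
    return tn93_rate_matrix(pi, alpha, alpha, beta)


def k80_rate_matrix(alpha, beta):
    """Nested model: equal transition rates and a uniform stationary distribution."""
    uniform = {n: 0.25 for n in NUCLEOTIDES}
    return hky85_rate_matrix(uniform, alpha, beta)
```

Writing the restricted models as calls into the general one makes the nesting explicit: each constraint removes free parameters, which is what allows the models to be compared as nested hypotheses.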
Pages: 1001-1013
Page count: 13