Parallel algorithms for Bayesian phylogenetic inference

被引:29
作者
Feng, XZ
Buell, DA [1 ]
Rose, JR
Waddell, PJ
机构
[1] Univ S Carolina, Dept Comp Sci & Engn, Columbia, SC 29208 USA
[2] Univ S Carolina, Dept Stat, Columbia, SC 29208 USA
[3] Univ S Carolina, Dept Biol Sci, Columbia, SC 29208 USA
关键词
phylogenetic inference; MCMC; distributed parallel computation;
D O I
10.1016/S0743-7315(03)00079-0
中图分类号
TP301 [理论、方法];
学科分类号
081202 ;
摘要
The combination of a Markov chain Monte Carlo (MCMC) method with likelihood-based assessment of phylogenies is becoming a popular alternative to direct likelihood optimization. However, MCMC, like maximum likelihood, is a computationally expensive method. To approximate the posterior distribution of phylogenies, a Markov chain is constructed, using the Metropolis algorithm, such that the chain has the posterior distribution of the parameters of phylogenies as its stationary distribution. This paper describes parallel algorithms and their MPI-based parallel implementation for MCMC-based Bayesian phylogenetic inference. Bayesian phylogenetic inference is computationally expensive both in time and in memory requirements. Our variations on MCMC and their implementation were done to permit the study of large phylogenetic problems. In our approach, we can distribute either entire chains or parts of a chain to different processors, since in current models the columns of the data are independent. Evaluations on a 32-node Beowulf cluster suggest the problem scales well. A number of important points are identified, including a superlinear speedup due to more effective cache usage and the point at which additional processors slow down the process due to communication overhead. (C) 2003 Elsevier Inc. All rights reserved.
引用
收藏
页码:707 / 718
页数:12
相关论文
共 28 条
[1]  
ALTEKAR G, 2002, TR784 U ROCH DEP COM
[2]  
[Anonymous], MOL SYSTEMATICS
[3]   Comparison of Bayesian and maximum likelihood bootstrap measures of phylogenetic reliability [J].
Douady, CJ ;
Delsuc, F ;
Boucher, Y ;
Doolittle, WF ;
Douzery, EJP .
MOLECULAR BIOLOGY AND EVOLUTION, 2003, 20 (02) :248-254
[4]  
DURBIN R, 1997, BIOL SEQUENCE ANAL P
[5]   EVOLUTIONARY TREES FROM DNA-SEQUENCES - A MAXIMUM-LIKELIHOOD APPROACH [J].
FELSENSTEIN, J .
JOURNAL OF MOLECULAR EVOLUTION, 1981, 17 (06) :368-376
[6]  
GEYER CJ, 1991, COMPUTING SCIENCE AND STATISTICS, P156
[7]  
Green PJ, 1995, BIOMETRIKA, V82, P711, DOI 10.2307/2337340
[8]   MONTE-CARLO SAMPLING METHODS USING MARKOV CHAINS AND THEIR APPLICATIONS [J].
HASTINGS, WK .
BIOMETRIKA, 1970, 57 (01) :97-&
[9]   MRBAYES: Bayesian inference of phylogenetic trees [J].
Huelsenbeck, JP ;
Ronquist, F .
BIOINFORMATICS, 2001, 17 (08) :754-755
[10]   EVALUATION OF THE MAXIMUM-LIKELIHOOD ESTIMATE OF THE EVOLUTIONARY TREE TOPOLOGIES FROM DNA-SEQUENCE DATA, AND THE BRANCHING ORDER IN HOMINOIDEA [J].
KISHINO, H ;
HASEGAWA, M .
JOURNAL OF MOLECULAR EVOLUTION, 1989, 29 (02) :170-179