A Rapid Bootstrap Algorithm for the RAxML Web Servers

Cited by: 6600
Authors
Stamatakis, Alexandros [1]
Hoover, Paul [2]
Rougemont, Jacques [3]
Affiliations
[1] Univ Munich, Dept Comp Sci, Teaching & Res Unit Bioinformat, Exelixis Lab, D-80333 Munich, Germany
[2] San Diego Supercomp Ctr, La Jolla, CA USA
[3] Ecole Polytech Fed Lausanne, Sch Life Sci, Lausanne, Switzerland
Keywords
Maximum likelihood; phylogenetic inference; rapid bootstrap; RAxML; support values
DOI
10.1080/10635150802429642
Chinese Library Classification number
Q [Biological Sciences]
Discipline classification code
07; 0710; 09
Abstract
Despite recent advances achieved by application of high-performance computing methods and novel algorithmic techniques to maximum likelihood (ML)-based inference programs, the major computational bottleneck still consists in the computation of bootstrap support values. Conducting a probably insufficient number of 100 bootstrap (BS) analyses with current ML programs on large datasets, either with respect to the number of taxa or base pairs, can easily require a month of run time. Therefore, we have developed, implemented, and thoroughly tested rapid bootstrap heuristics in RAxML (Randomized Axelerated Maximum Likelihood) that are more than an order of magnitude faster than current algorithms. These new heuristics can contribute to resolving the computational bottleneck and improve current methodology in phylogenetic analyses. Computational experiments to assess the performance and relative accuracy of these heuristics were conducted on 22 diverse DNA and AA (amino acid), single gene as well as multigene, real-world alignments containing 125 up to 7764 sequences. The standard BS (SBS) and rapid BS (RBS) values drawn on the best-scoring ML tree are highly correlated and show almost identical average support values. The weighted RF (Robinson-Foulds) distance between SBS- and RBS-based consensus trees was smaller than 6% in all cases (average 4%). More importantly, RBS inferences are between 8 and 20 times faster (average 14.73) than SBS analyses with RAxML and between 18 and 495 times faster than BS analyses with competing programs, such as PHYML or GARLI. Moreover, this performance improvement increases with alignment size. Finally, we have set up two freely accessible Web servers for this significantly improved version of RAxML that provide access to the 200-CPU cluster of the Vital-IT unit at the Swiss Institute of Bioinformatics and the 128-CPU cluster of the CIPRES project at the San Diego Supercomputer Center. These Web servers offer the possibility to conduct large-scale phylogenetic inferences to a large part of the community that does not have access to, or the expertise to use, high-performance computing resources.
Pages: 758-771
Page count: 14