Terrace Aware Data Structure for Phylogenomic Inference from Supermatrices

Cited by: 1547
Authors
Chernomor, Olga [1]
von Haeseler, Arndt [1, 2]
Bui Quang Minh [1]
Affiliations
[1] Med Univ Vienna, Univ Vienna, Ctr Integrat Bioinformat Vienna, Max F Perutz Labs, A-1030 Vienna, Austria
[2] Univ Vienna, Bioinformat & Computat Biol, Fac Comp Sci, A-1090 Vienna, Austria
Funding
Austrian Science Fund;
Keywords
Maximum likelihood; partial terraces; partition models; phylogenetic terraces; phylogenomic inference; MAXIMUM-LIKELIHOOD PHYLOGENIES; CARNIVORA MAMMALIA; DATA SETS; TREES; LIFE; RECONSTRUCTION; PERFORMANCE; HETEROTACHY; ALIGNMENTS; ALGORITHMS;
DOI
10.1093/sysbio/syw037
Chinese Library Classification number
Q [Biological Sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
In phylogenomics the analysis of concatenated gene alignments, the so-called supermatrix, is commonly accompanied by the assumption of partition models. Under such models each gene, or more generally partition, is allowed to evolve under its own evolutionary model. Although partition models provide a more comprehensive analysis of supermatrices, missing data may hamper the tree search algorithms due to the existence of phylogenetic (partial) terraces. Here, we introduce the phylogenetic terrace aware (PTA) data structure for the efficient analysis under partition models. In the presence of missing data PTA exploits (partial) terraces and induced partition trees to save computation time. We show that an implementation of PTA in IQ-TREE leads to a substantial speedup of up to 4.5 and 8 times compared with the standard IQ-TREE and RAxML implementations, respectively. PTA is generally applicable to all types of partition models and common topological rearrangements thus can be employed by all phylogenomic inference software.
Pages: 997-1008
Page count: 12