Impacts of Terraces on Phylogenetic Inference

Cited by: 34
Authors
Sanderson, Michael J. [1]
McMahon, Michelle M. [2]
Stamatakis, Alexandros [1, 3, 4]
Zwickl, Derrick J. [1]
Steel, Mike [5]
Affiliations
[1] Univ Arizona, Dept Ecol & Evolutionary Biol, Tucson, AZ 85721 USA
[2] Univ Arizona, Sch Plant Sci, Tucson, AZ 85721 USA
[3] Heidelberg Inst Theoret Studies, Sci Comp Grp, D-69118 Heidelberg, Germany
[4] Karlsruhe Inst Technol, Inst Theoret Informat, D-76131 Karlsruhe, Germany
[5] Univ Canterbury, Biomath Res Ctr, Christchurch 1, New Zealand
Funding
US National Science Foundation
Keywords
Bootstrap; partitioned model; phylogenetics; posterior probability; terrace; MISSING DATA; BOOTSTRAP SUPPORT; TREE; LIKELIHOOD; EVOLUTION; COMPLEXITY; CHOICE; RECONSTRUCTION; SUPERMATRICES; PHYLOGENOMICS;
DOI
10.1093/sysbio/syv024
Chinese Library Classification
Q [Biological Sciences];
Discipline classification codes
07 ; 0710 ; 09 ;
Abstract
Terraces are sets of trees with precisely the same likelihood or parsimony score, which can be induced by missing sequences in partitioned multi-locus phylogenetic data matrices. The potentially large set of trees on a terrace can be characterized by enumeration algorithms or consensus methods that exploit the pattern of partial taxon coverage in the data, independent of the sequence data themselves. Terraces can add ambiguity and complexity to phylogenetic inference, particularly in settings where inference is already challenging: data sets with many taxa and relatively few loci. In this article we present five new findings about terraces and their impacts on phylogenetic inference. First, we clarify assumptions about partitioning scheme model parameters that are necessary for the existence of terraces. Second, we explore the dependence of terrace size on partitioning scheme and indicate how to find the partitioning scheme associated with the largest terrace containing a given tree. Third, we highlight the impact of terrace size on bootstrap estimates of confidence limits in clades, and characterize the surprising result that the bootstrap proportion for a clade, as it is usually calculated, can be entirely determined by the frequency of bipartitions on a terrace, with some bipartitions receiving high support even when incorrect. Fourth, we dissect some effects of prior distributions of edge lengths on the computed posterior probabilities of clades on terraces, to understand an example in which long edges "attract" each other in Bayesian inference. Fifth, we describe how assuming relationships between edge lengths of different loci, as an attempt to avoid terraces, can also be problematic when taxon coverage is partial, specifically when heterotachy is present. Finally, we discuss strategies for remediation of some of these problems. One promising approach finds a minimal set of taxa which, when deleted from the data matrix, reduces the size of a terrace to a single tree.
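The abstract's point that a terrace depends only on the pattern of taxon coverage, not on the sequences, can be illustrated with a small sketch. This is not the authors' enumeration algorithm. It assumes rooted trees written as nested tuples (a simplification chosen here for brevity), and the names `restrict`, `same_terrace`, and `coverage` are hypothetical. Under the partitioned-model assumptions the abstract refers to, two trees lie on the same terrace when they induce the same subtree on the taxon set sampled for every locus.

```python
def restrict(tree, taxa):
    """Return the subtree induced on `taxa`, in an order-free canonical form.

    A leaf is a string; an internal node is a tuple of children.
    Returns None when no leaf of `tree` is in `taxa`.
    """
    if isinstance(tree, str):
        return tree if tree in taxa else None
    kids = [k for k in (restrict(child, taxa) for child in tree) if k is not None]
    if not kids:
        return None
    if len(kids) == 1:
        return kids[0]          # suppress a node left with a single child
    return frozenset(kids)      # child order is irrelevant


def same_terrace(tree_a, tree_b, coverage):
    """True if both trees induce identical subtrees for every locus.

    `coverage` is a list of taxon sets, one per locus (illustrative input).
    """
    return all(restrict(tree_a, taxa) == restrict(tree_b, taxa) for taxa in coverage)


# Illustrative coverage: D and E are never sampled for the same locus.
coverage = [{"A", "B", "C", "D"}, {"A", "B", "C", "E"}]

t1 = (("A", "B"), ("C", ("D", "E")))
t2 = (("A", "B"), (("C", "D"), "E"))
t3 = (("A", "C"), ("B", ("D", "E")))

print(same_terrace(t1, t2, coverage))
print(same_terrace(t1, t3, coverage))
```

In this toy coverage pattern, `t1` and `t2` differ only in how D and E are placed relative to each other, which no single locus can resolve, whereas `t3` changes relationships that locus 1 does cover.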
Pages: 709-726
Page count: 18