Hide and vanish: Data sets where the most parsimonious tree is known but hard to find, and their implications for tree search methods

Cited by: 9
Authors
Goloboff, Pablo A. [1]
Affiliations
[1] Consejo Nacl Invest Cient & Tecn, Fdn Miguel Lillo, RA-4000 San Miguel De Tucuman, Argentina
Keywords
Parsimony; Tree searches; Homoplasy; Tree islands; Phylogenetic analysis; Sequences
DOI
10.1016/j.ympev.2014.06.008
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Three different types of data sets, for which the uniquely most parsimonious tree can be known exactly but is hard to find with heuristic tree search methods, are studied. Tree searches are complicated more by the shape of the tree landscape (i.e. the distribution of homoplasy on different trees) than by the sheer abundance of homoplasy or character conflict. Data sets of Type 1 are those constructed by Radel et al. (2013). Data sets of Type 2 present a very rugged landscape, with narrow peaks and valleys, but relatively low amounts of homoplasy. For such a tree landscape, subjecting the trees to TBR and saving suboptimal trees produces much better results when the sequence of clipping for the tree branches is randomized instead of fixed. An unexpected finding for data sets of Types 1 and 2 is that starting a search from a random tree instead of a random addition sequence Wagner tree may increase the probability that the search finds the most parsimonious tree; a small artificial example where these probabilities can be calculated exactly is presented. Data sets of Type 3, the most difficult data sets studied here, comprise only congruent characters, and a single island with only one most parsimonious tree. Even if there is a single island, missing entries create a very flat landscape which is difficult to traverse with tree search algorithms because the number of equally parsimonious trees that need to be saved and swapped to effectively move around the plateaus is too large. Minor modifications of the parameters of tree drifting, ratchet, and sectorial searches allow travelling around these plateaus much more efficiently than saving and swapping large numbers of equally parsimonious trees with TBR. For these data sets, two new related criteria for selecting taxon addition sequences in Wagner trees (the "selected" and "informative" addition sequences) produce much better results than the standard random or closest addition sequences. 
These new methods for Wagner trees and for moving around plateaus can be useful when analyzing phylogenomic data sets formed by concatenation of genes with uneven taxon representation ("sparse" supermatrices), which are likely to present a tree landscape with extensive plateaus. (C) 2014 Elsevier Inc. All rights reserved.
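The claim that missing entries flatten the tree landscape can be illustrated with a toy calculation. The following is a minimal sketch, not the implementation used in the paper: it assumes unordered (Fitch) parsimony, treats a missing entry `?` as compatible with every observed state, and uses a hypothetical function name (`fitch_length`) and hypothetical one-character matrices chosen only for illustration.

```python
MISSING = "?"


def fitch_length(tree, matrix):
    """Fitch parsimony length of a rooted binary tree.

    tree   : nested 2-tuples of taxon names, e.g. (("A", "B"), ("C", "D"))
    matrix : dict mapping taxon name -> string of states, "?" = missing
    """
    n_chars = len(next(iter(matrix.values())))
    # States observed per character; a missing entry may take any of them.
    observed = [
        {row[i] for row in matrix.values() if row[i] != MISSING} or {MISSING}
        for i in range(n_chars)
    ]
    steps = 0

    def state_sets(node):
        nonlocal steps
        if isinstance(node, str):  # leaf: one state, or all states if missing
            row = matrix[node]
            return [
                set(observed[i]) if row[i] == MISSING else {row[i]}
                for i in range(n_chars)
            ]
        left, right = (state_sets(child) for child in node)
        merged = []
        for a, b in zip(left, right):
            common = a & b
            if common:
                merged.append(common)
            else:  # empty intersection: take the union and count one step
                merged.append(a | b)
                steps += 1
        return merged

    state_sets(tree)
    return steps


# The three possible groupings of four taxa (hypothetical toy example).
trees = [(("A", "B"), ("C", "D")), (("A", "C"), ("B", "D")), (("A", "D"), ("B", "C"))]
complete = {"A": "1", "B": "1", "C": "0", "D": "0"}
sparse = {"A": "1", "B": "?", "C": "0", "D": "0"}

print([fitch_length(t, complete) for t in trees])  # [1, 2, 2]
print([fitch_length(t, sparse) for t in trees])    # [1, 1, 1]
```

With the complete character, one tree is strictly shorter than the other two. With a single missing entry, all three trees tie, so a search that moves only to strictly better trees gets no guidance from this character. This is a small-scale analogue of the plateaus described above.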
Pages: 118-131
Number of pages: 14