Hidden likelihood support in genomic data: Can forty-five wrongs make a right?

Cited by: 124
Authors
Gatesy, J [1]
Baker, RH
Affiliations
[1] Univ Calif Riverside, Dept Biol, Riverside, CA 92521 USA
[2] US DOE, Joint Genome Inst, Evolut Genom Dept, Walnut Creek, CA 94598 USA
DOI
10.1080/10635150590945368
Chinese Library Classification (CLC) number
Q [Biological sciences]
Subject classification codes
07; 0710; 09
Abstract
Combined analysis of multiple phylogenetic data sets can reveal emergent character support that is not evident in separate analyses of individual data sets. Previous parsimony analyses have shown that this hidden support often accounts for a large percentage of the overall phylogenetic signal in cladistic studies. Here, reanalysis of a large comparative genomic data set for yeast (genus Saccharomyces) demonstrates that hidden support can be an important factor in maximum likelihood analyses of multiple data sets as well. Emergent signal in a concatenation of 106 genes was responsible for up to 64% of the likelihood support at a particular node (the difference in log likelihood scores between optimal topologies that included and excluded a supported clade). A grouping of four yeast species (S. cerevisiae, S. paradoxus, S. mikatae, and S. kudriavzevii) was robustly supported by combined analysis of all 106 genes, but separate analyses of individual genes suggested numerous conflicts. Forty-eight genes strictly contradicted S. cerevisiae + S. paradoxus + S. mikatae + S. kudriavzevii in separate analyses, but combined likelihood analyses that included up to 45 of the "wrong" data sets supported this group. Extensive hidden support also emerged in a combined likelihood analysis of 41 genes that each recovered the exact same topology in separate analyses of the individual genes. These results show that isolated analyses of individual data sets can mask congruence and distort interpretations of clade stability, even in strictly model-based phylogenetic methods. Consensus and supertree procedures that ignore hidden phylogenetic signals are, at best, incomplete.
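The arithmetic behind the abstract's "likelihood support" and "hidden support" can be sketched as follows. This is a minimal illustration, not the authors' implementation. The abstract defines likelihood support at a node as the difference in log likelihood between the optimal topologies that include and exclude the clade. The sketch additionally assumes that hidden support is the combined-analysis support minus the sum of the supports from the separate analyses. The function names and all log-likelihood values are hypothetical and are not taken from the paper.

```python
from typing import Sequence, Tuple

# (lnL of best tree WITH the clade, lnL of best tree WITHOUT the clade)
LnLPair = Tuple[float, float]


def likelihood_support(lnl_with: float, lnl_without: float) -> float:
    """Support for a clade: lnL(best tree with it) minus lnL(best tree without it).

    Positive values favor the clade; negative values contradict it.
    """
    return lnl_with - lnl_without


def hidden_likelihood_support(combined: LnLPair,
                              per_gene: Sequence[LnLPair]) -> float:
    """Assumed definition: combined support minus summed per-gene support."""
    combined_ls = likelihood_support(*combined)
    summed_ls = sum(likelihood_support(w, wo) for w, wo in per_gene)
    return combined_ls - summed_ls


# Hypothetical scores for three genes analyzed separately.
# The second gene contradicts the clade (negative support).
per_gene = [(-1000.0, -1002.0), (-2000.0, -1999.0), (-1500.0, -1501.0)]

# Hypothetical scores for the concatenated (combined) analysis.
combined = (-4505.0, -4515.0)

combined_ls = likelihood_support(*combined)
hidden = hidden_likelihood_support(combined, per_gene)
print(f"combined support: {combined_ls}")
print(f"hidden support:   {hidden}")
print(f"hidden fraction:  {hidden / combined_ls:.0%}")
```

With these made-up numbers the separate analyses sum to 2.0 log-likelihood units of support, the combined analysis yields 10.0, and the remaining 8.0 units (80%) appear only when the genes are analyzed together. This is the kind of quantity the abstract reports as "up to 64%" for the yeast data.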
Pages: 483-492
Page count: 10