Phylogenomics of eukaryotes: Impact of missing data on large alignments

Cited by: 327
Authors
Philippe, H [1]
Snell, EA
Bapteste, E
Lopez, P
Holland, PWH
Casane, D
Affiliations
[1] Univ Reading, Sch Anim & Microbial Sci, Reading RG6 2AJ, Berks, England
[2] Univ Paris 06, Paris, France
[3] Univ Oxford, Dept Zool, Oxford OX1 3PS, England
Keywords
molecular phylogeny; multi-gene analysis; missing data; choanoflagellata
DOI
10.1093/molbev/msh182
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Resolving the relationships between Metazoa and other eukaryotic groups as well as between metazoan phyla is central to the understanding of the origin and evolution of animals. The current view is based on limited data sets, either a single gene with many species (e.g., ribosomal RNA) or many genes but with only a few species. Because a reliable phylogenetic inference simultaneously requires numerous genes and numerous species, we assembled a very large data set containing 129 orthologous proteins (~30,000 aligned amino acid positions) for 36 eukaryotic species. Included in the alignments are data from the choanoflagellate Monosiga ovata, obtained through the sequencing of about 1,000 cDNAs. We provide conclusive support for choanoflagellates as the closest relative of animals and for fungi as the second closest. The monophyly of Plantae and chromalveolates was recovered but without strong statistical support. Within animals, in contrast to the monophyly of Coelomata observed in several recent large-scale analyses, we recovered a paraphyletic Coelomata, with nematodes and platyhelminths nested within. To include a diverse sample of organisms, data from EST projects were used for several species, resulting in a large amount of missing data in our alignment (about 25%). By using different approaches, we verify that the inferred phylogeny is not sensitive to these missing data. Therefore, this large data set provides a reliable phylogenetic framework for studying eukaryotic and animal evolution and will be easily extendable when large amounts of sequence information become available from a broader taxonomic range.
Pages: 1740-1752
Page count: 13