The Use and Validity of Composite Taxa in Phylogenetic Analysis

Cited by: 35
Authors
Campbell, Veronique [1]
Lapointe, Francois-Joseph [1]
Affiliation
[1] Univ Montreal, Dept Sci Biol, Montreal, PQ H3C 3J7, Canada
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
Composite sequences; computer simulations; DNA sequences; missing data; phylogenetic accuracy; phylogenomics; supermatrices; COMBINING DATA SETS; MISSING DATA; CLADISTIC-ANALYSIS; PLACENTAL MAMMALS; MAXIMUM-LIKELIHOOD; ANIMAL PHYLOGENY; INCOMPLETE TAXA; DNA-SEQUENCES; GENE TREES; ACCURACY;
DOI
10.1093/sysbio/syp056
Chinese Library Classification
Q [Biological Sciences];
Subject classification codes
07 ; 0710 ; 09 ;
Abstract
In phylogenetic analysis, one possible approach to minimize missing data in DNA supermatrices consists in sampling sequences from different species to obtain a complete sequence for all genes included in the study. We refer to those complete sequences as composite taxa because DNA sequences that are combined belong to different species. An alternative approach is to analyze incomplete supermatrices by coding unavailable DNA sequences as missing. The accuracy of phylogenetic trees estimated using matrices that include composite taxa has recently been questioned, and the best approach for analyzing incomplete supermatrices is highly debated. Through computer simulations, we compared the phylogenetic accuracy of the 2 competing approaches. We explored the effect of composite taxa when inferring higher level relationships, that is, relationships between monophyletic groups. DNA sequences were simulated on a 42-taxon model tree and incomplete supermatrices containing different percentages of missing data were generated. These incomplete supermatrices were analyzed either by coding the missing data with "?" or by reducing the amount of missing data through the combination of 2 or more taxa to generate composite taxa. Of 180 comparisons (18 simulation cases with 2 different inference methods and 5 levels of incompleteness), we observed significantly higher phylogenetic accuracies for composite matrices in 46 comparisons, whereas missing data matrices outperformed composites in 8 comparisons. In all other cases, the phylogenetic accuracy obtained with composite matrices was not significantly different from that of missing data matrices. This study demonstrates that composite taxa represent an interesting approach to minimize the amount of missing data in supermatrices and we suggest that it is the optimal approach to use in phylogenomic studies to reduce computing time.
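The two approaches compared in the abstract can be illustrated on a toy supermatrix. The following is a minimal sketch, not the authors' simulation pipeline: the taxon names, gene names, sequences, and group assignments are invented for illustration, and the rule used to build a composite (take the first group member that has the gene) is an assumption, since the abstract does not say how donor sequences were chosen.

```python
from typing import Dict, List, Optional

# Hypothetical toy supermatrix: taxon -> gene -> sequence (None = not sequenced).
Supermatrix = Dict[str, Dict[str, Optional[str]]]

GENES: List[str] = ["g1", "g2"]
GENE_LENGTHS: Dict[str, int] = {"g1": 4, "g2": 4}

matrix: Supermatrix = {
    "A1": {"g1": "ACGT", "g2": None},
    "A2": {"g1": None, "g2": "GGCC"},
    "B1": {"g1": "ACGA", "g2": "GGCA"},
    "B2": {"g1": "ACGG", "g2": None},
}

# Assumed monophyletic groups whose members may be merged into one composite taxon.
groups: Dict[str, List[str]] = {"A": ["A1", "A2"], "B": ["B1", "B2"]}


def missing_fraction(m: Supermatrix, genes: List[str]) -> float:
    """Fraction of taxon-by-gene cells that have no sequence."""
    cells = [m[taxon].get(gene) for taxon in m for gene in genes]
    return sum(cell is None for cell in cells) / len(cells)


def code_missing(m: Supermatrix, genes: List[str],
                 lengths: Dict[str, int]) -> Dict[str, str]:
    """Approach 1: concatenate genes, filling unavailable ones with '?'."""
    return {
        taxon: "".join(m[taxon].get(g) or "?" * lengths[g] for g in genes)
        for taxon in m
    }


def make_composite(m: Supermatrix, members: List[str],
                   genes: List[str]) -> Dict[str, Optional[str]]:
    """Approach 2: for each gene, borrow the sequence from the first
    group member that has it (an assumed selection rule)."""
    return {
        g: next((m[t][g] for t in members if m[t].get(g) is not None), None)
        for g in genes
    }


with_missing = code_missing(matrix, GENES, GENE_LENGTHS)
composite_matrix: Supermatrix = {
    name: make_composite(matrix, members, GENES)
    for name, members in groups.items()
}

print(with_missing["A1"])                         # concatenated row with '?' padding
print(missing_fraction(matrix, GENES))            # incompleteness before merging
print(missing_fraction(composite_matrix, GENES))  # incompleteness after merging
```

In this toy case the composite matrix has fewer rows (one per group) and no empty cells, which reflects the trade-off described in the abstract: less missing data and a smaller matrix, at the cost of resolving only relationships between the groups.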
Pages: 560-572
Page count: 13