Inferring species phylogenies from multiple genes: Concatenated sequence tree versus consensus gene tree

Cited by: 327
Authors
Gadagkar, SR [1]
Rosenberg, MS
Kumar, S
Affiliations
[1] Arizona State Univ, Biodesign Inst, Sch Life Sci, Tempe, AZ 85287 USA
[2] Arizona State Univ, Biodesign Inst, Ctr Evolutionary Funct Genom, Tempe, AZ 85287 USA
[3] Univ Dayton, Dept Biol, Dayton, OH USA
Keywords
DOI
10.1002/jez.b.21026
Chinese Library Classification (CLC) number
Q [Biological Sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
Phylogenetic trees from multiple genes can be obtained in two fundamentally different ways. In one, gene sequences are concatenated into a super-gene alignment, which is then analyzed to generate the species tree. In the other, phylogenies are inferred separately from each gene, and a consensus of these gene phylogenies is used to represent the species tree. Here, we have compared these two approaches by means of computer simulation, using 448 parameter sets, including evolutionary rate, sequence length, base composition, and transition/transversion rate bias. In these simulations, we emphasized a worst-case scenario analysis in which 100 replicate datasets for each evolutionary parameter set (gene) were generated, and the replicate dataset that produced a tree topology showing the largest number of phylogenetic errors was selected to represent that parameter set. Both randomly selected and worst-case replicates were utilized to compare the consensus and concatenation approaches primarily using the neighbor-joining (NJ) method. We find that the concatenation approach yields more accurate trees, even when the sequences concatenated have evolved with very different substitution patterns and no attempts are made to accommodate these differences while inferring phylogenies. These results appear to hold true for parsimony and likelihood methods as well. The concatenation approach shows >95% accuracy with only 10 genes. However, this gain in accuracy is sometimes accompanied by reinforcement of certain systematic biases, resulting in spuriously high bootstrap support for incorrect partitions, whether we employ site, gene, or a combined bootstrap resampling approach. Therefore, it will be prudent to report the number of individual genes supporting an inferred clade in the concatenated sequence tree, in addition to the bootstrap support. © 2005 Wiley-Liss, Inc.
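The two approaches contrasted in the abstract can be illustrated with a minimal Python sketch. It is not the authors' simulation code: the function names (`concatenate`, `clade_support`, `majority_rule`), the toy sequences, and the representation of each gene tree as a set of clades (each clade a `frozenset` of species names) are all assumptions made for illustration. Tree inference itself (for example by neighbor-joining) is left out; the sketch only shows how a super-gene alignment is assembled and how per-gene clade counts, which the abstract recommends reporting, could be tallied.

```python
from collections import Counter


def concatenate(genes):
    """Join per-gene alignments into one super-gene alignment.

    `genes` is a list of dicts mapping species name -> aligned sequence
    (a hypothetical input format chosen for this sketch).
    """
    species = sorted(genes[0])
    for gene in genes:
        if sorted(gene) != species:
            raise ValueError("every gene must cover the same species")
    return {sp: "".join(gene[sp] for gene in genes) for sp in species}


def clade_support(gene_trees):
    """Count how many gene trees contain each clade.

    Each gene tree is assumed to be given as a set of clades,
    where a clade is a frozenset of species names.
    """
    return Counter(clade for tree in gene_trees for clade in set(tree))


def majority_rule(gene_trees, threshold=0.5):
    """Keep clades present in more than `threshold` of the gene trees."""
    n = len(gene_trees)
    support = clade_support(gene_trees)
    return {clade: k for clade, k in support.items() if k / n > threshold}


# Toy data: two genes for four species, and three already-inferred gene trees.
genes = [
    {"A": "ACGT", "B": "ACGA", "C": "TCGA", "D": "TCGT"},
    {"A": "GG", "B": "GG", "C": "GC", "D": "CC"},
]
supergene = concatenate(genes)

gene_trees = [
    {frozenset({"A", "B"})},
    {frozenset({"A", "B"})},
    {frozenset({"A", "C"})},
]
consensus = majority_rule(gene_trees)
```

In this sketch `supergene` would be passed to a single tree-building run (the concatenation approach), whereas `consensus` summarizes trees built separately for each gene (the consensus approach). The counts returned by `clade_support` correspond to the "number of individual genes supporting an inferred clade" mentioned at the end of the abstract.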
Pages: 64-74
Page count: 11
Related papers
41 in total
[1]   A new phylogenetic marker, apolipoprotein B, provides compelling evidence for eutherian relationships [J].
Amrine-Madsen, H ;
Koepfli, KP ;
Wayne, RK ;
Springer, MS .
MOLECULAR PHYLOGENETICS AND EVOLUTION, 2003, 28 (02) :225-240
[2]   A search for the origins of animals and fungi: Comparing and combining molecular data [J].
Baldauf, SL .
AMERICAN NATURALIST, 1999, 154 :S178-S188
[3]   AGAINST CONSENSUS [J].
BARRETT, M ;
DONOGHUE, MJ ;
SOBER, E .
SYSTEMATIC ZOOLOGY, 1991, 40 (04) :486-493
[4]   PARTITIONING AND COMBINING DATA IN PHYLOGENETIC ANALYSIS [J].
BULL, JJ ;
HUELSENBECK, JP ;
CUNNINGHAM, CW ;
SWOFFORD, DL ;
WADDELL, PJ .
SYSTEMATIC BIOLOGY, 1993, 42 (03) :384-397
[5]   Molecular systematics of armadillos (Xenarthra, Dasypodidae): contribution of maximum likelihood and Bayesian analyses of mitochondrial and nuclear genes [J].
Delsuc, F ;
Stanhope, MJ ;
Douzery, EJP .
MOLECULAR PHYLOGENETICS AND EVOLUTION, 2003, 28 (02) :261-275
[6]   SEPARATE VERSUS COMBINED ANALYSIS OF PHYLOGENETIC EVIDENCE [J].
DEQUEIROZ, A ;
DONOGHUE, MJ ;
KIM, J .
ANNUAL REVIEW OF ECOLOGY AND SYSTEMATICS, 1995, 26 :657-681
[7]  
DEQUEIROZ A, 1993, SYST BIOL, V42, P368
[8]   INTEGRATION OF MORPHOLOGICAL AND RIBOSOMAL-RNA DATA ON THE ORIGIN OF ANGIOSPERMS [J].
DOYLE, JA ;
DONOGHUE, MJ ;
ZIMMER, EA .
ANNALS OF THE MISSOURI BOTANICAL GARDEN, 1994, 81 (03) :419-450
[9]   GENE TREES AND SPECIES TREES - MOLECULAR SYSTEMATICS AS ONE-CHARACTER TAXONOMY [J].
DOYLE, JJ .
SYSTEMATIC BOTANY, 1992, 17 (01) :144-163
[10]   HOVERGEN - A DATABASE OF HOMOLOGOUS VERTEBRATE GENES [J].
DURET, L ;
MOUCHIROUD, D ;
GOUY, M .
NUCLEIC ACIDS RESEARCH, 1994, 22 (12) :2360-2365