Prospects for building the tree of life from large sequence databases

Cited by: 198
Authors
Driskell, AC
Ané, C
Burleigh, JG
McMahon, MM
O'Meara, BC
Sanderson, MJ
Affiliations
[1] Univ Calif Davis, Sect Evolut & Ecol, Davis, CA 95616 USA
[2] Univ Calif Davis, Ctr Populat Biol, Davis, CA 95616 USA
DOI
10.1126/science.1102036
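Chinese Library Classification (CLC)
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Discipline classification codes
07; 0710; 09
Abstract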
We assess the phylogenetic potential of ~300,000 protein sequences sampled from Swiss-Prot and GenBank. Although only a small subset of these data was potentially phylogenetically informative, this subset retained a substantial fraction of the original taxonomic diversity. Sampling biases in the databases necessitate building phylogenetic data sets that have large numbers of missing entries. However, an analysis of two "supermatrices" suggests that even data sets with as much as 92% missing data can provide insights into broad sections of the tree of life.
Pages: 1172-1174
Number of pages: 3