An improved Greengenes taxonomy with explicit ranks for ecological and evolutionary analyses of bacteria and archaea

Cited by: 3912
Authors
McDonald, Daniel [3]
Price, Morgan N. [4]
Goodrich, Julia [3]
Nawrocki, Eric P. [5]
DeSantis, Todd Z. [7]
Probst, Alexander [6]
Andersen, Gary L. [6]
Knight, Rob [3, 8]
Hugenholtz, Philip [1, 2]
Affiliations
[1] Univ Queensland, Australian Ctr Ecogen, Sch Chem & Mol Biosci, St Lucia, Qld 4072, Australia
[2] Univ Queensland, Inst Mol Biosci, St Lucia, Qld 4072, Australia
[3] Univ Colorado, Dept Chem & Biochem, Boulder, CO 80309 USA
[4] Univ Calif Berkeley, Lawrence Berkeley Natl Lab, Phys Biosci Div, Berkeley, CA 94720 USA
[5] Howard Hughes Med Inst, Ashburn, VA USA
[6] Univ Calif Berkeley, Lawrence Berkeley Natl Lab, Ctr Environm Biotechnol, Berkeley, CA 94720 USA
[7] Second Genome Inc, Dept Bioinformat, San Bruno, CA USA
[8] Univ Colorado, Howard Hughes Med Inst, Boulder, CO 80309 USA
Funding
National Institutes of Health (USA);
Keywords
evolution; phylogenetics; taxonomy; RIBOSOMAL-RNA GENE; HUMAN MICROBIOME PROJECT; SEQUENCE DATA; ALIGNMENTS; DIVERSITY; DATABASE; TOOL; ARB; ASSIGNMENT; INFERENCE;
DOI
10.1038/ismej.2011.139
Chinese Library Classification number
Q14 [Ecology (bioecology)];
Discipline classification code
071012; 0713;
Abstract
Reference phylogenies are crucial for providing a taxonomic framework for interpretation of marker gene and metagenomic surveys, which continue to reveal novel species at a remarkable rate. Greengenes is a dedicated full-length 16S rRNA gene database that provides users with a curated taxonomy based on de novo tree inference. We developed a 'taxonomy to tree' approach for transferring group names from an existing taxonomy to a tree topology, and used it to apply the Greengenes, National Center for Biotechnology Information (NCBI) and cyanoDB (Cyanobacteria only) taxonomies to a de novo tree comprising 408 315 sequences. We also incorporated explicit rank information provided by the NCBI taxonomy to group names (by prefixing rank designations) for better user orientation and classification consistency. The resulting merged taxonomy improved the classification of 75% of the sequences by one or more ranks relative to the original NCBI taxonomy with the most pronounced improvements occurring in under-classified environmental sequences. We also assessed candidate phyla (divisions) currently defined by NCBI and present recommendations for consolidation of 34 redundantly named groups. All intermediate results from the pipeline, which includes tree inference, jackknifing and transfer of a donor taxonomy to a recipient tree (tax2tree) are available for download. The improved Greengenes taxonomy should provide important infrastructure for a wide range of megasequencing projects studying ecosystems on scales ranging from our own bodies (the Human Microbiome Project) to the entire planet (the Earth Microbiome Project). The implementation of the software can be obtained from http://sourceforge.net/projects/tax2tree/. The ISME Journal (2012) 6, 610-618; doi: 10.1038/ismej.2011.139; published online 1 December 2011
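The abstract mentions attaching explicit rank information to group names by prefixing rank designations. The sketch below illustrates that idea only; it is not the tax2tree implementation. The function name `prefix_ranks` is hypothetical, and the seven prefixes (`k__` through `s__`) are assumed here to follow the convention seen in Greengenes-style lineage strings.

```python
# Illustrative sketch only: not code from tax2tree.
# Assumed rank prefixes, ordered from kingdom (domain) down to species.
RANK_PREFIXES = ["k__", "p__", "c__", "o__", "f__", "g__", "s__"]


def prefix_ranks(names):
    """Build a lineage string with an explicit rank prefix on every level.

    `names` lists group names from the highest rank downward. Ranks with
    no name are kept as bare prefixes so that every rank stays explicit.
    """
    if len(names) > len(RANK_PREFIXES):
        raise ValueError("more names than known ranks")
    # Pad unclassified lower ranks with empty names.
    padded = list(names) + [""] * (len(RANK_PREFIXES) - len(names))
    return "; ".join(p + n for p, n in zip(RANK_PREFIXES, padded))


# A sequence classified only to phylum level.
lineage = prefix_ranks(["Bacteria", "Cyanobacteria"])
print(lineage)
```

Keeping empty prefixes for unclassified ranks means each lineage has the same number of fields, which makes it easy to see how many ranks a sequence has been classified to.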
Pages: 610-618
Page count: 9