Automated group assignment in large phylogenetic trees using GRUNT: GRouping, ungrouping, naming tool

Cited by: 8
Authors
Dalevi, Daniel [2]
DeSantis, Todd Z. [3]
Fredslund, Jakob [4]
Andersen, Gary L. [3]
Markowitz, Victor M. [2]
Hugenholtz, Philip [1]
Affiliations
[1] DOE Joint Genome Inst, Microbial Ecol Program, Walnut Creek, CA 94598 USA
[2] Lawrence Berkeley Natl Lab, Biol Data Management & Technol Ctr, Berkeley, CA 94720 USA
[3] Lawrence Berkeley Natl Lab, Ctr Environm Biotechnol, Berkeley, CA 94720 USA
[4] Univ Aarhus, Bioinformat Res Ctr, DK-8000 Aarhus C, Denmark
DOI
10.1186/1471-2105-8-402
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Background: Accurate taxonomy is best maintained if species are arranged as hierarchical groups in phylogenetic trees. This is especially important as trees grow larger as a consequence of a rapidly expanding sequence database. Hierarchical group names are typically manually assigned in trees, an approach that becomes unfeasible for very large topologies.
Results: We have developed an automated iterative procedure for delineating stable (monophyletic) hierarchical groups to large (or small) trees and naming those groups according to a set of sequentially applied rules. In addition, we have created an associated ungrouping tool for removing existing groups that do not meet user-defined criteria (such as monophyly). The procedure is implemented in a program called GRUNT (GRouping, Ungrouping, Naming Tool) and has been applied to the current release of the Greengenes (Hugenholtz) 16S rRNA gene taxonomy comprising more than 130,000 taxa.
Conclusion: GRUNT will facilitate researchers requiring comprehensive hierarchical grouping of large tree topologies in, for example, database curation, microarray design and pangenome assignments. The application is available at the greengenes website [1].
Pages: 6
Related papers
13 in total
[1]  
[Anonymous], 2006, THESIS U TEXAS AUSTI
[2]   Urban aerosols harbor diverse and dynamic bacterial populations [J].
Brodie, Eoin L. ;
DeSantis, Todd Z. ;
Parker, Jordan P. Moberg ;
Zubietta, Ingrid X. ;
Piceno, Yvette M. ;
Andersen, Gary L. .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2007, 104 (01) :299-304
[3]   Application of a high-density oligonucleotide microarray approach to study bacterial population dynamics during uranium reduction and reoxidation [J].
Brodie, Eoin L. ;
DeSantis, Todd Z. ;
Joyner, Dominique C. ;
Baek, Seung M. ;
Larsen, Joern T. ;
Andersen, Gary L. ;
Hazen, Terry C. ;
Richardson, Paul M. ;
Herman, Donald J. ;
Tokunaga, Tetsu K. ;
Wan, Jiamin M. ;
Firestone, Mary K. .
APPLIED AND ENVIRONMENTAL MICROBIOLOGY, 2006, 72 (09) :6288-6298
[4]   Greengenes, a chimera-checked 16S rRNA gene database and workbench compatible with ARB [J].
DeSantis, T. Z. ;
Hugenholtz, P. ;
Larsen, N. ;
Rojas, M. ;
Brodie, E. L. ;
Keller, K. ;
Huber, T. ;
Dalevi, D. ;
Hu, P. ;
Andersen, G. L. .
APPLIED AND ENVIRONMENTAL MICROBIOLOGY, 2006, 72 (07) :5069-5072
[5]  
DESANTIS TZ, 2007, MICROB ECOL
[6]  
FLANAGAN JL, 2007, J CLIN MICROBIOL
[7]   Bacterial phylogeny based on comparative sequence analysis [J].
Ludwig, W ;
Strunk, O ;
Klugbauer, S ;
Klugbauer, N ;
Weizenegger, M ;
Neumaier, J ;
Bachleitner, M ;
Schleifer, KH .
ELECTROPHORESIS, 1998, 19 (04) :554-568
[8]   ARB: a software environment for sequence data [J].
Ludwig, W ;
Strunk, O ;
Westram, R ;
Richter, L ;
Meier, H ;
Yadhukumar ;
Buchner, A ;
Lai, T ;
Steppi, S ;
Jobb, G ;
Förster, W ;
Brettske, I ;
Gerber, S ;
Ginhart, AW ;
Gross, O ;
Grumann, S ;
Hermann, S ;
Jost, R ;
König, A ;
Liss, T ;
Lüssmann, R ;
May, M ;
Nonhoff, B ;
Reichel, B ;
Strehlow, R ;
Stamatakis, A ;
Stuckmann, N ;
Vilbig, A ;
Lenke, M ;
Ludwig, T ;
Bode, A ;
Schleifer, KH .
NUCLEIC ACIDS RESEARCH, 2004, 32 (04) :1363-1371
[9]  
PRUESSE E, 2007, IN PRESS NUCL ACIDS
[10]   MrBayes 3: Bayesian phylogenetic inference under mixed models [J].
Ronquist, F ;
Huelsenbeck, JP .
BIOINFORMATICS, 2003, 19 (12) :1572-1574