NAST: a multiple sequence alignment server for comparative analysis of 16S rRNA genes

Cited by: 848
Authors
DeSantis, T. Z. [1]
Hugenholtz, P.
Keller, K.
Brodie, E. L.
Larsen, N.
Piceno, Y. M.
Phan, R.
Andersen, G. L.
Affiliations
[1] Univ Calif Berkeley, Lawrence Berkeley Lab, Ctr Environm Biotechnol, Berkeley, CA 94720 USA
[2] DOE Joint Genome Inst, Microbial Ecol Program, Walnut Creek, CA USA
[3] Danish Genome Inst, Aarhus, Denmark
[4] Lawrence Berkeley Natl Lab, Virtual Inst Microbial Stress & Survival, Berkeley, CA USA
[5] Univ Calif Berkeley, Quantitat Biomed Res, Berkeley, CA 94720 USA
DOI
10.1093/nar/gkl244
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Microbiologists conducting surveys of bacterial and archaeal diversity often require comparative alignments of thousands of 16S rRNA genes collected from a sample. The computational resources and bioinformatics expertise required to construct such an alignment have inhibited high-throughput analysis. It was hypothesized that an online tool could be developed to efficiently align thousands of 16S rRNA genes via the NAST (Nearest Alignment Space Termination) algorithm for creating multiple sequence alignments (MSA). The tool was implemented with a web interface at http://greengenes.lbl.gov/NAST. Each user-submitted sequence is compared with Greengenes' 'Core Set', comprising ~10 000 aligned non-chimeric sequences representative of the currently recognized diversity among bacteria and archaea. User sequences are oriented and paired with their closest match in the Core Set to serve as a template for inserting gap characters. Non-16S data (sequence from vector or surrounding genomic regions) are conveniently removed in the returned alignment. From the resulting MSA, distance matrices can be calculated for diversity estimates and organisms can be classified by taxonomy. The ability to align and categorize large sequence sets using a simple interface has enabled researchers with various experience levels to obtain bacterial and archaeal community profiles.
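To illustrate the step in which distance matrices are calculated from the resulting MSA, here is a minimal sketch in Python. It is not the NAST server's code, and the abstract does not state which distance measure is used; the uncorrected p-distance below, the choice to skip any column containing a gap, and all names (`p_distance`, `distance_matrix`, `GAP_CHARS`, the toy sequences) are assumptions made for illustration only.

```python
from itertools import combinations

# Assumption: '-' and '.' both denote gaps in the aligned sequences.
GAP_CHARS = {"-", "."}


def p_distance(a, b):
    """Uncorrected p-distance between two aligned sequences.

    Only columns where both sequences carry a base are compared;
    the result is mismatches divided by compared columns.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    mismatches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in GAP_CHARS or y in GAP_CHARS:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if x != y:
            mismatches += 1
    if compared == 0:
        return float("nan")  # no shared columns, distance undefined
    return mismatches / compared


def distance_matrix(alignment):
    """Build a symmetric distance matrix from a {name: aligned_seq} dict."""
    names = list(alignment)
    matrix = {n: {m: 0.0 for m in names} for n in names}
    for n, m in combinations(names, 2):
        d = p_distance(alignment[n], alignment[m])
        matrix[n][m] = d
        matrix[m][n] = d
    return names, matrix


# Hypothetical toy alignment; real 16S alignments are far longer.
example = {
    "seq_a": "ACGT-ACGTA",
    "seq_b": "ACGTTACGTA",
    "seq_c": "ACGA-ACCTA",
}

if __name__ == "__main__":
    names, matrix = distance_matrix(example)
    for n in names:
        print(n, [round(matrix[n][m], 3) for m in names])
```

Because every sequence in an MSA shares the same column coordinates, each pairwise comparison is a single pass over the columns with no further alignment needed. For real diversity estimates a corrected distance model would normally be preferred over the raw mismatch fraction.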
Pages: W394-W399
Number of pages: 6