Accurate taxonomy assignments from 16S rRNA sequences produced by highly parallel pyrosequencers

Cited by: 397
Authors
Liu, Zongzhi [1 ]
DeSantis, Todd Z. [2 ]
Andersen, Gary L. [2 ]
Knight, Rob [1 ]
Affiliations
[1] Univ Colorado, Dept Chem & Biochem, Boulder, CO 80309 USA
[2] Lawrence Berkeley Natl Lab, Ctr Environm Biotechnol, Berkeley, CA 94720 USA
DOI
10.1093/nar/gkn491
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
The recent introduction of massively parallel pyrosequencers allows rapid, inexpensive analysis of microbial community composition using 16S ribosomal RNA (rRNA) sequences. However, a major challenge is to design a workflow so that taxonomic information can be accurately and rapidly assigned to each read, so that the composition of each community can be linked back to likely ecological roles played by members of each species, genus, family or phylum. Here, we use three large 16S rRNA datasets to test whether taxonomic information based on the full-length sequences can be recaptured by short reads that simulate the pyrosequencer outputs. We find that different taxonomic assignment methods vary radically in their ability to recapture the taxonomic information in full-length 16S rRNA sequences: most methods are sensitive to the region of the 16S rRNA gene that is targeted for sequencing, but many combinations of methods and rRNA regions produce consistent and accurate results. To process large datasets of partial 16S rRNA sequences obtained from surveys of various microbial communities, including those from human body habitats, we recommend the use of Greengenes or RDP classifier with fragments of at least 250 bases, starting from one of the primers R357, R534, R798, F343 or F517.
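The read simulation described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names are hypothetical, the number in a primer name is treated as a plain index into the full-length sequence (in practice such positions would need to be mapped through an alignment to a reference numbering), and the primer itself is assumed not to be part of the read. A forward primer (F) yields a fragment extending downstream; a reverse primer (R) yields the reverse complement of the fragment upstream of it.

```python
# Illustrative sketch only: cut a simulated short read out of a
# full-length 16S rRNA sequence, starting from a named primer.

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def simulate_read(full_seq: str, primer: str, length: int = 250) -> str:
    """Extract a fragment of up to `length` bases starting from `primer`.

    `primer` is a name such as "F517" or "R357". ASSUMPTION: the numeric
    part is used directly as an index into `full_seq`.
    """
    direction, position = primer[0].upper(), int(primer[1:])
    if direction == "F":
        # Forward primer: read downstream of the primer position.
        return full_seq[position:position + length]
    if direction == "R":
        # Reverse primer: read upstream, reported on the opposite strand.
        start = max(0, position - length)
        return reverse_complement(full_seq[start:position])
    raise ValueError(f"primer name must start with F or R: {primer!r}")


if __name__ == "__main__":
    toy_sequence = "ACGT" * 400  # stand-in for a ~1600-base 16S sequence
    for name in ("R357", "R534", "R798", "F343", "F517"):
        print(name, len(simulate_read(toy_sequence, name)))
```

Each simulated fragment could then be passed to a classifier and its assignment compared with the one obtained from the full-length sequence, which is the comparison the abstract describes.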
Pages: 11