Assessing and Improving Methods Used in Operational Taxonomic Unit-Based Approaches for 16S rRNA Gene Sequence Analysis

Cited by: 547
Authors
Schloss, Patrick D. [1]
Westcott, Sarah L. [1]
Affiliations
[1] Univ Michigan, Dept Microbiol & Immunol, Ann Arbor, MI 48109 USA
Funding
US National Science Foundation;
Keywords
ESTIMATING SPECIES RICHNESS; RARE BIOSPHERE; PROGRAM; DEFINITION; DIVERSITY; WRINKLES; ARB;
DOI
10.1128/AEM.02810-10
Chinese Library Classification
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification codes
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
In spite of technical advances that have provided increases in orders of magnitude in sequencing coverage, microbial ecologists still grapple with how to interpret the genetic diversity represented by the 16S rRNA gene. Two widely used approaches put sequences into bins based on either their similarity to reference sequences (i.e., phylotyping) or their similarity to other sequences in the community (i.e., operational taxonomic units [OTUs]). In the present study, we investigate three issues related to the interpretation and implementation of OTU-based methods. First, we confirm the conventional wisdom that it is impossible to create an accurate distance-based threshold for defining taxonomic levels and instead advocate for a consensus-based method of classifying OTUs. Second, using a taxonomic-independent approach, we show that the average neighbor clustering algorithm produces more robust OTUs than other hierarchical and heuristic clustering algorithms. Third, we demonstrate several steps to reduce the computational burden of forming OTUs without sacrificing the robustness of the OTU assignment. Finally, by blending these solutions, we propose a new heuristic that has a minimal effect on the robustness of OTUs and significantly reduces the necessary time and memory requirements. The ability to quickly and accurately assign sequences to OTUs and then obtain taxonomic information for those OTUs will greatly improve OTU-based analyses and overcome many of the challenges encountered with phylotype-based methods.
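The consensus-based classification of OTUs mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `consensus_taxonomy`, the `min_fraction` threshold of 0.51, and the example lineages are all assumptions made for illustration. The idea shown is that an OTU receives, rank by rank, the taxon shared by a sufficient majority of its member sequences, and the assignment stops at the first rank where no such majority exists.

```python
from collections import Counter


def consensus_taxonomy(lineages, min_fraction=0.51):
    """Return the deepest lineage prefix supported by the OTU's sequences.

    lineages: one tuple of taxon names per sequence in the OTU, ordered
              from the broadest rank to the narrowest (hypothetical input).
    min_fraction: assumed share of ALL sequences in the OTU that must
              agree on a taxon for that rank to be reported.
    """
    if not lineages:
        return []
    total = len(lineages)
    consensus = []
    candidates = list(lineages)  # sequences agreeing with the consensus so far
    depth = 0
    while True:
        # Count the labels at this rank among sequences that still agree.
        labels = Counter(lin[depth] for lin in candidates if len(lin) > depth)
        if not labels:
            break  # no sequence is classified this deeply
        taxon, count = labels.most_common(1)[0]
        if count / total < min_fraction:
            break  # no sufficient majority at this rank
        consensus.append(taxon)
        candidates = [
            lin for lin in candidates if len(lin) > depth and lin[depth] == taxon
        ]
        depth += 1
    return consensus


# Hypothetical OTU with four member sequences.
example_otu = [
    ("Bacteria", "Firmicutes", "Clostridia"),
    ("Bacteria", "Firmicutes", "Clostridia"),
    ("Bacteria", "Firmicutes", "Bacilli"),
    ("Bacteria", "Bacteroidetes", "Bacteroidia"),
]
result = consensus_taxonomy(example_otu)
print(result)
```

In this example all four sequences agree at the first rank and three of four at the second, while only two of four share a label at the third, so the consensus stops after the second rank. Measuring support against the whole OTU rather than the surviving candidates is a design choice of this sketch.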
Pages: 3219-3226
Page count: 8
Related papers
25 in total
[1]   At least 1 in 20 16S rRNA sequence records currently held in public repositories is estimated to contain substantial anomalies [J].
Ashelford, KE ;
Chuzhanova, NA ;
Fry, JC ;
Jones, AJ ;
Weightman, AJ .
APPLIED AND ENVIRONMENTAL MICROBIOLOGY, 2005, 71 (12) :7724-7736
[2]   Assessing the accuracy of prediction algorithms for classification: an overview [J].
Baldi, P ;
Brunak, S ;
Chauvin, Y ;
Andersen, CAF ;
Nielsen, H .
BIOINFORMATICS, 2000, 16 (05) :412-424
[3]   What are bacterial species? [J].
Cohan, FM .
ANNUAL REVIEW OF MICROBIOLOGY, 2002, 56 :457-487
[4]   Greengenes, a chimera-checked 16S rRNA gene database and workbench compatible with ARB [J].
DeSantis, T. Z. ;
Hugenholtz, P. ;
Larsen, N. ;
Rojas, M. ;
Brodie, E. L. ;
Keller, K. ;
Huber, T. ;
Dalevi, D. ;
Hu, P. ;
Andersen, G. L. .
APPLIED AND ENVIRONMENTAL MICROBIOLOGY, 2006, 72 (07) :5069-5072
[5]   Search and clustering orders of magnitude faster than BLAST [J].
Edgar, Robert C. .
BIOINFORMATICS, 2010, 26 (19) :2460-2461
[6]   DNA-DNA hybridization values and their relationship to whole-genome sequence similarities [J].
Goris, Johan ;
Konstantinidis, Konstantinos T. ;
Klappenbach, Joel A. ;
Coenye, Tom ;
Vandamme, Peter ;
Tiedje, James M. .
INTERNATIONAL JOURNAL OF SYSTEMATIC AND EVOLUTIONARY MICROBIOLOGY, 2007, 57 :81-91
[7]   Bellerophon: a program to detect chimeric sequences in multiple sequence alignments [J].
Huber, T ;
Faulkner, G ;
Hugenholtz, P .
BIOINFORMATICS, 2004, 20 (14) :2317-2319
[8]   Impact of culture-independent studies on the emerging phylogenetic view of bacterial diversity [J].
Hugenholtz, P ;
Goebel, BM ;
Pace, NR .
JOURNAL OF BACTERIOLOGY, 1998, 180 (18) :4765-4774
[9]   Ironing out the wrinkles in the rare biosphere through improved OTU clustering [J].
Huse, Susan M. ;
Welch, David Mark ;
Morrison, Hilary G. ;
Sogin, Mitchell L. .
ENVIRONMENTAL MICROBIOLOGY, 2010, 12 (07) :1889-1898
[10]   Exploring Microbial Diversity and Taxonomy Using SSU rRNA Hypervariable Tag Sequencing [J].
Huse, Susan M. ;
Dethlefsen, Les ;
Huber, Julie A. ;
Welch, David Mark ;
Relman, David A. ;
Sogin, Mitchell L. .
PLOS GENETICS, 2008, 4 (11)