Development of an Analysis Pipeline Characterizing Multiple Hypervariable Regions of 16S rRNA Using Mock Samples

Cited by: 88
Authors
Barb, Jennifer J. [1]
Oler, Andrew J. [2]
Kim, Hyung-Suk [3]
Chalmers, Natalia [4]
Wallen, Gwenyth R. [5]
Cashion, Ann [3]
Munson, Peter J. [1]
Ames, Nancy J. [5]
Affiliations
[1] NIH, Math & Stat Comp Lab, Ctr Informat Technol, Bldg 10, Bethesda, MD 20892 USA
[2] NIAID, Bioinformat & Computat Biosci Branch, Off Cyber Infrastruct & Computat Biol, NIH, 9000 Rockville Pike, Bethesda, MD 20892 USA
[3] NINR, NIH, Bethesda, MD 20892 USA
[4] Natl Inst Dent & Craniofacial Res, NIH, Bethesda, MD USA
[5] NIH, Ctr Clin, Dept Nursing, Bldg 10, Bethesda, MD 20892 USA
Source
PLOS ONE | 2016 / Volume 11 / Issue 02
Keywords
CLINICAL MICROBIOLOGY; DIVERSITY; SEQUENCES; IDENTIFICATION; PERFORMANCE; BACTERIA; PRIMERS;
DOI
10.1371/journal.pone.0148047
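Chinese Library Classification (CLC)
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Subject classification codes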
07 ; 0710 ; 09 ;
Abstract
Objectives: There is much speculation about which hypervariable region provides the highest bacterial specificity in 16S rRNA sequencing. The optimal way to prevent bias and obtain a comprehensive view of complex bacterial communities would be to sequence the entire 16S rRNA gene; however, this is not possible with standard second-generation library design and short-read next-generation sequencing technology.

Methods: This paper examines a new process in which seven hypervariable (V) regions of the 16S rRNA gene, covered by six amplicons (V2, V3, V4, V6-7, V8, and V9), are processed simultaneously on the Ion Torrent Personal Genome Machine (Life Technologies, Grand Island, NY). Four mock samples were amplified using the 16S Ion Metagenomics Kit (TM) (Life Technologies), and their sequencing data were subjected to a novel analytical pipeline.

Results: Results are presented at the family and genus levels. The Kullback-Leibler divergence (D_KL), a measure of the departure of the computed bacterial distribution from the nominal distribution in the mock samples, was used to infer which region performed best at each level. Three hypervariable regions, V2, V4, and V6-7, produced the lowest divergence from the known mock sample composition. The V9 region gave the highest (worst) average D_KL, while the V4 region gave the lowest (best). In addition to having a high D_KL, the V9 region performed worst in both the forward and reverse directions, finding only 17% and 53% of the known families and 12% and 47% of the known genera, whereas the forward and reverse V4 reads identified all 17 families.

Conclusions: Our analysis shows that the sequencing method, which uses six amplicons spanning seven hypervariable regions of the 16S rRNA gene, and the subsequent analysis are valid. The method also made it possible to assess simultaneously how well each variable region performs. Our findings will provide the basis for future work intended to assess microbial abundance at different time points throughout a clinical protocol.
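The two metrics named in the abstract, the Kullback-Leibler divergence between the nominal and computed taxon distributions and the fraction of known taxa recovered, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the direction of the divergence (nominal relative to observed), the natural-log base, the pseudocount used to handle taxa that a region fails to detect, the function names, and the placeholder family names and proportions are all assumptions made for illustration only.

```python
import math


def kl_divergence(nominal, observed, pseudocount=1e-6):
    """Return D_KL(nominal || observed) in nats.

    Both arguments map taxon name -> relative abundance.
    ASSUMPTION: a small pseudocount is added to every taxon so that taxa
    missing from one distribution do not produce log(0) or division by
    zero; the abstract does not say how zeros were handled.
    """
    taxa = sorted(set(nominal) | set(observed))
    p = [nominal.get(t, 0.0) + pseudocount for t in taxa]
    q = [observed.get(t, 0.0) + pseudocount for t in taxa]
    p_total, q_total = sum(p), sum(q)
    # Renormalize after adding the pseudocount, then sum p * ln(p / q).
    return sum(
        (pi / p_total) * math.log((pi / p_total) / (qi / q_total))
        for pi, qi in zip(p, q)
    )


def fraction_detected(nominal, observed):
    """Fraction of nominal taxa that appear with nonzero abundance."""
    found = sum(1 for t in nominal if observed.get(t, 0.0) > 0.0)
    return found / len(nominal)


# Hypothetical mock community: four placeholder families in equal proportions.
nominal = {"Family_A": 0.25, "Family_B": 0.25, "Family_C": 0.25, "Family_D": 0.25}

# Hypothetical results from two regions: one close to nominal, one missing taxa.
close = {"Family_A": 0.30, "Family_B": 0.20, "Family_C": 0.25, "Family_D": 0.25}
far = {"Family_A": 0.60, "Family_B": 0.40}

for name, obs in (("close", close), ("far", far)):
    print(name, kl_divergence(nominal, obs), fraction_detected(nominal, obs))
```

A lower divergence means the region's estimated composition is closer to the known mock composition, which is the sense in which the abstract ranks V4 as best and V9 as worst. The size of the pseudocount strongly affects the divergence whenever a taxon is missed entirely, so any comparison across regions should use the same value.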
Pages: 18