Reconstructing the Genomic Content of Microbiome Taxa through Shotgun Metagenomic Deconvolution

Cited by: 35
Authors
Carr, Rogan [1]
Shen-Orr, Shai S. [2, 3]
Borenstein, Elhanan [1, 4, 5]
Affiliations
[1] Univ Washington, Dept Genome Sci, Seattle, WA 98195 USA
[2] Technion Israel Inst Technol, Fac Med, Rappaport Inst Med Res, Dept Immunol, Haifa, Israel
[3] Technion Israel Inst Technol, Fac Biol, Haifa, Israel
[4] Univ Washington, Dept Comp Sci & Engn, Seattle, WA 98195 USA
[5] Santa Fe Inst, Santa Fe, NM 87501 USA
Funding
National Institutes of Health (USA);
Keywords
GUT MICROBIOME; ALGORITHM; SEQUENCES; DIVERSITY; EVOLUTION; DATABASE; CATALOG; READS; AGE;
DOI
10.1371/journal.pcbi.1003292
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Metagenomics has transformed our understanding of the microbial world, allowing researchers to bypass the need to isolate and culture individual taxa and to directly characterize both the taxonomic and gene compositions of environmental samples. However, associating the genes found in a metagenomic sample with the specific taxa of origin remains a critical challenge. Existing binning methods, based on nucleotide composition or alignment to reference genomes, allow only a coarse-grained classification and rely heavily on the availability of sequenced genomes from closely related taxa. Here, we introduce a novel computational framework, integrating variation in gene abundances across multiple samples with taxonomic abundance data to deconvolve metagenomic samples into taxa-specific gene profiles and to reconstruct the genomic content of community members. This assembly-free method is not bounded by various factors limiting previously described methods of metagenomic binning or metagenomic assembly and represents a fundamentally different approach to metagenomic-based genome reconstruction. An implementation of this framework is available at http://elbo.gs.washington.edu/software.html. We first describe the mathematical foundations of our framework and discuss considerations for implementing its various components. We demonstrate the ability of this framework to accurately deconvolve a set of metagenomic samples and to recover the gene content of individual taxa using synthetic metagenomic samples. We specifically characterize determinants of prediction accuracy and examine the impact of annotation errors on the reconstructed genomes. We finally apply metagenomic deconvolution to samples from the Human Microbiome Project, successfully reconstructing genus-level genomic content of various microbial genera, based solely on variation in gene count. These reconstructed genera are shown to correctly capture genus-specific properties.
With the accumulation of metagenomic data, this deconvolution framework provides an essential tool for characterizing microbial taxa never before seen, laying the foundation for addressing fundamental questions concerning the taxa comprising diverse microbial communities.
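The core idea in the abstract, combining per-sample gene abundances with per-sample taxonomic abundances to recover taxon-specific gene profiles, can be illustrated with a minimal sketch. It assumes a simple linear mixing model in which the abundance of a gene in a sample is the sum, over taxa, of each taxon's abundance times that taxon's copy number of the gene. The abstract does not state which regression method the authors use, so ordinary least squares is used here purely for illustration; the function name `deconvolve` and all variable names are hypothetical, and the data are synthetic.

```python
import numpy as np


def deconvolve(taxon_abundance, gene_abundance):
    """Estimate per-taxon gene copy numbers by ordinary least squares.

    taxon_abundance: (n_samples, n_taxa) abundance of each taxon per sample
    gene_abundance:  (n_samples, n_genes) abundance of each gene per sample
    returns:         (n_taxa, n_genes) estimated gene copy number per taxon

    Assumed model (an illustration, not the authors' stated method):
        gene_abundance ~= taxon_abundance @ copy_numbers
    """
    A = np.asarray(taxon_abundance, dtype=float)
    E = np.asarray(gene_abundance, dtype=float)
    # Solve all genes at once; each column of E is one regression problem.
    G, *_ = np.linalg.lstsq(A, E, rcond=None)
    # Copy numbers cannot be negative, so clip small negative estimates.
    return np.clip(G, 0.0, None)


# Synthetic, noise-free example: 20 samples, 3 taxa, 5 genes.
rng = np.random.default_rng(0)
n_samples, n_taxa, n_genes = 20, 3, 5
true_G = rng.integers(0, 3, size=(n_taxa, n_genes)).astype(float)
A = rng.dirichlet(np.ones(n_taxa), size=n_samples)  # relative abundances
E = A @ true_G                                      # simulated gene abundances
est_G = deconvolve(A, E)
```

The sketch only works when taxonomic abundances vary enough across samples that the abundance matrix has full column rank, which is why variation across multiple samples matters; with fewer samples than taxa the system is underdetermined.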
Pages: 15