Genometa - A Fast and Accurate Classifier for Short Metagenomic Shotgun Reads

Cited by: 25
Authors
Davenport, Colin F. [1 ]
Neugebauer, Jens [1 ]
Beckmann, Nils [2 ]
Friedrich, Benedikt [2 ]
Kameri, Burim [2 ]
Kokott, Svea [1 ]
Paetow, Malte [2 ]
Siekmann, Bjoern [2 ]
Wieding-Drewes, Matthias [2 ]
Wienhoefer, Markus [2 ]
Wolf, Stefan [2 ]
Tuemmler, Burkhard [1 ]
Ahlers, Volker [2 ]
Sprengel, Frauke [2 ]
Affiliations
[1] Hannover Med Sch, D-3000 Hannover, Lower Saxony, Germany
[2] Univ Appl Sci & Arts, Dept Comp Sci, Hannover, Lower Saxony, Germany
Source
PLOS ONE | 2012 / Volume 7 / Issue 08
Funding
National Institutes of Health (USA);
Keywords
RIBOSOMAL-RNA; SEQUENCES; MICROBIOME; COMMUNITIES; ALIGNMENT; BACTERIA; TAXONOMY; ARCHAEA; SERVER; GENES;
DOI
10.1371/journal.pone.0041224
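Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code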
07 ; 0710 ; 09 ;
Abstract
Metagenomic studies use high-throughput sequence data to investigate microbial communities in situ. However, considerable challenges remain in the analysis of these data, particularly with regard to speed and reliable analysis of microbial species as opposed to higher level taxa such as phyla. We here present Genometa, a computationally undemanding graphical user interface program that enables identification of bacterial species and gene content from datasets generated by inexpensive high-throughput short read sequencing technologies. Our approach was first verified on two simulated metagenomic short read datasets, detecting 100% and 94% of the bacterial species included with few false positives or false negatives. Subsequent comparative benchmarking analysis against three popular metagenomic algorithms on an Illumina human gut dataset revealed Genometa to attribute the most reads to bacteria at species level (i.e. including all strains of that species) and demonstrate similar or better accuracy than the other programs. Lastly, speed was demonstrated to be many times that of BLAST due to the use of modern short read aligners. Our method is highly accurate if bacteria in the sample are represented by genomes in the reference sequence but cannot find species absent from the reference. This method is one of the most user-friendly and resource efficient approaches and is thus feasible for rapidly analysing millions of short reads on a personal computer.
Pages: 8
Related papers
39 in total
[1]   Brady A, 2009, NAT METHODS, V6, P673, DOI 10.1038/nmeth.1358
[2]   QIIME allows analysis of high-throughput community sequencing data [J].
Caporaso, J. Gregory ;
Kuczynski, Justin ;
Stombaugh, Jesse ;
Bittinger, Kyle ;
Bushman, Frederic D. ;
Costello, Elizabeth K. ;
Fierer, Noah ;
Pena, Antonio Gonzalez ;
Goodrich, Julia K. ;
Gordon, Jeffrey I. ;
Huttley, Gavin A. ;
Kelley, Scott T. ;
Knights, Dan ;
Koenig, Jeremy E. ;
Ley, Ruth E. ;
Lozupone, Catherine A. ;
McDonald, Daniel ;
Muegge, Brian D. ;
Pirrung, Meg ;
Reeder, Jens ;
Sevinsky, Joel R. ;
Turnbaugh, Peter J. ;
Walters, William A. ;
Widmann, Jeremy ;
Yatsunenko, Tanya ;
Zaneveld, Jesse ;
Knight, Rob .
NATURE METHODS, 2010, 7 (05) :335-336
[3]   Deep sequencing analysis of viruses infecting grapevines: Virome of a vineyard [J].
Coetzee, Beatrix ;
Freeborough, Michael-John ;
Maree, Hans J. ;
Celton, Jean-Marc ;
Rees, D. Jasper G. ;
Burger, Johan T. .
VIROLOGY, 2010, 400 (02) :157-163
[4]   Taxonomic classification of metagenomic shotgun sequences with CARMA3 [J].
Gerlach, Wolfgang ;
Stoye, Jens .
NUCLEIC ACIDS RESEARCH, 2011, 39 (14) :e91
[5]   MTR: taxonomic annotation of short metagenomic reads using clustering at multiple taxonomic ranks [J].
Gori, Fabio ;
Folino, Gianluigi ;
Jetten, Mike S. M. ;
Marchiori, Elena .
BIOINFORMATICS, 2011, 27 (02) :196-203
[6]   Metagenomic Discovery of Biomass-Degrading Genes and Genomes from Cow Rumen [J].
Hess, Matthias ;
Sczyrba, Alexander ;
Egan, Rob ;
Kim, Tae-Wan ;
Chokhawala, Harshal ;
Schroth, Gary ;
Luo, Shujun ;
Clark, Douglas S. ;
Chen, Feng ;
Zhang, Tao ;
Mackie, Roderick I. ;
Pennacchio, Len A. ;
Tringe, Susannah G. ;
Visel, Axel ;
Woyke, Tanja ;
Wang, Zhong ;
Rubin, Edward M. .
SCIENCE, 2011, 331 (6016) :463-467
[7]   MEGAN analysis of metagenomic data [J].
Huson, Daniel H. ;
Auch, Alexander F. ;
Qi, Ji ;
Schuster, Stephan C. .
GENOME RESEARCH, 2007, 17 (03) :377-386
[8]   Towards a genome-based taxonomy for prokaryotes [J].
Konstantinidis, KT ;
Tiedje, JM .
JOURNAL OF BACTERIOLOGY, 2005, 187 (18) :6258-6264
[9]   Direct sequencing of the human microbiome readily reveals community differences [J].
Kuczynski, Justin ;
Costello, Elizabeth K. ;
Nemergut, Diana R. ;
Zaneveld, Jesse ;
Lauber, Christian L. ;
Knights, Dan ;
Koren, Omry ;
Fierer, Noah ;
Kelley, Scott T. ;
Ley, Ruth E. ;
Gordon, Jeffrey I. ;
Knight, Rob .
GENOME BIOLOGY, 2010, 11 (05)
[10]   Comparative metagenomics revealed commonly enriched gene sets in human gut microbiomes [J].
Kurokawa, Ken ;
Itoh, Takehiko ;
Kuwahara, Tomomi ;
Oshima, Kenshiro ;
Toh, Hidehiro ;
Toyoda, Atsushi ;
Takami, Hideto ;
Morita, Hidetoshi ;
Sharma, Vineet K. ;
Srivastava, Tulika P. ;
Taylor, Todd D. ;
Noguchi, Hideki ;
Mori, Hiroshi ;
Ogura, Yoshitoshi ;
Ehrlich, Dusko S. ;
Itoh, Kikuji ;
Takagi, Toshihisa ;
Sakaki, Yoshiyuki ;
Hayashi, Tetsuya ;
Hattori, Masahira .
DNA RESEARCH, 2007, 14 (04) :169-181