IDBA-MT: De Novo Assembler for Metatranscriptomic Data Generated from Next-Generation Sequencing Technology

Cited by: 28
Authors
Leung, Henry C. M. [1]
Yiu, Siu-Ming [1]
Parkinson, John [2]
Chin, Francis Y. L. [1]
Affiliations
[1] Univ Hong Kong, Dept Comp Sci, Hong Kong, Hong Kong, Peoples R China
[2] Univ Toronto, Toronto, ON, Canada
Keywords
algorithms; alignment; computational molecular biology; dynamic programming; genomic rearrangements; metagenomics; next generation sequencing; GENE-EXPRESSION; GUT; TRANSCRIPTOME; HOST
DOI
10.1089/cmb.2013.0042
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
High-throughput next-generation sequencing technology provides a great opportunity for analyzing metatranscriptomic data. However, the reads produced by these technologies are short, and an assembly step is required to combine the short reads into longer contigs. Because mRNAs from different genomes share many repeat patterns and the abundance ratio of mRNAs in a sample varies widely, existing assemblers for genomic, transcriptomic, and metagenomic data do not work well on metatranscriptomic data and produce chimeric contigs, that is, incorrect contigs formed by merging multiple mRNA sequences. To the best of our knowledge, there is no assembler designed for metatranscriptomic data. In this article, we introduce an assembler called IDBA-MT, which is designed for assembling reads from metatranscriptomic data. IDBA-MT produces far fewer chimeric contigs (a reduction of 50% or more) than existing assemblers such as Oases, IDBA-UD, and Trinity.
Pages: 540-550
Number of pages: 11