Oases: robust de novo RNA-seq assembly across the dynamic range of expression levels

Cited by: 1007
Authors
Schulz, Marcel H. [1, 2, 3, 4]
Zerbino, Daniel R. [1, 4]
Vingron, Martin [2]
Birney, Ewan [1]
Affiliations
[1] European Bioinformat Inst, Hinxton CB10 1SD, Cambs, England
[2] Max Planck Inst Mol Genet, Dept Computat Mol Biol, D-14195 Berlin, Germany
[3] Carnegie Mellon Univ, Lane Ctr Computat Biol, Pittsburgh, PA 15213 USA
[4] Univ Calif Santa Cruz, Ctr Biomol Sci & Engn, Santa Cruz, CA 95064 USA
Keywords
MAMMALIAN TRANSCRIPTOMES; GENOME; SEQUENCES; REVEALS; GRAPHS; ABYSS; TOOL;
DOI
10.1093/bioinformatics/bts094
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Motivation: High-throughput sequencing has made the analysis of new model organisms more affordable. Although assembling a new genome can still be costly and difficult, it is possible to use RNA-seq to sequence mRNA. In the absence of a known genome, it is necessary to assemble these sequences de novo, taking into account possible alternative isoforms and the dynamic range of expression values.
Results: We present a software package named Oases designed to heuristically assemble RNA-seq reads in the absence of a reference genome, across a broad spectrum of expression values and in the presence of alternative isoforms. It achieves this by using an array of hash lengths, a dynamic filtering of noise, a robust resolution of alternative splicing events and the efficient merging of multiple assemblies. It was tested on human and mouse RNA-seq data and is shown to improve significantly on the transABySS and Trinity de novo transcriptome assemblers.
Pages: 1086-1092
Number of pages: 7