Oases: robust de novo RNA-seq assembly across the dynamic range of expression levels

Cited by: 1007
Authors
Schulz, Marcel H. [1, 2, 3, 4]
Zerbino, Daniel R. [1, 4]
Vingron, Martin [2]
Birney, Ewan [1]
Affiliations
[1] European Bioinformat Inst, Hinxton CB10 1SD, Cambs, England
[2] Max Planck Inst Mol Genet, Dept Computat Mol Biol, D-14195 Berlin, Germany
[3] Carnegie Mellon Univ, Lane Ctr Computat Biol, Pittsburgh, PA 15213 USA
[4] Univ Calif Santa Cruz, Ctr Biomol Sci & Engn, Santa Cruz, CA 95064 USA
Keywords
MAMMALIAN TRANSCRIPTOMES; GENOME; SEQUENCES; REVEALS; GRAPHS; ABYSS; TOOL
DOI
10.1093/bioinformatics/bts094
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Motivation: High-throughput sequencing has made the analysis of new model organisms more affordable. Although assembling a new genome can still be costly and difficult, it is possible to use RNA-seq to sequence mRNA. In the absence of a known genome, it is necessary to assemble these sequences de novo, taking into account possible alternative isoforms and the dynamic range of expression values.
Results: We present a software package named Oases designed to heuristically assemble RNA-seq reads in the absence of a reference genome, across a broad spectrum of expression values and in the presence of alternative isoforms. It achieves this by using an array of hash lengths, a dynamic filtering of noise, a robust resolution of alternative splicing events and the efficient merging of multiple assemblies. It was tested on human and mouse RNA-seq data and is shown to improve significantly on the transABySS and Trinity de novo transcriptome assemblers.
Pages: 1086-1092
Page count: 7
Related papers
33 in total
[1]   De novo transcriptome assembly with ABySS [J].
Birol, Inanc ;
Jackman, Shaun D. ;
Nielsen, Cydney B. ;
Qian, Jenny Q. ;
Varhol, Richard ;
Stazyk, Greg ;
Morin, Ryan D. ;
Zhao, Yongjun ;
Hirst, Martin ;
Schein, Jacqueline E. ;
Horsman, Doug E. ;
Connors, Joseph M. ;
Gascoyne, Randy D. ;
Marra, Marco A. ;
Jones, Steven J. M. .
BIOINFORMATICS, 2009, 25 (21) :2872-2877
[2]   Current-generation high-throughput sequencing: deepening insights into mammalian transcriptomes [J].
Blencowe, Benjamin J. ;
Ahmad, Sidrah ;
Lee, Leo J. .
GENES & DEVELOPMENT, 2009, 23 (12) :1379-1386
[3]   ALLPATHS: De novo assembly of whole-genome shotgun microreads [J].
Butler, Jonathan ;
MacCallum, Iain ;
Kleber, Michael ;
Shlyakhter, Ilya A. ;
Belmonte, Matthew K. ;
Lander, Eric S. ;
Nusbaum, Chad ;
Jaffe, David B. .
GENOME RESEARCH, 2008, 18 (05) :810-820
[4]  
Collins LJ, 2008, GENOME INFORM SER, V21, P3
[5]   Annotating genomes with massive-scale RNA sequencing [J].
Denoeud, France ;
Aury, Jean-Marc ;
Da Silva, Corinne ;
Noel, Benjamin ;
Rogier, Odile ;
Delledonne, Massimo ;
Morgante, Michele ;
Valle, Giorgio ;
Wincker, Patrick ;
Scarpelli, Claude ;
Jaillon, Olivier ;
Artiguenave, Francois .
GENOME BIOLOGY, 2008, 9 (12)
[6]   Full-length transcriptome assembly from RNA-Seq data without a reference genome [J].
Grabherr, Manfred G. ;
Haas, Brian J. ;
Yassour, Moran ;
Levin, Joshua Z. ;
Thompson, Dawn A. ;
Amit, Ido ;
Adiconis, Xian ;
Fan, Lin ;
Raychowdhury, Raktima ;
Zeng, Qiandong ;
Chen, Zehua ;
Mauceli, Evan ;
Hacohen, Nir ;
Gnirke, Andreas ;
Rhind, Nicholas ;
di Palma, Federica ;
Birren, Bruce W. ;
Nusbaum, Chad ;
Lindblad-Toh, Kerstin ;
Friedman, Nir ;
Regev, Aviv .
NATURE BIOTECHNOLOGY, 2011, 29 (07) :644-U130
[7]   Ab initio reconstruction of cell type-specific transcriptomes in mouse reveals the conserved multi-exonic structure of lincRNAs [J].
Guttman, Mitchell ;
Garber, Manuel ;
Levin, Joshua Z. ;
Donaghey, Julie ;
Robinson, James ;
Adiconis, Xian ;
Fan, Lin ;
Koziol, Magdalena J. ;
Gnirke, Andreas ;
Nusbaum, Chad ;
Rinn, John L. ;
Lander, Eric S. ;
Regev, Aviv .
NATURE BIOTECHNOLOGY, 2010, 28 (05) :503-U166
[8]   Genome-wide analysis of allelic expression imbalance in human primary cells by high-throughput transcriptome resequencing [J].
Heap, Graham A. ;
Yang, Jennie H. M. ;
Downes, Kate ;
Healy, Barry C. ;
Hunt, Karen A. ;
Bockett, Nicholas ;
Franke, Lude ;
Dubois, Patrick C. ;
Mein, Charles A. ;
Dobson, Richard J. ;
Albert, Thomas J. ;
Rodesch, Matthew J. ;
Clayton, David G. ;
Todd, John A. ;
van Heel, David A. ;
Plagnol, Vincent .
HUMAN MOLECULAR GENETICS, 2010, 19 (01) :122-134
[9]  
Heber Steffen, 2002, Bioinformatics, V18 Suppl 1, pS181
[10]   Parallel short sequence assembly of transcriptomes [J].
Jackson, Benjamin G. ;
Schnable, Patrick S. ;
Aluru, Srinivas .
BMC BIOINFORMATICS, 2009, 10