Leveraging transcript quantification for fast computation of alternative splicing profiles

Cited by: 161
Authors
Alamancos, Gael P. [1]
Pages, Amadis [1, 2]
Trincado, Juan L. [1]
Bellora, Nicolas [3]
Eyras, Eduardo [1, 4]
Affiliations
[1] Univ Pompeu Fabra, E-08003 Barcelona, Spain
[2] Ctr Genom Regulat, E-08003 Barcelona, Spain
[3] CONICET UNComahue, INIBIOMA, RA-8400 San Carlos De Bariloche, Rio Negro, Argentina
[4] Catalan Inst Res & Adv Studies, E-08010 Barcelona, Spain
Keywords
RNA-seq; splicing; splicing event; RNA-SEQ DATA; EXPRESSION; IDENTIFICATION; PROGRAMS; REVEALS; MARKERS; EVENTS; ROBUST
DOI
10.1261/rna.051557.115
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification code
071010; 081704
Abstract
Alternative splicing plays an essential role in many cellular processes and bears major relevance in the understanding of multiple diseases, including cancer. High-throughput RNA sequencing allows genome-wide analyses of splicing across multiple conditions. However, the increasing number of available data sets represents a major challenge in terms of computation time and storage requirements. We describe SUPPA, a computational tool to calculate relative inclusion values of alternative splicing events, exploiting fast transcript quantification. SUPPA accuracy is comparable and sometimes superior to standard methods using simulated as well as real RNA-sequencing data compared with experimentally validated events. We assess the variability in terms of the choice of annotation and provide evidence that using complete transcripts rather than more transcripts per gene provides better estimates. Moreover, SUPPA coupled with de novo transcript reconstruction methods does not achieve accuracies as high as using quantification of known transcripts, but remains comparable to existing methods. Finally, we show that SUPPA is more than 1000 times faster than standard methods. Coupled with fast transcript quantification, SUPPA provides inclusion values at a much higher speed than existing methods without compromising accuracy, thereby facilitating the systematic splicing analysis of large data sets with limited computational resources. The software is implemented in Python 2.7 and is available under the MIT license at https://bitbucket.org/regulatorygenomicsupf/suppa.
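The abstract states that relative inclusion values are derived from transcript quantification but does not spell out the calculation. The following is a minimal sketch, not SUPPA's actual code or interface. It assumes that the inclusion value of an event is the summed abundance of the transcripts carrying the inclusion form divided by the summed abundance of all transcripts that define the event. The function name `psi`, the `min_total` threshold, the transcript identifiers, and the TPM values are all hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterable, Optional


def psi(
    tpm: Dict[str, float],
    inclusion_ids: Iterable[str],
    total_ids: Iterable[str],
    min_total: float = 0.0,
) -> Optional[float]:
    """Relative inclusion of one splicing event from transcript abundances.

    tpm           -- transcript ID -> abundance (e.g. TPM) from any quantifier
    inclusion_ids -- transcripts that contain the inclusion form of the event
    total_ids     -- all transcripts that define the event (both forms)
    min_total     -- hypothetical threshold; at or below it the value is undefined
    """
    inclusion = set(inclusion_ids)
    total = set(total_ids)
    if not inclusion <= total:
        raise ValueError("inclusion transcripts must be a subset of total transcripts")

    # Transcripts missing from the quantification are treated as unexpressed.
    total_abundance = sum(tpm.get(t, 0.0) for t in total)
    if total_abundance <= min_total:
        return None  # the gene is not expressed enough to define a ratio

    inclusion_abundance = sum(tpm.get(t, 0.0) for t in inclusion)
    return inclusion_abundance / total_abundance


# Hypothetical abundances for three transcripts of one gene.
example_tpm = {"tx1": 30.0, "tx2": 10.0, "tx3": 60.0}

# tx1 and tx2 include the exon, tx3 skips it.
example_psi = psi(example_tpm, ["tx1", "tx2"], ["tx1", "tx2", "tx3"])
```

Because the calculation is only a pair of sums and one division per event, its cost is negligible once transcript abundances are available, which is consistent with the speed advantage the abstract attributes to building on fast transcript quantification.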
Pages: 1521-1531
Number of pages: 11