KISSPLICE: de-novo calling alternative splicing events from RNA-seq data

Cited by: 65
Authors
Sacomoto, Gustavo A. T. [2, 3, 4]
Kielbassa, Janice [2, 3, 4]
Chikhi, Rayan [1]
Uricaru, Raluca [1, 5]
Antoniou, Pavlos [1]
Sagot, Marie-France [2, 3, 4]
Peterlongo, Pierre [1]
Lacroix, Vincent [2, 3, 4]
Affiliations
[1] IRISA, Ctr Rech INRIA Rennes Bretagne Atlantique, Rennes, France
[2] INRIA Grenoble Rhone Alpes, Grenoble, France
[3] Univ Lyon, F-69000 Lyon, France
[4] Univ Lyon 1, CNRS, UMR5558, Lab Biometrie & Biol Evolut, F-69622 Villeurbanne, France
[5] INRA, UMR118, F-35042 Rennes, France
Source
BMC Bioinformatics | 2012 / Vol. 13
Funding
European Research Council
Keywords
Short Path; Reference Genome; Splice Event; Variable Part; Alternative Transcript
DOI
10.1186/1471-2105-13-S6-S5
Chinese Library Classification
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Background: In this paper, we address the problem of identifying and quantifying polymorphisms in RNA-seq data when no reference genome is available, without assembling the full transcripts. Based on the fundamental idea that each polymorphism corresponds to a recognisable pattern in a De Bruijn graph constructed from the RNA-seq reads, we propose a general model for all polymorphisms in such graphs. We then introduce an exact algorithm, called KISSPLICE, to extract alternative splicing events. Results: We show that KISSPLICE identifies more correct events than general-purpose transcriptome assemblers. Additionally, on a dataset of 71 M reads from human brain and liver tissues, KISSPLICE identified 3497 alternative splicing events, of which 56% are not present in the annotations. This confirms recent estimates showing that the complexity of alternative splicing has been largely underestimated so far. Conclusions: We propose new models and algorithms for the detection of polymorphisms in RNA-seq data. This opens the way to a new kind of study on large HTS RNA-seq datasets, where the focus is not the global reconstruction of full-length transcripts, but the local assembly of polymorphic regions. KISSPLICE is available for download at http://alcovna.genouest.org/kissplice/.
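The idea that a polymorphism leaves a recognisable pattern in a De Bruijn graph can be illustrated with a toy example. The sketch below is not the KISSPLICE algorithm. It is a brute-force illustration under stated assumptions: reads are error-free, edges are added only between k-mers that are adjacent in some read, and a "bubble" is taken to mean two paths that leave the same k-mer, share no internal k-mer, and rejoin at the same k-mer. The function names (build_de_bruijn, paths_from, find_bubbles) and the toy sequences are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_de_bruijn(reads, k):
    """Map each k-mer to the set of k-mers that directly follow it in a read."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k):
            graph[read[i:i + k]].add(read[i + 1:i + 1 + k])
    return graph


def paths_from(graph, start, max_len):
    """Yield every simple path (list of k-mers) from start, up to max_len nodes."""
    stack = [[start]]
    while stack:
        path = stack.pop()
        yield path
        if len(path) < max_len:
            for nxt in graph.get(path[-1], ()):
                if nxt not in path:
                    stack.append(path + [nxt])


def find_bubbles(graph, max_len=20):
    """Return pairs of internally disjoint paths sharing both endpoints."""
    bubbles = []
    for node in list(graph):
        if len(graph[node]) < 2:  # only branching k-mers can open a bubble
            continue
        by_end = defaultdict(list)
        for path in paths_from(graph, node, max_len):
            if len(path) > 1:
                by_end[path[-1]].append(path)
        for paths in by_end.values():
            for p, q in combinations(paths, 2):
                if not set(p[1:-1]) & set(q[1:-1]):
                    bubbles.append((p, q))
    return bubbles


# Toy exon-skipping event: the variable part "CCATTA" is present in one
# transcript and absent from the other (sequences invented for illustration).
upstream, variable, downstream = "ATGGCGT", "CCATTA", "GAGTCAA"
reads = [upstream + variable + downstream, upstream + downstream]

graph = build_de_bruijn(reads, k=4)
bubbles = find_bubbles(graph)
print(sorted(len(path) for path in bubbles[0]))  # [5, 11]
```

In this toy graph the two paths diverge at the k-mer GCGT and rejoin at GAGT; the longer path carries the variable part and the shorter one skips it. Exhaustive path enumeration is exponential in the worst case, so this only conveys the pattern and says nothing about how an exact algorithm at scale would be engineered.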
Pages: 12