MapSplice: Accurate mapping of RNA-seq reads for splice junction discovery

Citations: 721
Authors
Wang, Kai [1]
Singh, Darshan [2]
Zeng, Zheng [1]
Coleman, Stephen J. [3]
Huang, Yan [1]
Savich, Gleb L. [4,5]
He, Xiaping [4,5]
Mieczkowski, Piotr [4,5]
Grimm, Sara A. [4,5]
Perou, Charles M. [4,5]
MacLeod, James N. [3]
Chiang, Derek Y. [4,5]
Prins, Jan F. [2]
Liu, Jinze [1]
Affiliations
[1] Univ Kentucky, Dept Comp Sci, Lexington, KY 40506 USA
[2] Univ N Carolina, Dept Comp Sci, Chapel Hill, NC 27599 USA
[3] Univ Kentucky, Gluck Equine Res Ctr, Dept Vet Sci, Lexington, KY 40546 USA
[4] Univ N Carolina, Dept Genet, Chapel Hill, NC 27599 USA
[5] Univ N Carolina, UNC Lineberger Comprehens Canc Ctr, Chapel Hill, NC 27599 USA
Funding
US National Science Foundation; US National Institutes of Health
Keywords
FULL-LENGTH ISOFORMS; FGFR2; MUTATIONS; ALIGNMENT; TRANSCRIPTOMES; EXPRESSION; ULTRAFAST; TOOL;
DOI
10.1093/nar/gkq622
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
The accurate mapping of reads that span splice junctions is a critical component of all analytic techniques that work with RNA-seq data. We introduce a second-generation splice detection algorithm, MapSplice, whose focus is high sensitivity and specificity in the detection of splices as well as CPU and memory efficiency. MapSplice can be applied to both short (< 75 bp) and long reads (≥ 75 bp). MapSplice is not dependent on splice site features or intron length; consequently, it can detect novel canonical as well as non-canonical splices. MapSplice leverages the quality and diversity of read alignments of a given splice to increase accuracy. We demonstrate that MapSplice achieves higher sensitivity and specificity than TopHat and SpliceMap on a set of simulated RNA-seq data. Experimental studies also support the accuracy of the algorithm. Splice junctions derived from eight breast cancer RNA-seq datasets recapitulated the extensiveness of alternative splicing on a global level as well as the differences between molecular subtypes of breast cancer. These combined results indicate that MapSplice is a highly accurate algorithm for the alignment of RNA-seq reads to splice junctions. Software download URL: http://www.netlab.uky.edu/p/bioinfo/MapSplice.
Pages: 14
Related papers
36 entries in total
[1]   A conserved alternative splice in the von Recklinghausen neurofibromatosis (NF1) gene produces 2 neurofibromin isoforms, both of which have GTPase-activating protein activity [J].
Andersen, LB ;
Ballester, R ;
Marchuk, DA ;
Chang, E ;
Gutmann, DH ;
Saulino, AM ;
Camonis, J ;
Wigler, M ;
Collins, FS .
MOLECULAR AND CELLULAR BIOLOGY, 1993, 13 (01) :487-495
[2]   Detection of splice junctions from paired-end RNA-seq data by SpliceMap [J].
Au, Kin Fai ;
Jiang, Hui ;
Lin, Lan ;
Xing, Yi ;
Wong, Wing Hung .
NUCLEIC ACIDS RESEARCH, 2010, 38 (14) :4570-4578
[3]   Common intervals and sorting by reversals: a marriage of necessity [J].
Bergeron, A ;
Heber, S ;
Stoye, J .
BIOINFORMATICS, 2002, 18 :S54-S63
[4]   De novo transcriptome assembly with ABySS [J].
Birol, Inanc ;
Jackman, Shaun D. ;
Nielsen, Cydney B. ;
Qian, Jenny Q. ;
Varhol, Richard ;
Stazyk, Greg ;
Morin, Ryan D. ;
Zhao, Yongjun ;
Hirst, Martin ;
Schein, Jacqueline E. ;
Horsman, Doug E. ;
Connors, Joseph M. ;
Gascoyne, Randy D. ;
Marra, Marco A. ;
Jones, Steven J. M. .
BIOINFORMATICS, 2009, 25 (21) :2872-2877
[5]   Expression of 24,426 human alternative splicing events and predicted cis regulation in 48 tissues and cell lines [J].
Castle, John C. ;
Zhang, Chaolin ;
Shah, Jyoti K. ;
Kulkarni, Amit V. ;
Kalsotra, Auinash ;
Cooper, Thomas A. ;
Johnson, Jason M. .
NATURE GENETICS, 2008, 40 (12) :1416-1425
[6]   Optimal spliced alignments of short sequence reads [J].
De Bona, Fabio ;
Ossowski, Stephan ;
Schneeberger, Korbinian ;
Raetsch, Gunnar .
BIOINFORMATICS, 2008, 24 (16) :I174-I180
[7]   Drug-sensitive FGFR2 mutations in endometrial carcinoma [J].
Dutt, Amit ;
Salvesen, Helga B. ;
Chen, Tzu-Hsiu ;
Ramos, Alex H. ;
Onofrio, Robert C. ;
Hatton, Charlie ;
Nicoletti, Richard ;
Winckler, Wendy ;
Grewal, Rupinder ;
Hanna, Megan ;
Wyhs, Nicolas ;
Ziaugra, Liuda ;
Richter, Daniel J. ;
Trovik, Jone ;
Engelsen, Ingeborg B. ;
Stefansson, Ingunn M. ;
Fennell, Tim ;
Cibulskis, Kristian ;
Zody, Michael C. ;
Akslen, Lars A. ;
Gabriel, Stacey ;
Wong, Kwok-Kin ;
Sellers, William R. ;
Meyerson, Matthew ;
Greulich, Heidi .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2008, 105 (25) :8713-8717
[8]   The multiple functions of Numb [J].
Gulino, Alberto ;
Di Marcotullio, Lucia ;
Screpanti, Isabella .
EXPERIMENTAL CELL RESEARCH, 2010, 316 (06) :900-906
[9]   BFAST: An Alignment Tool for Large Scale Genome Resequencing [J].
Homer, Nils ;
Merriman, Barry ;
Nelson, Stanley F. .
PLOS ONE, 2009, 4 (11) :A95-A106
[10]   Statistical inferences for isoform expression in RNA-Seq [J].
Jiang, Hui ;
Wong, Wing Hung .
BIOINFORMATICS, 2009, 25 (08) :1026-1032