Detection of splice junctions from paired-end RNA-seq data by SpliceMap

Cited by: 199
Authors
Au, Kin Fai [1]
Jiang, Hui [1, 2]
Lin, Lan [3, 4]
Xing, Yi [3, 4]
Wong, Wing Hung [1]
Affiliations
[1] Stanford Univ, Dept Stat, Stanford, CA 94305 USA
[2] Stanford Genome Technol Ctr, Palo Alto, CA 94304 USA
[3] Univ Iowa, Dept Internal Med, Iowa City, IA 52242 USA
[4] Univ Iowa, Dept Biomed Engn, Iowa City, IA 52242 USA
Funding
U.S. National Institutes of Health (NIH);
Keywords
GENE; TRANSCRIPTOME;
DOI
10.1093/nar/gkq211
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010; 081704;
Abstract
Alternative splicing is a prevalent post-transcriptional process, which is not only important to normal cellular function but is also involved in human diseases. The newly developed second-generation sequencing technique provides high-throughput data (RNA-seq data) to study alternative splicing events in different types of cells. Here, we present a computational method, SpliceMap, to detect splice junctions from RNA-seq data. This method does not depend on any existing annotation of gene structures and is capable of finding novel splice junctions with high sensitivity and specificity. It can handle long reads (50-100 nt) and can exploit paired-read information to improve mapping accuracy. Several parameters are included in the output to indicate the reliability of each predicted junction and to help filter out false predictions. We applied SpliceMap to analyze 23 million paired 50-nt reads from human brain tissue. The results show that, at this depth of sequencing, RNA-seq can support reliable detection of splice junctions except for those that are present at very low levels. Compared to current methods, SpliceMap can achieve 12% higher sensitivity without sacrificing specificity.
Pages: 4570-4578
Page count: 9