A comparison of massively parallel nucleotide sequencing with oligonucleotide microarrays for global transcription profiling

Cited by: 100
Authors
Bradford, James R.
Hey, Yvonne [1]
Yates, Tim
Li, Yaoyong
Pepper, Stuart D. [1]
Miller, Crispin J. [1]
Affiliations
[1] Univ Manchester, Canc Res UK, Paterson Inst Canc Res, Mol Biol Core Facil, Manchester M20 4BX, Lancs, England
Source
BMC GENOMICS | 2010 / Vol. 11
Keywords
GENE-EXPRESSION; RNA-SEQ; AFFYMETRIX EXON; GENOME; ARRAYS; IDENTIFICATION; BIOCONDUCTOR; BIOLOGY; CELL; MAP;
DOI
10.1186/1471-2164-11-282
Chinese Library Classification number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification code
071005; 0836; 090102; 100705;
Abstract
Background: RNA-Seq exploits the rapid generation of gigabases of sequence data by Massively Parallel Nucleotide Sequencing, allowing for the mapping and digital quantification of whole transcriptomes. Whilst previous comparisons between RNA-Seq and microarrays have been performed at the level of gene expression, in this study we adopt a more fine-grained approach. Using RNA samples from a normal human breast epithelial cell line (MCF-10a) and a breast cancer cell line (MCF-7), we present a comprehensive comparison between RNA-Seq data generated on the Applied Biosystems SOLiD platform and data from Affymetrix Exon 1.0ST arrays. The use of Exon arrays makes it possible to assess the performance of RNA-Seq in two key areas: detection of expression at the granularity of individual exons, and discovery of transcription outside annotated loci.
Results: We found a high degree of correspondence between the two platforms in terms of exon-level fold changes and detection. For example, over 80% of exons detected as expressed in RNA-Seq were also detected on the Exon array, and 91% of exons flagged as changing from Absent to Present on at least one platform had fold-changes in the same direction. The greatest detection correspondence was seen when the read count threshold at which to flag exons Absent in the SOLiD data was set to t < 1, suggesting that the background error rate is extremely low in RNA-Seq. We also found RNA-Seq more sensitive to detecting differentially expressed exons than the Exon array, reflecting the wider dynamic range achievable on the SOLiD platform. In addition, we find significant evidence of novel protein coding regions outside known exons, 93% of which map to Exon array probesets, and are able to infer the presence of thousands of novel transcripts through the detection of previously unreported exon-exon junctions.
Conclusions: By focusing on exon-level expression, we present the most fine-grained comparison between RNA-Seq and microarrays to date. Overall, our study demonstrates that data from a SOLiD RNA-Seq experiment are sufficient to generate results comparable to those produced from Affymetrix Exon arrays, even using only a single replicate from each platform, and when presented with a large genome.
Pages: 12