Most "Dark Matter" Transcripts Are Associated With Known Genes

Cited by: 313
Authors
van Bakel, Harm [1]
Nislow, Corey [1,2]
Blencowe, Benjamin J. [1,2]
Hughes, Timothy R. [1,2]
Affiliations
[1] Univ Toronto, Banting & Best Dept Med Res, Toronto, ON, Canada
[2] Univ Toronto, Dept Mol Genet, Toronto, ON, Canada
Source
PLOS BIOLOGY | 2010 / Vol. 8 / Issue 05
Funding
Canadian Institutes of Health Research
Keywords
NONCODING RNAS; HUMAN GENOME; CHROMATIN-STRUCTURE; HUMAN PROMOTERS; HUMAN-CELLS; SEQ; IDENTIFICATION; RESOLUTION; ELEMENTS; REVEALS;
DOI
10.1371/journal.pbio.1000371
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
A series of reports over the last few years have indicated that a much larger portion of the mammalian genome is transcribed than can be accounted for by currently annotated genes, but the quantity and nature of these additional transcripts remains unclear. Here, we have used data from single- and paired-end RNA-Seq and tiling arrays to assess the quantity and composition of transcripts in PolyA+ RNA from human and mouse tissues. Relative to tiling arrays, RNA-Seq identifies many fewer transcribed regions ("seqfrags") outside known exons and ncRNAs. Most nonexonic seqfrags are in introns, raising the possibility that they are fragments of pre-mRNAs. The chromosomal locations of the majority of intergenic seqfrags in RNA-Seq data are near known genes, consistent with alternative cleavage and polyadenylation site usage, promoter- and terminator-associated transcripts, or new alternative exons; indeed, reads that bridge splice sites identified 4,544 new exons, affecting 3,554 genes. Most of the remaining seqfrags correspond to either single reads that display characteristics of random sampling from a low-level background or several thousand small transcripts (median length = 111 bp) present at higher levels, which also tend to display sequence conservation and originate from regions with open chromatin. We conclude that, while there are bona fide new intergenic transcripts, their number and abundance is generally low in comparison to known exons, and the genome is not as pervasively transcribed as previously reported.
Pages: 21