Systematic analysis of transcribed loci in ENCODE regions using RACE sequencing reveals extensive transcription in the human genome

Cited by: 46
Authors
Wu, Jia Qian [1]
Du, Jiang [2]
Rozowsky, Joel [3]
Zhang, Zhengdong [3]
Urban, Alexander E. [1]
Euskirchen, Ghia [1]
Weissman, Sherman [4]
Gerstein, Mark [3]
Snyder, Michael [2, 3]
Affiliations
[1] Yale Univ, Mol Cellular & Dev Biol Dept, New Haven, CT 06511 USA
[2] Yale Univ, Dept Comp Sci, New Haven, CT 06511 USA
[3] Yale Univ, Dept Biochem & Mol Biophys, New Haven, CT 06511 USA
[4] Yale Univ, Dept Genet, New Haven, CT 06511 USA
DOI
10.1186/gb-2008-9-1-r3
Chinese Library Classification number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Discipline classification codes
071005; 0836; 090102; 100705
Abstract
Background: Recent studies of the mammalian transcriptome have revealed a large number of additional transcribed regions and extraordinary complexity in transcript diversity. However, there is still much uncertainty regarding precisely what portion of the genome is transcribed, the exact structures of these novel transcripts, and the levels of the transcripts produced.
Results: We have interrogated the transcribed loci in 420 selected ENCyclopedia Of DNA Elements (ENCODE) regions using rapid amplification of cDNA ends (RACE) sequencing. We analyzed annotated known gene regions, but primarily we focused on novel transcriptionally active regions (TARs), which were previously identified by high-density oligonucleotide tiling arrays and on random regions that were not believed to be transcribed. We found RACE sequencing to be very sensitive and were able to detect low levels of transcripts in specific cell types that were not detectable by microarrays. We also observed many instances of sense-antisense transcripts; further analysis suggests that many of the antisense transcripts (but not all) may be artifacts generated from the reverse transcription reaction. Our results show that the majority of the novel TARs analyzed (60%) are connected to other novel TARs or known exons. Of previously unannotated random regions, 17% were shown to produce overlapping transcripts. Furthermore, it is estimated that 9% of the novel transcripts encode proteins.
Conclusion: We conclude that RACE sequencing is an efficient, sensitive, and highly accurate method for characterization of the transcriptome of specific cell/tissue types. Using this method, it appears that much of the genome is represented in polyA+ RNA. Moreover, a fraction of the novel RNAs can encode protein and are likely to be functional.
Pages: 14