Prediction of RNA Polymerase II recruitment, elongation and stalling from histone modification data

Cited by: 26
Authors
Chen, Yun [1, 2]
Jorgensen, Mette [1, 2]
Kolde, Raivo [3, 4]
Zhao, Xiaobei [1, 2]
Parker, Brian [1, 2]
Valen, Eivind [1, 2]
Wen, Jiayu [1, 2]
Sandelin, Albin [1, 2]
Affiliations
[1] Univ Copenhagen, Bioinformat Ctr, Dept Biol, DK-2200 Copenhagen, Denmark
[2] Univ Copenhagen, Biotech Res & Innovat Ctr, DK-2200 Copenhagen, Denmark
[3] Univ Tartu, Inst Comp Sci, EE-50409 Tartu, Estonia
[4] Quretec, EE-51003 Tartu, Estonia
Source
BMC GENOMICS | 2011 / Vol. 12
Funding
European Research Council;
Keywords
GENOME-WIDE ANALYSIS; TRANSCRIPTION INITIATION; ACTIVE PROMOTERS; CELL-LINE; CHROMATIN; METHYLATION; H3; ACETYLATION; DISTINCT; PAUSE;
DOI
10.1186/1471-2164-12-544
Chinese Library Classification number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification codes
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Background: Initiation and elongation of RNA polymerase II (RNAPII) transcription are regulated by both DNA sequence and chromatin signals. Recent breakthroughs make it possible to measure the chromatin state and activity of core promoters genome-wide, but dedicated computational strategies are needed to progress from descriptive annotation of data to quantitative, predictive models.
Results: Here, we describe a computational framework that considers both quantitative and spatial features of histone modifications around the transcription start site (TSS). With high accuracy, it can predict the locations of core promoters, the amount of RNAPII recruited at the promoter, the amount of elongating RNAPII in the gene body, the mRNA production originating from the promoter, and the stalling characteristics of RNAPII. Because the framework can also pinpoint the signals that are most influential for prediction, it can be used to infer underlying regulatory biology. For example, we show that the H3K4 di- and tri-methylation signals are strongly predictive of promoter location, while the acetylation marks H3K9 and H3K27 are highly important in estimating promoter usage. All four of these marks are found to be necessary for recruitment of RNAPII but not sufficient for elongation. We also show that the spatial distributions of histone marks are almost as predictive as the signal strength, and that a set of histone marks immediately downstream of the TSS is highly predictive of RNAPII stalling.
Conclusions: In this study we introduce a general framework to accurately predict the level of RNAPII recruitment, elongation, stalling and mRNA expression from chromatin signals. The versatility of the method also makes it well suited to investigating other genomic data.
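The abstract distinguishes the quantitative feature of a histone mark (signal strength) from its spatial feature (how the signal is distributed around the TSS). The following is a minimal sketch of how such a split could be computed from mapped read positions. It is not the authors' implementation: the function names `binned_profile` and `strength_and_shape`, the window size, and the bin size are all assumptions chosen for illustration.

```python
from typing import List, Sequence, Tuple


def binned_profile(read_positions: Sequence[int], tss: int, strand: str = "+",
                   flank: int = 2000, bin_size: int = 200) -> List[int]:
    """Count reads per bin in a window of +/- flank bp around a TSS.

    Bins run from upstream to downstream relative to the direction of
    transcription, so minus-strand promoters are flipped.
    The flank and bin_size defaults are illustrative assumptions.
    """
    n_bins = (2 * flank) // bin_size
    counts = [0] * n_bins
    for pos in read_positions:
        # Signed distance from the TSS in transcript orientation.
        offset = pos - tss if strand == "+" else tss - pos
        if -flank <= offset < flank:
            counts[(offset + flank) // bin_size] += 1
    return counts


def strength_and_shape(counts: Sequence[int]) -> Tuple[int, List[float]]:
    """Split a binned profile into total signal and a normalized shape.

    The total is the quantitative feature; the shape (fractions summing to 1)
    keeps only the spatial distribution, independent of signal strength.
    """
    total = sum(counts)
    if total == 0:
        return 0, [0.0] * len(counts)
    return total, [c / total for c in counts]


if __name__ == "__main__":
    # Hypothetical read positions for one histone mark near a TSS at 1000.
    reads = [900, 1000, 1050, 1100, 1300]
    profile = binned_profile(reads, tss=1000, flank=400, bin_size=200)
    total, shape = strength_and_shape(profile)
    print(profile, total, shape)
```

Features of this kind, computed per histone mark, could then be passed to any supervised learner. Keeping strength and shape separate is what makes it possible to ask how predictive the spatial distribution is on its own.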
Pages: 16