Understanding transcriptional regulation by integrative analysis of transcription factor binding data

Cited by: 132
Authors
Cheng, Chao [1 ,2 ]
Alexander, Roger [1 ,2 ]
Min, Renqiang [1 ,2 ]
Leng, Jing [2 ]
Yip, Kevin Y. [1 ,2 ,3 ]
Rozowsky, Joel [1 ,2 ]
Yan, Koon-Kiu [1 ,2 ]
Dong, Xianjun [4 ]
Djebali, Sarah [5 ,10 ]
Ruan, Yijun [6 ]
Davis, Carrie A. [7 ]
Carninci, Piero [8 ]
Lassmann, Timo [8 ]
Gingeras, Thomas R. [7 ]
Guigo, Roderic [5 ,10 ]
Birney, Ewan [9 ]
Weng, Zhiping [4 ]
Snyder, Michael [11 ]
Gerstein, Mark [1 ,2 ,12 ]
Affiliations
[1] Yale Univ, Dept Mol Biophys & Biochem, New Haven, CT 06520 USA
[2] Yale Univ, Program Computat Biol & Bioinformat, New Haven, CT 06520 USA
[3] Chinese Univ Hong Kong, Dept Comp Sci & Engn, Shatin, Hong Kong, Peoples R China
[4] Univ Massachusetts, Sch Med, Program Bioinformat & Integrat Biol, Dept Biochem & Mol Pharmacol, Worcester, MA 01655 USA
[5] Ctr Genom Regulat CRG, Barcelona 08003, Spain
[6] Genome Inst Singapore, Singapore 138672, Singapore
[7] Cold Spring Harbor Lab, Cold Spring Harbor, NY 11724 USA
[8] Yokohama Inst, RIKEN Omics Sci Ctr, Yokohama, Kanagawa 2300045, Japan
[9] European Bioinformat Inst EMBL EBI, Hinxton CB10 1SD, Cambs, England
[10] UPF, Barcelona 08003, Spain
[11] Stanford Univ, Dept Genet, Sch Med, Stanford, CA 94305 USA
[12] Yale Univ, Dept Comp Sci, New Haven, CT 06520 USA
Keywords
GENE-EXPRESSION; HUMAN GENOME; CHIP-SEQ; CHROMATIN; ELEMENTS; MICROARRAY; EVOLUTION; PATTERNS;
DOI
10.1101/gr.136838.111
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification codes
071010 ; 081704 ;
Abstract
Statistical models have been used to quantify the relationship between gene expression and transcription factor (TF) binding signals. Here we apply the models to the large-scale data generated by the ENCODE project to study transcriptional regulation by TFs. Our results reveal a notable difference in the prediction accuracy of expression levels of transcription start sites (TSSs) captured by different technologies and RNA extraction protocols. In general, the expression levels of TSSs with high CpG content are more predictable than those with low CpG content. For genes with alternative TSSs, the expression levels of downstream TSSs are more predictable than those of the upstream ones. Different TF categories and specific TFs vary substantially in their contributions to predicting expression. Between two cell lines, the differential expression of TSSs can be precisely reflected by the difference of TF-binding signals in a quantitative manner, arguing against the conventional on-and-off model of TF binding. Finally, we explore the relationships between TF-binding signals and other chromatin features such as histone modifications and DNase hypersensitivity for determining expression. The models imply that these features regulate transcription in a highly coordinated manner.
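The general idea of predicting TSS expression from TF-binding signals can be illustrated with a minimal sketch. This is not the authors' pipeline: the data below are synthetic, the function names (`make_synthetic`, `fit_linear_model`, `predict`) are hypothetical, and an ordinary least-squares fit stands in for whatever models the paper actually uses. It only shows the shape of the task: a matrix of per-TSS binding signals, a vector of expression levels, and prediction accuracy measured as the correlation on held-out TSSs.

```python
import numpy as np

def make_synthetic(n_tss=500, n_tf=8, seed=0):
    """Generate SYNTHETIC data (an assumption, not ENCODE data):
    rows are TSSs, columns are binding signals of hypothetical TFs."""
    rng = np.random.default_rng(seed)
    X = rng.gamma(shape=2.0, scale=1.0, size=(n_tss, n_tf))
    w = np.linspace(-1.0, 1.0, n_tf)          # assumed per-TF effect sizes
    y = X @ w + rng.normal(scale=0.5, size=n_tss)  # stand-in for log expression
    return X, y

def fit_linear_model(X, y):
    """Ordinary least squares with an intercept term."""
    X1 = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef

def predict(coef, X):
    """Apply the fitted coefficients to new binding signals."""
    X1 = np.column_stack([np.ones(len(X)), X])
    return X1 @ coef

X, y = make_synthetic()
train, test = slice(0, 400), slice(400, 500)   # simple hold-out split
coef = fit_linear_model(X[train], y[train])

# Prediction accuracy as the Pearson correlation on held-out TSSs.
r = np.corrcoef(predict(coef, X[test]), y[test])[0, 1]
print(f"held-out Pearson r = {r:.2f}")
```

Because the model treats binding signal as a continuous predictor, a difference in signal between two conditions maps to a proportional difference in predicted expression, which is the kind of quantitative relationship the abstract contrasts with an on-and-off view of binding.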
Pages: 1658-1667
Number of pages: 10
Related papers
54 in total
[1]  
[Anonymous], NATURE IN PRESS
[2]   Structure and evolution of transcriptional regulatory networks [J].
Babu, MM ;
Luscombe, NM ;
Aravind, L ;
Gerstein, M ;
Teichmann, SA .
CURRENT OPINION IN STRUCTURAL BIOLOGY, 2004, 14 (03) :283-291
[3]   Animal Transcription Networks as Highly Connected, Quantitative Continua [J].
Biggin, Mark D. .
DEVELOPMENTAL CELL, 2011, 21 (04) :611-626
[4]   Random forests [J].
Breiman, L .
MACHINE LEARNING, 2001, 45 (01) :5-32
[5]   CpG methylation as a mechanism for the regulation of E2F activity [J].
Campanero, MR ;
Armstrong, MI ;
Flemington, EK .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2000, 97 (12) :6481-6486
[6]   Modeling the relative relationship of transcription factor binding and histone modifications to gene expression levels in mouse embryonic stem cells [J].
Cheng, Chao ;
Gerstein, Mark .
NUCLEIC ACIDS RESEARCH, 2012, 40 (02) :553-568
[7]   TIP: A probabilistic method for identifying transcription factor target genes from ChIP-seq binding profiles [J].
Cheng, Chao ;
Min, Renqiang ;
Gerstein, Mark .
BIOINFORMATICS, 2011, 27 (23) :3221-3227
[8]   A statistical framework for modeling gene expression using chromatin features and application to modENCODE datasets [J].
Cheng, Chao ;
Yan, Koon-Kiu ;
Yip, Kevin Y. ;
Rozowsky, Joel ;
Alexander, Roger ;
Shou, Chong ;
Gerstein, Mark .
GENOME BIOLOGY, 2011, 12 (02)
[9]   Integrating regulatory motif discovery and genome-wide expression analysis [J].
Conlon, EM ;
Liu, XS ;
Lieb, JD ;
Liu, JS .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2003, 100 (06) :3339-3344
[10]   The functional consequences of alternative promoter use in mammalian genomes [J].
Davuluri, Ramana V. ;
Suzuki, Yutaka ;
Sugano, Sumio ;
Plass, Christoph ;
Huang, Tim H. -M. .
TRENDS IN GENETICS, 2008, 24 (04) :167-177