Epigenetic priors for identifying active transcription factor binding sites

Cited by: 81
Authors
Cuellar-Partida, Gabriel [1]
Buske, Fabian A. [1]
McLeay, Robert C. [1]
Whitington, Tom [1]
Noble, William Stafford [2,3]
Bailey, Timothy L. [1]
Affiliations
[1] Univ Queensland, Inst Mol Biosci, Brisbane, Qld 4072, Australia
[2] Univ Washington, Dept Genome Sci, Seattle, WA 98195 USA
[3] Univ Washington, Dept Comp Sci & Engn, Seattle, WA 98195 USA
Funding
National Institutes of Health (USA);
Keywords
CIS-REGULATORY MODULES; I HYPERSENSITIVE SITES; CHROMATIN SIGNATURES; 5' ENDS; PREDICTION; PROMOTERS; ENHANCERS; MOTIFS; GENES;
DOI
10.1093/bioinformatics/btr614
Chinese Library Classification
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
Motivation: Accurate knowledge of the genome-wide binding of transcription factors in a particular cell type or under a particular condition is necessary for understanding transcriptional regulation. Using epigenetic data, such as histone modification and DNase I accessibility data, has been shown to improve motif-based in silico methods for predicting such binding, but this approach has not yet been fully explored.
Results: We describe a probabilistic method for combining one or more tracks of epigenetic data with a standard DNA sequence motif model to improve our ability to identify active transcription factor binding sites (TFBSs). We convert each data type into a position-specific probabilistic prior and combine these priors with a traditional probabilistic motif model to compute a log-posterior odds score. Our experiments, using histone modifications H3K4me1, H3K4me3, H3K9ac and H3K27ac, as well as DNase I sensitivity, show conclusively that the log-posterior odds score consistently outperforms a simple binary filter based on the same data. We also show that our approach performs competitively with a more complex method, CENTIPEDE, and suggest that the relative simplicity of the log-posterior odds scoring method makes it an appealing and very general method for identifying functional TFBSs on the basis of DNA and epigenetic evidence.
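The scoring idea summarized in the abstract can be illustrated with a minimal sketch. It assumes the log-posterior odds decompose into the usual motif log-likelihood ratio plus the log odds of a position-specific prior; the abstract does not give the exact formula, so this is an illustration of the general technique and not the authors' implementation. The function names (`log_posterior_odds`, `scan`), the toy motif, and the prior values are all hypothetical.

```python
import math

def log_posterior_odds(site, pwm, background, prior):
    """Score one candidate site (assumed form, not taken from the paper).

    site:       DNA string with the same width as the motif
    pwm:        list of dicts mapping base -> probability, one per motif column
    background: dict mapping base -> background probability
    prior:      prior probability in (0, 1) that a site starts here,
                e.g. derived from an epigenetic track (hypothetical input)
    """
    if len(site) != len(pwm):
        raise ValueError("site width must equal motif width")
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")
    # Sequence evidence: log-likelihood ratio of motif model vs. background.
    llr = sum(math.log2(col[base] / background[base])
              for base, col in zip(site, pwm))
    # Epigenetic evidence: log odds of the position-specific prior.
    return llr + math.log2(prior / (1.0 - prior))

def scan(sequence, pwm, background, priors):
    """Score every start position; priors[i] is the prior for start i."""
    width = len(pwm)
    n_starts = len(sequence) - width + 1
    return [log_posterior_odds(sequence[i:i + width], pwm, background, priors[i])
            for i in range(n_starts)]

# Toy two-column motif preferring "AC", with a uniform background.
toy_pwm = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
]
uniform_bg = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
toy_scores = scan("ACGT", toy_pwm, uniform_bg, [0.5, 0.5, 0.5])
```

Unlike a binary filter, which discards a candidate site outright when the epigenetic signal falls below a threshold, a score of this form lets a strong motif match partly compensate for a weak prior, and the reverse.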
Pages: 56-62
Page count: 7
Related papers
29 in total
[21]   Accurate inference of transcription factor binding from DNA sequence and chromatin accessibility data [J].
Pique-Regi, Roger ;
Degner, Jacob F. ;
Pai, Athma A. ;
Gaffney, Daniel J. ;
Gilad, Yoav ;
Pritchard, Jonathan K. .
GENOME RESEARCH, 2011, 21 (03) :447-455
[22]   Genome-wide profiles of STAT1 DNA association using chromatin immunoprecipitation and massively parallel sequencing [J].
Robertson, Gordon ;
Hirst, Martin ;
Bainbridge, Matthew ;
Bilenky, Misha ;
Zhao, Yongjun ;
Zeng, Thomas ;
Euskirchen, Ghia ;
Bernier, Bridget ;
Varhol, Richard ;
Delaney, Allen ;
Thiessen, Nina ;
Griffith, Obi L. ;
He, Ann ;
Marra, Marco ;
Snyder, Michael ;
Jones, Steven .
NATURE METHODS, 2007, 4 (08) :651-657
[23]   Systematic functional characterization of cis-regulatory motifs in human core promoters [J].
Sinha, Saurabh ;
Adler, Adam S. ;
Field, Yair ;
Chang, Howard Y. ;
Segal, Eran .
GENOME RESEARCH, 2008, 18 (03) :477-488
[24]   MEASURING THE ACCURACY OF DIAGNOSTIC SYSTEMS [J].
SWETS, JA .
SCIENCE, 1988, 240 (4857) :1285-1293
[25]   High-throughput chromatin information enables accurate tissue-specific prediction of transcription factor binding sites [J].
Whitington, Tom ;
Perkins, Andrew C. ;
Bailey, Timothy L. .
NUCLEIC ACIDS RESEARCH, 2009, 37 (01) :14-25
[26]   Genome-wide prediction of transcription factor binding sites using an integrated model [J].
Won, Kyoung-Jae ;
Ren, Bing ;
Wang, Wei .
GENOME BIOLOGY, 2010, 11 (01)
[27]   An Integrated Approach to Identifying Cis-Regulatory Modules in the Human Genome [J].
Won, Kyoung-Jae ;
Agarwal, Saurabh ;
Shen, Li ;
Shoemaker, Robert ;
Ren, Bing ;
Wang, Wei .
PLOS ONE, 2009, 4 (05)
[28]   THE 5' ENDS OF DROSOPHILA HEAT-SHOCK GENES IN CHROMATIN ARE HYPERSENSITIVE TO DNASE-I [J].
WU, C .
NATURE, 1980, 286 (5776) :854-860
[29]   CisModule: De novo discovery of cis-regulatory modules by hierarchical mixture modeling [J].
Zhou, Q ;
Wong, WH .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2004, 101 (33) :12114-12119