Epigenetic priors for identifying active transcription factor binding sites

Times cited: 81
Authors
Cuellar-Partida, Gabriel [1]
Buske, Fabian A. [1]
McLeay, Robert C. [1]
Whitington, Tom [1]
Noble, William Stafford [2, 3]
Bailey, Timothy L. [1]
Affiliations
[1] Univ Queensland, Inst Mol Biosci, Brisbane, Qld 4072, Australia
[2] Univ Washington, Dept Genome Sci, Seattle, WA 98195 USA
[3] Univ Washington, Dept Comp Sci & Engn, Seattle, WA 98195 USA
Funding
National Institutes of Health (USA)
Keywords
CIS-REGULATORY MODULES; I HYPERSENSITIVE SITES; CHROMATIN SIGNATURES; 5' ENDS; PREDICTION; PROMOTERS; ENHANCERS; MOTIFS; GENES
DOI
10.1093/bioinformatics/btr614
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Accurate knowledge of the genome-wide binding of transcription factors in a particular cell type or under a particular condition is necessary for understanding transcriptional regulation. Using epigenetic data such as histone modification and DNase I accessibility data has been shown to improve motif-based in silico methods for predicting such binding, but this approach has not yet been fully explored.
Results: We describe a probabilistic method for combining one or more tracks of epigenetic data with a standard DNA sequence motif model to improve our ability to identify active transcription factor binding sites (TFBSs). We convert each data type into a position-specific probabilistic prior and combine these priors with a traditional probabilistic motif model to compute a log-posterior odds score. Our experiments, using histone modifications H3K4me1, H3K4me3, H3K9ac and H3K27ac, as well as DNase I sensitivity, show conclusively that the log-posterior odds score consistently outperforms a simple binary filter based on the same data. We also show that our approach performs competitively with a more complex method, CENTIPEDE, and suggest that the relative simplicity of the log-posterior odds scoring method makes it an appealing and very general method for identifying functional TFBSs on the basis of DNA and epigenetic evidence.
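The scoring idea in the abstract can be illustrated with a minimal sketch. It assumes the usual Bayesian form, in which the log-posterior odds of a candidate site equals the motif log-likelihood ratio plus the log-odds of the position-specific prior; the exact formulation used by the authors is not given in this record. The function name, the toy three-column motif, the uniform background and the prior values below are all hypothetical, chosen only for illustration.

```python
import math


def log_posterior_odds(site, pwm, background, prior):
    """Score one candidate site (illustrative sketch, not the authors' code).

    site       -- DNA string with the same length as the motif
    pwm        -- list of dicts, one per motif column, mapping base -> probability
    background -- dict mapping base -> background probability
    prior      -- assumed prior probability (0 < prior < 1) that a site
                  starting at this position is bound, e.g. derived from
                  an epigenetic data track
    """
    if len(site) != len(pwm):
        raise ValueError("site length must match motif width")
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")
    # Standard motif log-likelihood ratio: motif model versus background.
    llr = sum(math.log2(col[base] / background[base])
              for base, col in zip(site, pwm))
    # Adding the prior log-odds turns the likelihood ratio into posterior odds.
    return llr + math.log2(prior / (1.0 - prior))


# Hypothetical toy motif that prefers the sequence "ACG".
PWM = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
]
BACKGROUND = {base: 0.25 for base in "ACGT"}

# The same sequence scored under a high prior (e.g. accessible chromatin)
# and under a low prior (e.g. inaccessible chromatin).
open_score = log_posterior_odds("ACG", PWM, BACKGROUND, prior=0.1)
closed_score = log_posterior_odds("ACG", PWM, BACKGROUND, prior=0.001)
```

Under this formulation, an identical sequence match scores higher where the prior is larger, whereas a binary filter would only keep or discard the site. With a prior of 0.5 the prior term vanishes and the score reduces to the plain motif log-likelihood ratio.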
Pages: 56-62
Number of pages: 7
Related papers
29 in total
[1]   [Anonymous], 2000, Pattern Classification
[2]   Searching for statistically significant regulatory modules [J].
Bailey, Timothy L. ;
Noble, William Stafford .
BIOINFORMATICS, 2003, 19 :II16-II25
[3]   High-resolution profiling of histone methylations in the human genome [J].
Barski, Artem ;
Cuddapah, Suresh ;
Cui, Kairong ;
Roh, Tae-Young ;
Schones, Dustin E. ;
Wang, Zhibin ;
Wei, Gang ;
Chepelev, Iouri ;
Zhao, Keji .
CELL, 2007, 129 (04) :823-837
[4]   Distant conserved sequences flanking endothelial-specific promoters contain tissue-specific DNase-hypersensitive sites and over-represented motifs [J].
Bernat, John A. ;
Crawford, Gregory E. ;
Ogurtsov, Aleksey Y. ;
Collins, Francis S. ;
Ginsburg, David ;
Kondrashov, Alexey S. .
HUMAN MOLECULAR GENETICS, 2006, 15 (13) :2098-2105
[5]   High-resolution genome-wide in vivo footprinting of diverse transcription factors in human cells [J].
Boyle, Alan P. ;
Song, Lingyun ;
Lee, Bum-Kyu ;
London, Darin ;
Keefe, Damian ;
Birney, Ewan ;
Iyer, Vishwanath R. ;
Crawford, Gregory E. ;
Furey, Terrence S. .
GENOME RESEARCH, 2011, 21 (03) :456-464
[6]   DNase-chip: a high-resolution method to identify DNase I hypersensitive sites using tiled microarrays [J].
Crawford, Gregory E. ;
Davis, Sean ;
Scacheri, Peter C. ;
Renaud, Gabriel ;
Halawi, Mohamad J. ;
Erdos, Michael R. ;
Green, Roland ;
Meltzer, Paul S. ;
Wolfsberg, Tyra G. ;
Collins, Francis S. .
NATURE METHODS, 2006, 3 (07) :503-509
[7]   Chromatin Signatures in Multipotent Human Hematopoietic Stem Cells Indicate the Fate of Bivalent Genes during Differentiation [J].
Cui, Kairong ;
Zang, Chongzhi ;
Roh, Tae-Young ;
Schones, Dustin E. ;
Childs, Richard W. ;
Peng, Weiqun ;
Zhao, Keji .
CELL STEM CELL, 2009, 4 (01) :80-93
[8]   Integrating multiple evidence sources to predict transcription factor binding in the human genome [J].
Ernst, Jason ;
Plasterer, Heather L. ;
Simon, Itamar ;
Bar-Joseph, Ziv .
GENOME RESEARCH, 2010, 20 (04) :526-536
[9]   Finding regulatory DNA motifs using alignment-free evolutionary conservation information [J].
Gordan, Raluca ;
Narlikar, Leelavati ;
Hartemink, Alexander J. .
NUCLEIC ACIDS RESEARCH, 2010, 38 (06) :e90.1-e90.12
[10]   FIMO: scanning for occurrences of a given motif [J].
Grant, Charles E. ;
Bailey, Timothy L. ;
Noble, William Stafford .
BIOINFORMATICS, 2011, 27 (07) :1017-1018