Integrating Diverse Datasets Improves Developmental Enhancer Prediction

Cited by: 148
Authors
Erwin, Genevieve D. [1, 2]
Oksenberg, Nir [2, 3]
Truty, Rebecca M. [1]
Kostka, Dennis [4, 5]
Murphy, Karl K. [2, 3]
Ahituv, Nadav [2, 3]
Pollard, Katherine S. [1, 2, 6]
Capra, John A. [7, 8]
Affiliations
[1] Gladstone Inst Cardiovasc Dis, San Francisco, CA 94158 USA
[2] Univ Calif San Francisco, Inst Human Genet, San Francisco, CA 94143 USA
[3] Univ Calif San Francisco, Dept Bioengn & Therapeut Sci, San Francisco, CA 94143 USA
[4] Univ Pittsburgh, Dept Dev Biol, Pittsburgh, PA USA
[5] Univ Pittsburgh, Dept Computat & Syst Biol, Pittsburgh, PA USA
[6] Univ Calif San Francisco, Dept Epidemiol & Biostat, San Francisco, CA 94143 USA
[7] Vanderbilt Univ, Ctr Human Genet Res, Nashville, TN 37235 USA
[8] Vanderbilt Univ, Dept Biomed Informat, Nashville, TN 37235 USA
Keywords
TRANSCRIPTION FACTOR-BINDING; GENOME-WIDE DISCOVERY; CHROMATIN STATE; CHIP-SEQ; HISTONE MODIFICATIONS; GENE-EXPRESSION; SIGNATURES; SEQUENCE; ELEMENTS; FOXC1
DOI
10.1371/journal.pcbi.1003677
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
070307 [Chemical Biology]
Abstract
Gene-regulatory enhancers have been identified using various approaches, including evolutionary conservation, regulatory protein binding, chromatin modifications, and DNA sequence motifs. To integrate these different approaches, we developed EnhancerFinder, a two-step method for distinguishing developmental enhancers from the genomic background and then predicting their tissue specificity. EnhancerFinder uses a multiple kernel learning approach to integrate DNA sequence motifs, evolutionary patterns, and diverse functional genomics datasets from a variety of cell types. In contrast with prediction approaches that define enhancers based on histone marks or p300 sites from a single cell line, we trained EnhancerFinder on hundreds of experimentally verified human developmental enhancers from the VISTA Enhancer Browser. We comprehensively evaluated EnhancerFinder using cross validation and found that our integrative method improves the identification of enhancers over approaches that consider a single type of data, such as sequence motifs, evolutionary conservation, or the binding of enhancer-associated proteins. We find that VISTA enhancers active in embryonic heart are easier to identify than enhancers active in several other embryonic tissues, likely due to their uniquely high GC content. We applied EnhancerFinder to the entire human genome and predicted 84,301 developmental enhancers and their tissue specificity. These predictions provide specific functional annotations for large amounts of human non-coding DNA, and are significantly enriched near genes with annotated roles in their predicted tissues and lead SNPs from genome-wide association studies. We demonstrate the utility of EnhancerFinder predictions through in vivo validation of novel embryonic gene regulatory enhancers from three developmental transcription factor loci. 
Our genome-wide developmental enhancer predictions are freely available as a UCSC Genome Browser track, which we hope will enable researchers to further investigate questions in developmental biology.
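The kernel-combination idea behind multiple kernel learning, as mentioned in the abstract, can be illustrated with a minimal sketch. This is not the EnhancerFinder implementation: the feature values, the choice of linear kernels, the helper names, and the fixed weights below are all hypothetical, and an actual multiple kernel learning method would learn the weights together with the classifier rather than fix them by hand.

```python
# Illustrative sketch only: combine one kernel per data type into a single
# kernel as a non-negative weighted sum. All feature values and weights are
# made up for illustration; they are not taken from the paper.

def linear_kernel(features):
    """Return the Gram matrix of dot products between rows of `features`."""
    return [[sum(a * b for a, b in zip(x, y)) for y in features]
            for x in features]

def combine_kernels(kernels, weights):
    """Return the weighted sum of equally sized square kernel matrices."""
    if len(kernels) != len(weights):
        raise ValueError("need exactly one weight per kernel")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    n = len(kernels[0])
    return [[sum(w * k[i][j] for k, w in zip(kernels, weights))
             for j in range(n)]
            for i in range(n)]

# Hypothetical features for three candidate regions, one table per data type.
motif_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
conservation_features = [[0.9], [0.1], [0.5]]
chromatin_features = [[2.0, 1.0], [0.0, 0.0], [1.0, 1.0]]

kernels = [linear_kernel(f) for f in
           (motif_features, conservation_features, chromatin_features)]
weights = [0.5, 0.25, 0.25]  # fixed by hand here; learned in real MKL
combined = combine_kernels(kernels, weights)
```

Because a non-negative weighted sum of valid kernels is itself a valid kernel, a matrix like `combined` could be handed to any kernel classifier that accepts a precomputed kernel.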
Pages: 20