Integrative annotation of chromatin elements from ENCODE data

Cited by: 352
Authors
Hoffman, Michael M. [1]
Ernst, Jason [2,3]
Wilder, Steven P. [4]
Kundaje, Anshul [5]
Harris, Robert S. [6]
Libbrecht, Max [1,7]
Giardine, Belinda [6]
Ellenbogen, Paul M. [1,7]
Bilmes, Jeffrey A. [8]
Birney, Ewan [4]
Hardison, Ross C. [6]
Dunham, Ian [4]
Kellis, Manolis [2,3]
Noble, William Stafford [1,7]
Affiliations
[1] Univ Washington, Dept Genome Sci, Seattle, WA 98195 USA
[2] MIT, Comp Sci & Artificial Intelligence Lab, Cambridge, MA 02139 USA
[3] Broad Inst MIT & Harvard, Cambridge Ctr 7, Cambridge, MA 02142 USA
[4] EMBL European Bioinformat Inst, Cambridge CB10 1SD, England
[5] Stanford Univ, Dept Comp Sci, Stanford, CA 94305 USA
[6] Penn State Univ, Ctr Comparat Genom & Bioinformat, University Pk, PA 16802 USA
[7] Univ Washington, Dept Comp Sci & Engn, Seattle, WA 98195 USA
[8] Univ Washington, Dept Elect Engn, Seattle, WA 98195 USA
Funding
National Science Foundation (USA); National Institutes of Health (USA)
Keywords
GAMMA-GLOBIN GENE; HEREDITARY PERSISTENCE; FETAL-HEMOGLOBIN; TRANSCRIPTION; SUSCEPTIBILITY; DISCOVERY; ASSOCIATION; BREAKPOINT; EXPRESSION; VERTEBRATE;
DOI
10.1093/nar/gks1284
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
The ENCODE Project has generated a wealth of experimental information mapping diverse chromatin properties in several human cell lines. Although each such data track is independently informative toward the annotation of regulatory elements, their interrelations contain much richer information for the systematic annotation of regulatory elements. To uncover these interrelations and to generate an interpretable summary of the massive datasets of the ENCODE Project, we apply unsupervised learning methodologies, converting dozens of chromatin datasets into discrete annotation maps of regulatory regions and other chromatin elements across the human genome. These methods rediscover and summarize diverse aspects of chromatin architecture, elucidate the interplay between chromatin activity and RNA transcription, and reveal that a large proportion of the genome lies in a quiescent state, even across multiple cell types. The resulting annotations of non-coding regulatory elements correlate strongly with mammalian evolutionary constraint and provide an unbiased approach for evaluating metrics of evolutionary constraint in human. Lastly, we use the regulatory annotations to revisit previously uncharacterized disease-associated loci, resulting in focused, testable hypotheses through the lens of the chromatin landscape.
Pages: 827-841
Page count: 15