Predicting tissue-specific enhancers in the human genome

Cited by: 108
Authors
Pennacchio, Len A.
Loots, Gabriela G.
Nobrega, Marcelo A.
Ovcharenko, Ivan [1]
Affiliations
[1] Joint Genome Inst, US Dept Energy, Walnut Creek, CA 94598 USA
[2] Lawrence Berkeley Natl Lab, Div Genom, Berkeley, CA 94720 USA
[3] Lawrence Livermore Natl Lab, Div Biosci & Biotechnol, Livermore, CA 94550 USA
[4] Univ Chicago, Dept Human Genet, Chicago, IL 60637 USA
[5] Lawrence Livermore Natl Lab, Computat Directorate, Livermore, CA 94550 USA
DOI
10.1101/gr.5972507
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010; 081704;
Abstract
Determining how transcriptional regulatory signals are encoded in vertebrate genomes is essential for understanding the origins of multicellular complexity; yet the genetic code of vertebrate gene regulation remains poorly understood. In an attempt to elucidate this code, we synergistically combined genome-wide gene-expression profiling, vertebrate genome comparisons, and transcription factor binding-site analysis to define sequence signatures characteristic of candidate tissue-specific enhancers in the human genome. We applied this strategy to microarray-based gene expression profiles from 79 human tissues and identified 7187 candidate enhancers that defined their flanking gene expression, the majority of which were located outside of known promoters. We cross-validated this method for its ability to de novo predict tissue-specific gene expression and confirmed its reliability in 57 of the 79 available human tissues, with an average precision in enhancer recognition ranging from 32% to 63% and a sensitivity of 47%. We used the sequence signatures identified by this approach to successfully assign tissue-specific predictions to ~328,000 human-mouse conserved noncoding elements in the human genome. By overlapping these genome-wide predictions with a data set of enhancers validated in vivo, in transgenic mice, we were able to confirm our results with a 28% sensitivity and 50% precision. These results indicate the power of combining complementary genomic data sets as an initial computational foray into a global view of tissue-specific gene regulation in vertebrates.
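The abstract reports precision and sensitivity for the overlap between predicted and validated enhancers. As a reading aid, the following is a minimal sketch of how those two metrics are computed, assuming elements are matched by identifier. The paper itself overlaps genomic regions, and the function name and toy identifiers below are hypothetical, not data or code from the study.

```python
def precision_and_sensitivity(predicted, validated):
    """Return (precision, sensitivity) for a set of predictions.

    precision   = true positives / all predicted elements
    sensitivity = true positives / all validated elements
    """
    predicted, validated = set(predicted), set(validated)
    if not predicted or not validated:
        raise ValueError("both sets must be non-empty")
    true_positives = len(predicted & validated)
    return true_positives / len(predicted), true_positives / len(validated)


# Hypothetical toy identifiers for illustration only (not from the paper).
predicted_elements = {"e1", "e2", "e3", "e4"}
validated_elements = {"e2", "e3", "e5", "e6", "e7"}

precision, sensitivity = precision_and_sensitivity(
    predicted_elements, validated_elements
)
print(f"precision={precision:.2f} sensitivity={sensitivity:.2f}")
```

With real data, matching would need an interval-overlap criterion rather than exact identifiers, and that criterion is not specified in the abstract.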
Pages: 201-211
Page count: 11