Promoter features related to tissue specificity as measured by Shannon entropy

Cited by: 346
Authors
Schug, J [1]
Schuller, WP
Kappen, C
Salbaum, JM
Bucan, M
Stoeckert, CJ
Affiliations
[1] Univ Penn, Ctr Bioinformat, Philadelphia, PA 19104 USA
[2] Univ Nebraska, Med Ctr, Dept Genet Cell Biol & Anat, Omaha, NE 68198 USA
[3] Univ Penn, Dept Genet, Philadelphia, PA 19104 USA
DOI
10.1186/gb-2005-6-4-r33
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Discipline classification codes
071005; 0836; 090102; 100705
Abstract
Background: The regulatory mechanisms underlying tissue specificity are a crucial part of the development and maintenance of multicellular organisms. A genome-wide analysis of promoters in the context of gene-expression patterns in tissue surveys provides a means of identifying the general principles for these mechanisms.
Results: We introduce a definition of tissue specificity based on Shannon entropy to rank human genes according to their overall tissue specificity and by their specificity to particular tissues. We apply our definition to microarray-based and expressed sequence tag (EST)-based expression data for human genes and use similar data for mouse genes to validate our results. We show that most genes show statistically significant tissue-dependent variations in expression level. We find that the most tissue-specific genes typically have a TATA box, no CpG island, and often code for extracellular proteins. As expected, CpG islands are found in most of the least tissue-specific genes, which often code for proteins located in the nucleus or mitochondrion. The class of genes with no CpG island or TATA box are the most common mid-specificity genes and commonly code for proteins located in a membrane. Sp1 was found to be a weak indicator of less-specific expression. YY1 binding sites, either as initiators or as downstream sites, were strongly associated with the least-specific genes.
Conclusions: We have begun to understand the components of promoters that distinguish tissue-specific from ubiquitous genes, to identify associations that can predict the broad class of gene expression from sequence data alone.
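The abstract describes ranking genes by Shannon entropy but does not give the formulas. As a rough illustration only, the sketch below assumes one common formulation: a gene's expression levels across tissues are normalized to a probability distribution, overall specificity is the entropy of that distribution (low entropy means more tissue-specific), and a per-tissue score adds the negative log of that tissue's share. The function names, the per-tissue score, and the example tissue names are assumptions made for illustration, not details taken from this record.

```python
import math


def relative_expression(levels):
    """Normalize a {tissue: expression level} mapping so the values sum to 1."""
    total = sum(levels.values())
    if total <= 0:
        raise ValueError("expression levels must sum to a positive value")
    return {tissue: value / total for tissue, value in levels.items()}


def shannon_entropy(levels):
    """Entropy in bits of a gene's relative expression across tissues.

    0 bits means expression in a single tissue; log2(number of tissues)
    means perfectly uniform expression.
    """
    shares = relative_expression(levels)
    return -sum(p * math.log2(p) for p in shares.values() if p > 0)


def tissue_score(levels, tissue):
    """Assumed per-tissue score: entropy minus log2 of the tissue's share.

    Lower values suggest the gene is more specific to that tissue.
    """
    share = relative_expression(levels)[tissue]
    if share == 0:
        return math.inf  # gene is not expressed in this tissue
    return shannon_entropy(levels) - math.log2(share)


# Hypothetical expression profiles for two genes across four tissues.
ubiquitous = {"liver": 10.0, "brain": 10.0, "heart": 10.0, "kidney": 10.0}
specific = {"liver": 40.0, "brain": 0.0, "heart": 0.0, "kidney": 0.0}

print(shannon_entropy(ubiquitous))      # uniform profile: maximal entropy
print(shannon_entropy(specific))        # single-tissue profile: zero entropy
print(tissue_score(specific, "liver"))  # lowest possible score for liver
```

Sorting genes by `shannon_entropy` in ascending order would then place the most tissue-specific genes first under these assumptions.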
Pages: 24