Clustering of DNA sequences in human promoters

Cited by: 176
Authors
FitzGerald, PC
Shlyakhtenko, A
Mir, AA
Vinson, C [1]
Affiliations
[1] NCI, Genome Anal Unit, NIH, Bethesda, MD 20892 USA
[2] NCI, Lab Metab, NIH, Bethesda, MD 20892 USA
DOI
10.1101/gr.1953904
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
We have determined the distribution of each of the 65,536 DNA sequences that are eight bases long (8-mer) in a set of 13,010 human genomic promoter sequences aligned relative to the putative transcription start site (TSS). A limited number of 8-mers have peaks in their distribution (cluster), and most cluster within 100 bp of the TSS. The 156 DNA sequences exhibiting the greatest statistically significant clustering near the TSS can be placed into nine groups of related sequences. Each group is defined by a consensus sequence, and seven of these consensus sequences are known binding sites for the transcription factors (TFs) SP1, NF-Y, ETS, CREB, TBP, USF, and NRF-1. One sequence, which we named Clus1, is not a known TF binding site. The ninth sequence group is composed of the strand-specific Kozak sequence that clusters downstream of the TSS. An examination of the co-occurrence of these TF consensus sequences indicates a positive correlation for most of them except for sequences bound by TBP (the TATA box). Human mRNA expression data from 29 tissues indicate that the ETS, NRF-1, and Clus1 sequences that cluster are predominantly found in the promoters of housekeeping genes (e.g., ribosomal genes). In contrast, TATA is more abundant in the promoters of tissue-specific genes. This analysis identified eight DNA sequences in 5082 promoters that we suggest are important for regulating gene expression.
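The positional counting described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names `kmer_position_counts` and `peak_bin`, the `tss_index` parameter, the fixed-width binning, and the toy sequences are all assumptions made for illustration, and no statistical test for clustering is included. With k = 8 there are 4^8 = 65,536 possible k-mers, which matches the count given in the abstract.

```python
from collections import defaultdict


def kmer_position_counts(promoters, tss_index, k=8):
    """Count how often each k-mer starts at each TSS-relative position.

    promoters: DNA strings aligned so that index `tss_index` is the TSS
               (hypothetical input layout, assumed for this sketch).
    Returns a mapping {kmer: {relative_position: count}}.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for seq in promoters:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Skip windows containing N or any other non-ACGT symbol.
            if set(kmer) <= set("ACGT"):
                counts[kmer][i - tss_index] += 1
    return counts


def peak_bin(position_counts, bin_width=20):
    """Return (bin_start, count) for the most populated fixed-width bin.

    The bin width is an illustrative choice, not a value from the source.
    """
    bins = defaultdict(int)
    for pos, n in position_counts.items():
        bins[(pos // bin_width) * bin_width] += n
    return max(bins.items(), key=lambda item: item[1])


# Toy usage with k=4 on two short made-up sequences (TSS assumed at index 5).
toy = ["GGTATAAAGG", "CCTATAAACC"]
toy_counts = kmer_position_counts(toy, tss_index=5, k=4)
print(dict(toy_counts["TATA"]))
print(peak_bin(toy_counts["TATA"], bin_width=5))
```

A k-mer whose counts concentrate in one bin relative to the others would be a candidate for "clustering" in the abstract's sense; deciding whether such a peak is statistically significant would need a separate test that this sketch does not attempt.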
Pages: 1562-1574
Page count: 13