Genome-wide analysis of core promoter elements from conserved human and mouse orthologous pairs

Cited by: 62
Authors
Jin, Victor X. [1 ]
Singer, Gregory A. C. [1 ]
Agosto-Perez, Francisco J. [1 ]
Liyanarachchi, Sandya [1 ]
Davuluri, Ramana V. [1 ]
Affiliations
[1] Ohio State Univ, Dept Mol Virol Immunol & Med Genet, Comprehens Canc Ctr, Human Canc Genet Program, Columbus, OH 43210 USA
DOI
10.1186/1471-2105-7-114
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Background: The canonical core promoter elements consist of the TATA box, the initiator (Inr), the downstream core promoter element (DPE), the TFIIB recognition element (BRE) and the newly discovered motif 10 element (MTE). The motifs for these core promoter elements are highly degenerate, which tends to lead to a high false discovery rate when attempting to detect them in promoter sequences.
Results: In this study, we have performed the first analysis of these core promoter elements in orthologous mouse and human promoters with experimentally supported transcription start sites. We have identified these elements using a combination of positional weight matrices (PWMs) and the degree of conservation of orthologous mouse and human sequences, a procedure that significantly reduces the false positive rate of motif discovery. Our analysis of 9,010 orthologous mouse-human promoter pairs revealed two combinations of three-way synergistic effects, TATA-Inr-MTE and BRE-Inr-MTE. The former has previously been putatively identified in human, but the latter represents a novel synergistic relationship.
Conclusion: Our results demonstrate that DNA sequence conservation can greatly improve the identification of functional core promoter elements in the human genome. The data also underscore the importance of the synergistic occurrence of two or more core promoter elements. Furthermore, the sequence data and results presented here can help build better computational models for predicting transcription start sites in promoter regions, which remains one of the most challenging problems.
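The idea of combining PWM scores with human-mouse conservation can be illustrated with a minimal sketch. It is not the authors' pipeline: the count matrix `TOY_COUNTS`, the score threshold, the `min_identity` cutoff and all function names are hypothetical, and the two sequences are assumed to be pre-aligned, gap-free and of equal length.

```python
import math

# Hypothetical toy count matrix for a 4-bp TATA-like motif.
# These numbers are illustrative only and are NOT taken from the paper.
TOY_COUNTS = {
    "A": [1, 18, 1, 18],
    "C": [1, 0, 1, 0],
    "G": [1, 1, 1, 1],
    "T": [17, 1, 17, 1],
}


def log_odds_pwm(counts, pseudocount=1.0, background=0.25):
    """Turn per-position base counts into a log2-odds weight matrix."""
    length = len(counts["A"])
    pwm = []
    for i in range(length):
        total = sum(counts[b][i] + pseudocount for b in "ACGT")
        pwm.append({
            b: math.log2(((counts[b][i] + pseudocount) / total) / background)
            for b in "ACGT"
        })
    return pwm


def scan(seq, pwm, threshold):
    """Return (start, score) for every window scoring at or above threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip windows containing ambiguous bases
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, score))
    return hits


def conserved_hits(human, mouse, pwm, threshold, min_identity=0.75):
    """Keep human hits that also score in mouse at the same aligned position
    and whose window is at least min_identity identical between species."""
    if len(human) != len(mouse):
        raise ValueError("sequences must be pre-aligned to equal length")
    width = len(pwm)
    mouse_starts = {start for start, _ in scan(mouse, pwm, threshold)}
    kept = []
    for start, score in scan(human, pwm, threshold):
        h_win = human[start:start + width]
        m_win = mouse[start:start + width]
        identity = sum(h == m for h, m in zip(h_win, m_win)) / width
        if start in mouse_starts and identity >= min_identity:
            kept.append((start, score))
    return kept
```

In this sketch a human hit survives only if the orthologous mouse window also passes the PWM threshold and is sufficiently similar, which is one simple way to discard matches that a degenerate motif produces by chance in a single species.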
Pages: 13