Identification of human gene core promoters in silico

Times cited: 79
Authors
Zhang, MQ [1]
Affiliations
[1] Cold Spring Harbor Lab, Cold Spring Harbor, NY 11724 USA
Source
GENOME RESEARCH | 1998 / Vol. 8 / Issue 3
DOI
10.1101/gr.8.3.319
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Subject classification code
071010 ; 081704 ;
Abstract
Identification of the 5'-end of human genes requires identification of functional promoter elements. In silico identification of those elements is difficult because of the hierarchical and modular nature of promoter architecture. To address this problem, I propose a new stepwise strategy based on initial localization of a functional promoter into a 1- to 2-kb (extended promoter) region from within a large genomic DNA sequence of 100 kb or larger and further localization of a transcriptional start site (TSS) into a 50- to 100-bp (core promoter) region. Using position-dependent 5-tuple measures, a quadratic discriminant analysis (QDA) method has been implemented in a new program, CorePromoter. Our experiments indicate that when given a 1- to 2-kb extended promoter, CorePromoter will correctly localize the TSS to a 100-bp interval ~60% of the time.
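The QDA step named in the abstract can be illustrated with a minimal sketch. It is not the CorePromoter implementation: the two features per window, the toy training values, the equal class priors, and all function names are assumptions made purely for illustration (the abstract says only that position-dependent 5-tuple measures are used). Each class is modelled as a Gaussian with its own covariance, and a window is scored by the difference of the two log densities.

```python
import math

def fit_gaussian(rows):
    """Estimate (mean, covariance) from a list of 2-feature rows."""
    n = len(rows)
    mean = [sum(r[i] for r in rows) / n for i in range(2)]
    cov = [[sum((r[i] - mean[i]) * (r[j] - mean[j]) for r in rows) / (n - 1)
            for j in range(2)] for i in range(2)]
    return mean, cov

def log_density(x, mean, cov):
    """Log of a 2-D Gaussian density, omitting the constant shared by both classes."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # Mahalanobis distance using the closed-form inverse of a 2x2 matrix.
    maha = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return -0.5 * (math.log(det) + maha)

def qda_score(x, promoter_model, background_model):
    """Positive when x is more likely under the promoter class (equal priors assumed)."""
    return log_density(x, *promoter_model) - log_density(x, *background_model)

# Hypothetical feature vectors; real features would be derived from 5-tuple statistics.
promoter_rows = [(2.0, 1.0), (2.5, 1.4), (1.8, 0.7), (2.2, 1.3)]
background_rows = [(0.0, 0.1), (0.3, -0.2), (-0.4, 0.2), (0.1, -0.3)]

promoter_model = fit_gaussian(promoter_rows)
background_model = fit_gaussian(background_rows)

print(qda_score((2.1, 1.1), promoter_model, background_model) > 0)
print(qda_score((0.0, 0.0), promoter_model, background_model) > 0)
```

Because each class keeps its own covariance matrix, the decision boundary is quadratic in the features rather than linear, which is what distinguishes QDA from linear discriminant analysis.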
Pages: 319-326
Page count: 8
Related papers
23 in total
[11]  
NOVINA CD, 1996, J MOL BIOL, V249, P923
[12]   The general transcription factors of RNA polymerase II [J].
Orphanides, G ;
Lagrange, T ;
Reinberg, D .
GENES & DEVELOPMENT, 1996, 10 (21) :2657-2683
[13]   PREDICTING POL-II PROMOTER SEQUENCES USING TRANSCRIPTION FACTOR-BINDING SITES [J].
PRESTRIDGE, DS .
JOURNAL OF MOLECULAR BIOLOGY, 1995, 249 (05) :923-932
[14]   The role of general initiation factors in transcription by RNA polymerase II [J].
Roeder, RG .
TRENDS IN BIOCHEMICAL SCIENCES, 1996, 21 (09) :327-335
[15]   Transcription initiation from TATA-less promoters within eukaryotic protein-coding genes [J].
Smale, ST .
BIOCHIMICA ET BIOPHYSICA ACTA-GENE STRUCTURE AND EXPRESSION, 1997, 1351 (1-2) :73-88
[16]  
SOLOVYEV V, 1997, P 5 INT C INT SYST M, P294
[17]   THE CALCULATION OF POSTERIOR DISTRIBUTIONS BY DATA AUGMENTATION [J].
TANNER, MA ;
WONG, WH .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 1987, 82 (398) :528-540
[18]   The biochemistry of transcription in eukaryotes: A paradigm for multisubunit regulatory complexes [J].
Tjian, R .
PHILOSOPHICAL TRANSACTIONS OF THE ROYAL SOCIETY B-BIOLOGICAL SCIENCES, 1996, 351 (1339) :491-499
[19]  
Zhang M Q, 1998, Pac Symp Biocomput, P240
[20]   Identification of protein coding regions in the human genome by quadratic discriminant analysis [J].
Zhang, MQ .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1997, 94 (02) :565-568