Computer model for recognition of functional transcription start sites in RNA polymerase II promoters of vertebrates

Cited by: 43
Authors
Bajic, VB [1]
Seah, SH [1]
Chong, A [1]
Krishnan, SPT [1]
Koh, JLY [1]
Brusic, V [1]
Affiliations
[1] BIC, Labs Informat Technol, Computat Immunol Grp, Singapore 119613, Singapore
Keywords
promoter modelling; promoter recognition; transcription start site; eukaryotic promoters
DOI
10.1016/S1093-3263(02)00179-1
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
This paper introduces a new computer system for recognition of functional transcription start sites (TSSs) in RNA polymerase II promoter regions of vertebrates. This system allows scanning complete vertebrate genomes for promoters with a significantly reduced number of false positive predictions. It can be used in the context of gene finding through its recognition of the 5' end of genes. The implemented recognition model uses a composite-hierarchical approach, artificial intelligence, statistics, and signal processing techniques. It also exploits the separation of promoter sequences into those that are C + G-rich or C + G-poor. The system was evaluated on a large and diverse human sequence-set and exhibited several times higher accuracy than several publicly available TSS-finding programs. Results obtained using human chromosome 22 data showed even greater specificity than the evaluation set results. The system has been implemented in the Dragon Promoter Finder package, which can be accessed at http://sdmc.krdl.org.sg:8080/promoter/. © 2002 Elsevier Science Inc. All rights reserved.
Pages: 323-332
Number of pages: 10
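The abstract notes that the model separates promoter sequences into C + G-rich and C + G-poor groups. The record does not say how that split is made, so the following is only a minimal sketch of one plausible reading: compute the fraction of C and G bases in a sequence and compare it with a cut-off. The function names and the 0.5 threshold are hypothetical and are not taken from the paper or from Dragon Promoter Finder.

```python
# Hypothetical cut-off for illustration only; the paper's actual criterion
# is not given in this record.
CG_RICH_THRESHOLD = 0.5


def cg_fraction(seq: str) -> float:
    """Return the fraction of C and G among the unambiguous bases A, C, G, T.

    Ambiguity codes such as N are ignored. An empty sequence, or one with
    no unambiguous bases, yields 0.0.
    """
    seq = seq.upper()
    cg = seq.count("C") + seq.count("G")
    at = seq.count("A") + seq.count("T")
    total = cg + at
    return cg / total if total else 0.0


def classify_cg(seq: str, threshold: float = CG_RICH_THRESHOLD) -> str:
    """Label a sequence 'CG-rich' or 'CG-poor' by comparing with the cut-off."""
    return "CG-rich" if cg_fraction(seq) >= threshold else "CG-poor"
```

For example, `classify_cg("GCGCAT")` labels the sequence as CG-rich because four of its six bases are C or G. A real pipeline would presumably apply such a test to a window around each candidate TSS before choosing which sub-model to run, but that step is an assumption here.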