Operon prediction using both genome-specific and general genomic information

Citations: 138
Authors
Dam, Phuongan
Olman, Victor
Harris, Kyle
Su, Zhengchang
Xu, Ying [1]
Affiliations
[1] Univ Georgia, Dept Biochem & Mol Biol, Computat Syst Biol Lab, Athens, GA 30602 USA
[2] Univ Georgia, Inst Bioinformat, Athens, GA 30602 USA
[3] Univ N Carolina, Dept Comp Sci, Ctr Bioinformat Res, Charlotte, NC 28223 USA
Funding
US National Science Foundation
DOI
10.1093/nar/gkl1018
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
We carried out a systematic analysis of how a set of selected features, three of them new, contributes to the accuracy of operon prediction. The analysis led to several new insights, including that (i) different features have different levels of discriminating power when applied to adjacent gene pairs in different ranges of intergenic distance, (ii) certain features are universally useful for operon prediction while others are more genome-specific, and (iii) the reliability of operon prediction depends on intergenic distance. Based on these insights, our newly developed operon-prediction program predicts operons more accurately than previous programs, and it uses features that are most readily available from genomic sequences. Our results indicate that our (non-linear) decision tree-based classifier can predict operons in a prokaryotic genome very accurately when a substantial number of operons in the genome are already known. For example, the prediction accuracy of our program reaches 90.2% and 93.7% on the Bacillus subtilis and Escherichia coli genomes, respectively. When no such information is available, our (linear) logistic function-based classifier reaches prediction accuracies of 84.6% and 83.3% for E. coli and B. subtilis, respectively.
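The linear, logistic function-based classification of adjacent gene pairs mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' program. The abstract does not list the features or fitted coefficients, so the feature `neighborhood_score` (a hypothetical stand-in for a general genomic feature scaled to 0..1), the weights `BIAS`, `W_DISTANCE` and `W_NEIGHBORHOOD`, and the 0.5 threshold are all assumptions chosen only to make the example run. The only feature taken from the abstract is intergenic distance.

```python
import math
from dataclasses import dataclass


@dataclass
class Gene:
    name: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"


def intergenic_distance(a: Gene, b: Gene) -> int:
    """Gap in bp between gene a and the next gene b; negative if they overlap."""
    return b.start - a.end - 1


def logistic(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# Illustrative weights only (assumptions, NOT values from the paper).
BIAS = 1.0
W_DISTANCE = -0.02      # longer intergenic gaps lower the score
W_NEIGHBORHOOD = 2.0    # hypothetical general genomic feature in 0..1


def operon_pair_probability(a: Gene, b: Gene, neighborhood_score: float) -> float:
    """Score that adjacent genes a and b belong to the same operon."""
    if a.strand != b.strand:
        return 0.0  # genes on opposite strands cannot share an operon
    z = (BIAS
         + W_DISTANCE * intergenic_distance(a, b)
         + W_NEIGHBORHOOD * neighborhood_score)
    return logistic(z)


def predict_operons(genes, scores, threshold=0.5):
    """Chain adjacent pairs scoring at or above threshold into operons.

    genes must be sorted by start coordinate; scores[i] is the
    hypothetical feature value for the pair (genes[i], genes[i + 1]).
    """
    if not genes:
        return []
    operons = [[genes[0].name]]
    for a, b, s in zip(genes, genes[1:], scores):
        if operon_pair_probability(a, b, s) >= threshold:
            operons[-1].append(b.name)
        else:
            operons.append([b.name])
    return operons


if __name__ == "__main__":
    genes = [
        Gene("g1", 1, 100, "+"),
        Gene("g2", 110, 200, "+"),
        Gene("g3", 600, 700, "+"),
        Gene("g4", 710, 800, "-"),
    ]
    print(predict_operons(genes, [0.5, 0.0, 0.5]))
```

In this toy input, g1 and g2 are separated by 9 bp on the same strand and are grouped together, while the 399 bp gap before g3 and the strand change before g4 each start a new predicted operon. A real classifier would fit its weights on known operons or, as the abstract notes for the no-training-data case, rely on features that transfer across genomes.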
Pages: 288-298
Page count: 11