A comparative genomics approach to prediction of new members of regulons

Cited by: 104
Authors
Tan, K
Moreno-Hagelsieb, G
Collado-Vides, J
Stormo, GD [1]
Affiliations
[1] Washington Univ, Sch Med, Dept Genet, St Louis, MO 63110 USA
[2] Univ Nacl Autonoma Mexico, Ctr Invest Fijac Nitrogeno, Programa Biol Mol Computac, Cuernavaca 62100, Morelos, Mexico
DOI
10.1101/gr.149301
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010; 081704;
Abstract
Identifying the complete transcriptional regulatory network for an organism is a major challenge. For each regulatory protein, we want to know all the genes it regulates, that is, its regulon. Examples of known binding sites can be used to estimate the binding specificity of the protein and to predict other binding sites. However, binding site predictions can be unreliable, because the considerable variability of binding sites makes it difficult to determine the true specificity of the protein. Because regulatory systems tend to be conserved through evolution, we can use comparisons between species to increase the reliability of binding site predictions. In this article, an approach is presented to evaluate the computational predictions of regulatory sites. We combine the prediction of transcription units having orthologous genes with the prediction of transcription factor binding sites based on probabilistic models. We augment the sets of genes in Escherichia coli that are expected to be regulated by two transcription factors, the cAMP receptor protein and the fumarate and nitrate reduction regulatory protein, through a comparison with the Haemophilus influenzae genome. At the same time, we learned more about the regulatory networks of H. influenzae, a species with much less experimental knowledge than E. coli. By studying orthologous genes subject to regulation by the same transcription factor, we also gained understanding of the evolution of the entire regulatory systems.
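The general idea in the abstract, scoring candidate binding sites with a probabilistic model and keeping only predictions that are also supported in an orthologous transcription unit of a second genome, can be illustrated with a minimal sketch. This is not the authors' implementation: the log-odds weight matrix, the uniform background, the pseudocount, the score threshold, and all function names and toy sequences below are assumptions chosen for illustration only.

```python
import math

ALPHABET = "ACGT"


def build_log_odds_matrix(sites, background=None, pseudocount=1.0):
    """Build a log2-odds weight matrix from aligned, equal-length example sites.

    The uniform background and the pseudocount are illustrative assumptions.
    """
    if background is None:
        background = {base: 0.25 for base in ALPHABET}
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")
    matrix = []
    for pos in range(width):
        # Start every count at the pseudocount so unseen bases keep a finite score.
        counts = {base: pseudocount for base in ALPHABET}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        matrix.append(
            {base: math.log2((counts[base] / total) / background[base])
             for base in ALPHABET}
        )
    return matrix


def score_window(matrix, window):
    """Sum the per-position log-odds scores for one candidate site."""
    return sum(column[base] for column, base in zip(matrix, window))


def scan(matrix, sequence, threshold):
    """Return (start, score) for every window scoring at or above the threshold."""
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        score = score_window(matrix, sequence[start:start + width])
        if score >= threshold:
            hits.append((start, score))
    return hits


def conserved_predictions(genes_with_site_a, genes_with_site_b, orthologs):
    """Keep genes from genome A whose ortholog in genome B also has a predicted site.

    `orthologs` maps a gene identifier in genome A to its ortholog in genome B.
    """
    return sorted(
        gene for gene in genes_with_site_a
        if orthologs.get(gene) in genes_with_site_b
    )


if __name__ == "__main__":
    # Hypothetical toy sites, not real binding-site data.
    toy_sites = ["TGTGA", "TGTGA", "TGTCA", "TGAGA"]
    toy_matrix = build_log_odds_matrix(toy_sites)
    print(scan(toy_matrix, "AAATGTGACCC", threshold=3.0))
```

The cross-species filter is deliberately kept separate from the scanning step, so the same hypothetical `conserved_predictions` helper could be paired with any site predictor and any ortholog mapping.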
Pages: 566-584
Page count: 19