Prediction of transcriptional regulatory sites in the complete genome sequence of Escherichia coli K-12

Cited by: 75
Authors
Thieffry, D [1]
Salgado, H [1]
Huerta, AM [1]
Collado-Vides, J [1]
Affiliation
[1] Univ Nacl Autonoma Mexico, Ctr Invest Fijac Nitrogeno, Cuernavaca 62100, Morelos, Mexico
DOI
10.1093/bioinformatics/14.5.391
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Motivation: As one of the best-characterized free-living organisms, Escherichia coli and its recently completed genomic sequence offer a special opportunity to exploit systematically the variety of regulatory data available in the literature in order to make a comprehensive set of regulatory predictions in the whole genome.
Results: The complete genome sequence of E. coli was analyzed for the binding of transcriptional regulators upstream of coding sequences. The biological information contained in RegulonDB (Huerta, A.M. et al., Nucleic Acids Res., 26, 55-60, 1998) for 56 different transcriptional proteins was the support to implement a stringent strategy combining string search and weight matrices. We estimate that our search included representatives of 15-25% of the total number of regulatory binding proteins in E. coli. This search was performed on the set of 4288 putative regulatory regions, each 450 bp long. Within the regions with predicted sites, 89% are regulated by one protein and 81% involve only one site. These numbers are reasonably consistent with the distribution of experimental regulatory sites. Regulatory sites are found in 603 regions corresponding to 16% of operon regions and 10% of intra-operonic regions. Additional evidence gives stronger support to some of these predictions, including the position of the site, biological consistency with the function of the downstream gene, as well as genetic evidence for the regulatory interaction. The predictions described here were incorporated into the map presented in the paper describing the complete E. coli genome (Blattner, F.R. et al., Science, 277, 1453-1461, 1997).
Pages: 391-400
Number of pages: 10
Related papers
24 in total
[1]   The SWISS-2DPAGE database of two-dimensional polyacrylamide gel electrophoresis, its status in 1995 [J].
Appel, RD ;
Sanchez, JC ;
Bairoch, A ;
Golaz, O ;
Ravier, F ;
Pasquali, C ;
Hughes, GJ ;
Hochstrasser, DF .
NUCLEIC ACIDS RESEARCH, 1996, 24 (01) :180-181
[2]   The complete genome sequence of Escherichia coli K-12 [J].
Blattner, FR ;
Plunkett, G ;
Bloch, CA ;
Perna, NT ;
Burland, V ;
Riley, M ;
Collado-Vides, J ;
Glasner, JD ;
Rode, CK ;
Mayhew, GF ;
Gregor, J ;
Davis, NW ;
Kirkpatrick, HA ;
Goeden, MA ;
Rose, DJ ;
Mau, B ;
Shao, Y .
SCIENCE, 1997, 277 (5331) :1453-+
[3]   Global regulation of gene expression in Escherichia coli [J].
Chuang, SE ;
Daniels, DL ;
Blattner, FR .
JOURNAL OF BACTERIOLOGY, 1993, 175 (07) :2026-2036
[4]   Grammatical model of the regulation of gene expression [J].
Collado-Vides, J .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1992, 89 (20) :9405-9409
[5]   The yeast genome project: What did we learn? [J].
Dujon, B .
TRENDS IN GENETICS, 1996, 12 (07) :263-270
[6]   Searching for and predicting the activity of sites for DNA-binding proteins: compilation and analysis of the binding sites for Escherichia coli integration host factor (IHF) [J].
Goodrich, JA ;
Schwartz, ML ;
McClure, WR .
NUCLEIC ACIDS RESEARCH, 1990, 18 (17) :4993-5000
[7]   GRALLA JD, 1996, CELLULAR MOL BIOL ES, P1232
[8]   HERTZ GZ, 1990, COMPUT APPL BIOSCI, V6, P81
[9]   HERTZ GZ, 1995, BIOINFORMATICS GENOM, P201
[10]   RegulonDB: a database on transcriptional regulation in Escherichia coli [J].
Huerta, AM ;
Salgado, H ;
Thieffry, D ;
Collado-Vides, J .
NUCLEIC ACIDS RESEARCH, 1998, 26 (01) :55-59