A functional update of the Escherichia coli K-12 genome

Cited by: 93
Authors
Serres, Margrethe H. [1]
Gopal, Shuba [2]
Nahum, Laila A. [1]
Liang, Ping [1]
Gaasterland, Terry [2]
Riley, Monica [1]
Affiliations
[1] Josephine Bay Paul Ctr Comparat Mol Biol & Evolut, Marine Biol Lab, Woods Hole, MA 02543 USA
[2] Rockefeller Univ, Lab Computat Genom, New York, NY 10021 USA
Keywords
Additional Data File; Function Assignment; Function Attribution; Homoserine Dehydrogenase; Independent Evolutionary History;
DOI
10.1186/gb-2001-2-9-research0035
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification codes
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Background: Since the genome of Escherichia coli K-12 was initially annotated in 1997, additional functional information based on biological characterization and functions of sequence-similar proteins has become available. On the basis of this new information, an updated version of the annotated chromosome has been generated.
Results: The E. coli K-12 chromosome is currently represented by 4,401 genes encoding 116 RNAs and 4,285 proteins. The boundaries of the genes identified in the GenBank Accession U00096 were used. Some protein-coding sequences are compound and encode multimodular proteins. The coding sequences (CDSs) are represented by modules (protein elements of at least 100 amino acids with biological activity and independent evolutionary history). There are 4,616 identified modules in the 4,285 proteins. Of these, 48.9% have been characterized, 29.5% have an imputed function, 2.1% have a phenotype and 19.5% have no function assignment. Only 7% of the modules appear unique to E. coli, and this number is expected to be reduced as more genome data becomes available. The imputed functions were assigned on the basis of manual evaluation of functions predicted by BLAST and DARWIN analyses and by the MAGPIE genome annotation system.
Conclusions: Much knowledge has been gained about functions encoded by the E. coli K-12 genome since the 1997 annotation was published. The data presented here should be useful for analysis of E. coli gene products as well as gene products encoded by other genomes.
Pages: 7
Related papers
28 in total
[1]   Gapped BLAST and PSI-BLAST: a new generation of protein database search programs [J].
Altschul, SF ;
Madden, TL ;
Schaffer, AA ;
Zhang, JH ;
Zhang, Z ;
Miller, W ;
Lipman, DJ .
NUCLEIC ACIDS RESEARCH, 1997, 25 (17) :3389-3402
[2]   Gene Ontology: tool for the unification of biology [J].
Ashburner, M ;
Ball, CA ;
Blake, JA ;
Botstein, D ;
Butler, H ;
Cherry, JM ;
Davis, AP ;
Dolinski, K ;
Dwight, SS ;
Eppig, JT ;
Harris, MA ;
Hill, DP ;
Issel-Tarver, L ;
Kasarskis, A ;
Lewis, S ;
Matese, JC ;
Richardson, JE ;
Ringwald, M ;
Rubin, GM ;
Sherlock, G .
NATURE GENETICS, 2000, 25 (01) :25-29
[3]   Short palindromic repetitive DNA elements in enterobacteria: a survey [J].
Bachellier, S ;
Clément, JM ;
Hofnung, M .
RESEARCH IN MICROBIOLOGY, 1999, 150 (9-10) :627-639
[4]   PROSITE - A DICTIONARY OF SITES AND PATTERNS IN PROTEINS [J].
BAIROCH, A .
NUCLEIC ACIDS RESEARCH, 1992, 20 :2013-2018
[5]   The complete genome sequence of Escherichia coli K-12 [J].
Blattner, FR ;
Plunkett, G ;
Bloch, CA ;
Perna, NT ;
Burland, V ;
Riley, M ;
ColladoVides, J ;
Glasner, JD ;
Rode, CK ;
Mayhew, GF ;
Gregor, J ;
Davis, NW ;
Kirkpatrick, HA ;
Goeden, MA ;
Rose, DJ ;
Mau, B ;
Shao, Y .
SCIENCE, 1997, 277 (5331) :1453-+
[6]   Improved microbial gene identification with GLIMMER [J].
Delcher, AL ;
Harmon, D ;
Kasif, S ;
White, O ;
Salzberg, SL .
NUCLEIC ACIDS RESEARCH, 1999, 27 (23) :4636-4641
[7]   Fully automated genome analysis that reflects user needs and preferences. A detailed introduction to the MAGPIE system architecture [J].
Gaasterland, T ;
Sensen, CW .
BIOCHIMIE, 1996, 78 (05) :302-310
[8]   EXHAUSTIVE MATCHING OF THE ENTIRE PROTEIN-SEQUENCE DATABASE [J].
GONNET, GH ;
COHEN, MA ;
BENNER, SA .
SCIENCE, 1992, 256 (5062) :1443-1445
[9]  
JACKOWSKI S, 1994, J BIOL CHEM, V269, P2921
[10]   The EcoCyc and MetaCyc databases [J].
Karp, PD ;
Riley, M ;
Saier, M ;
Paulsen, IT ;
Paley, SM ;
Pellegrini-Toole, A .
NUCLEIC ACIDS RESEARCH, 2000, 28 (01) :56-59