Physiological genomics of Escherichia coli protein families

Cited by: 31
Authors
Liang, P
Labedan, B
Riley, M
Affiliations
[1] Marine Biol Lab, Josephine Bay Paul Ctr Comparat Mol Biol & Evolut, Woods Hole, MA 02543 USA
[2] Univ Paris 11, Inst Genet & Microbiol, CNRS, UMR 8621, F-91405 Orsay, France
Keywords
module; sequence similarity; protein family; predicting protein function; annotation; evolution;
DOI
10.1152/physiolgenomics.00086.2001
Chinese Library Classification number
Q2 [Cell Biology];
Discipline classification code
071009 ; 090102 ;
Abstract
The well-researched Escherichia coli genome offers the opportunity to explore the value of using protein families within a single organism to enrich functional annotation procedures and to study mechanisms of protein evolution. Having identified multimodular proteins resulting from gene fusion, and treated each module as a separate protein, nonoverlapping sequence-similar families in E. coli could be assembled. Of 3,902 proteins of length 100 residues or more, 2,415 clustered into 609 protein families. The relatedness of function among members of each family was dissected in detail. Data on paralogous protein families provides valuable information in attributing putative function to unknown genes, supplementing existing function annotation. Enzymes, transporters, and regulators represent the three major types of proteins in E. coli. They are shown to have distinctive patterns in gene duplication and divergence and gene fusion, suggesting that details of protein evolution have been different for genes in these categories. Data for the complete list of paralogous protein families and updated functional annotation for E. coli K-12 are accessible in GenProtEC.
Pages: 15-26
Number of pages: 12
Related papers
31 in total
[1]   AMINO-ACID SUBSTITUTION MATRICES FROM AN INFORMATION THEORETIC PERSPECTIVE [J].
ALTSCHUL, SF .
JOURNAL OF MOLECULAR BIOLOGY, 1991, 219 (03) :555-565
[2]   Linkage map of Escherichia coli K-12, edition 10: The traditional map [J].
Berlyn, MKB .
MICROBIOLOGY AND MOLECULAR BIOLOGY REVIEWS, 1998, 62 (03) :814-+
[3]   The complete genome sequence of Escherichia coli K-12 [J].
Blattner, FR ;
Plunkett, G ;
Bloch, CA ;
Perna, NT ;
Burland, V ;
Riley, M ;
ColladoVides, J ;
Glasner, JD ;
Rode, CK ;
Mayhew, GF ;
Gregor, J ;
Davis, NW ;
Kirkpatrick, HA ;
Goeden, MA ;
Rose, DJ ;
Mau, B ;
Shao, Y .
SCIENCE, 1997, 277 (5331) :1453-+
[4]   Phylogenetic classification and the universal tree [J].
Doolittle, WF .
SCIENCE, 1999, 284 (5423) :2124-2128
[5]  
Felsenstein J., 1989, CLADISTICS, V5, P164, DOI 10.1111/J.1096-0031.1989.TB00562.X
[6]   CONSTRUCTION OF PHYLOGENETIC TREES [J].
FITCH, WM ;
MARGOLIASH, E .
SCIENCE, 1967, 155 (3760) :279-+
[7]   DISTINGUISHING HOMOLOGOUS FROM ANALOGOUS PROTEINS [J].
FITCH, WM .
SYSTEMATIC ZOOLOGY, 1970, 19 (02) :99-&
[8]  
Gasteiger E, 2001, Curr Issues Mol Biol, V3, P47
[9]   Divergent evolution of enzymatic function: Mechanistically diverse superfamilies and functionally distinct suprafamilies [J].
Gerlt, JA ;
Babbitt, PC .
ANNUAL REVIEW OF BIOCHEMISTRY, 2001, 70 :209-246
[10]   Darwin v. 2.0: an interpreted computer language for the biosciences [J].
Gonnet, GH ;
Hallett, MT ;
Korostensky, C ;
Bernardin, L .
BIOINFORMATICS, 2000, 16 (02) :101-103