WIDESPREAD PROTEIN-SEQUENCE SIMILARITIES - ORIGINS OF ESCHERICHIA-COLI GENES

Times cited: 37
Authors
LABEDAN, B
RILEY, M
Affiliations
[1] MARINE BIOL LAB, WOODS HOLE, MA 02543 USA
[2] UNIV PARIS 11, INST MICROBIOL & GENET, F-91405 ORSAY, FRANCE
DOI
10.1128/jb.177.6.1585-1588.1995
Chinese Library Classification number
Q93 [Microbiology];
Discipline classification code
071005 ; 100705 ;
Abstract
To learn more about the evolutionary origins of Escherichia coli genes, we surveyed systematically for extended sequence similarities among the 1,264 amino acid sequences encoded by chromosomal genes of E. coli K-12 in SwissProt release 26 by using the FASTA program and imposing the following criteria: (i) alignment of segments at least 100 amino acids long and (ii) at least 20% amino acid identity. Altogether, 624 extended alignments meeting the two criteria were identified, corresponding to 577 protein sequences (45.6% of the 1,264 E. coli protein sequences) that had an extended alignment with at least one other E. coli protein sequence. To exclude alignments of questionable biological significance, we imposed a high threshold on the number of gaps allowed in each of the 624 extended alignments, giving us a subset of 464 proteins. The population of 464 alignments has the following characteristics expressed as median values of the group: 254 amino acids in the alignment, representing 86% of the length of the protein, 33% of the amino acids in the alignment being identical, and 1.1 gaps introduced per 100 amino acids of alignment. Where functions are known, nearly all pairs consist of functionally related proteins. This implies that the sequence similarity we detected has biological meaning and did not arise by chance. That a major fraction of E. coli proteins form extended alignments strongly suggests the predominance of duplication and divergence of ancestral genes in the evolution of E. coli genes. The range of degrees of similarity shows that some genes originated more recently than others. There is no evidence of genome doubling in the past, since map distances between genes of sequence-related proteins show no coherent pattern of favored separations.
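The two screening criteria and the later gap filter described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the `Alignment` record, the protein names, and the example numbers are hypothetical, and the abstract does not state the gap threshold that was used, so `max_gaps_per_100` is an assumed parameter with an arbitrary value. Only the 100-residue and 20%-identity cutoffs and the 577-of-1,264 fraction come from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alignment:
    """One pairwise alignment (hypothetical record, not FASTA's output format)."""
    query: str
    subject: str
    length: int      # aligned segment length, in amino acids
    identities: int  # identical positions within the aligned segment
    gaps: int        # gaps introduced into the alignment

    @property
    def percent_identity(self) -> float:
        return 100.0 * self.identities / self.length

    @property
    def gaps_per_100(self) -> float:
        return 100.0 * self.gaps / self.length


MIN_LENGTH = 100      # criterion (i) from the abstract
MIN_IDENTITY = 20.0   # criterion (ii) from the abstract, in percent


def is_extended(aln: Alignment) -> bool:
    """True if the alignment meets both criteria stated in the abstract."""
    return aln.length >= MIN_LENGTH and aln.percent_identity >= MIN_IDENTITY


def passes_gap_filter(aln: Alignment, max_gaps_per_100: float) -> bool:
    """Gap filter; the threshold value is an assumption, not from the source."""
    return aln.gaps_per_100 <= max_gaps_per_100


def proteins_in(alignments) -> set:
    """Distinct proteins that take part in at least one alignment."""
    names = set()
    for aln in alignments:
        names.update((aln.query, aln.subject))
    return names


# Hypothetical alignments, for illustration only.
candidates = [
    Alignment("protA", "protB", length=254, identities=84, gaps=3),
    Alignment("protC", "protD", length=80, identities=40, gaps=0),    # too short
    Alignment("protE", "protF", length=150, identities=25, gaps=1),   # identity too low
    Alignment("protG", "protH", length=120, identities=30, gaps=12),  # many gaps
]

extended = [a for a in candidates if is_extended(a)]
low_gap = [a for a in extended if passes_gap_filter(a, max_gaps_per_100=5.0)]

# Fraction of the 1,264 sequences with at least one extended alignment.
fraction = round(100 * 577 / 1264, 1)
```

In this toy input, two of the four alignments meet the length and identity criteria, and one of those is then dropped by the assumed gap threshold. The last line reproduces the 45.6% figure quoted in the abstract.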
Pages: 1585-1588
Number of pages: 4