A genomic perspective on protein families

Cited by: 2746
Authors
Tatusov, RL [1]
Koonin, EV [1]
Lipman, DJ [1]
Affiliations
[1] NIH, NATL CTR BIOTECHNOL INFORMAT, NATL LIB MED, BETHESDA, MD 20894 USA
DOI
10.1126/science.278.5338.631
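Chinese Library Classification (CLC) number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline classification code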
07 ; 0710 ; 09 ;
Abstract
In order to extract the maximum amount of information from the rapidly accumulating genome sequences, all conserved genes need to be classified according to their homologous relationships. Comparison of proteins encoded in seven complete genomes from five major phylogenetic lineages and elucidation of consistent patterns of sequence similarities allowed the delineation of 720 clusters of orthologous groups (COGs). Each COG consists of individual orthologous proteins or orthologous sets of paralogs from at least three lineages. Orthologs typically have the same function, allowing transfer of functional information from one member to an entire COG. This relation automatically yields a number of functional predictions for poorly characterized genomes. The COGs comprise a framework for functional and evolutionary genome analysis.
Pages: 631-637
Number of pages: 7