The COG database: a tool for genome-scale analysis of protein functions and evolution

Cited by: 3408
Authors
Tatusov, RL [1]
Galperin, MY [1]
Natale, DA [1]
Koonin, EV [1]
Affiliations
[1] Natl Lib Med, Natl Ctr Biotechnol Informat, NIH, Bethesda, MD 20894 USA
DOI
10.1093/nar/28.1.33
Chinese Library Classification (CLC) codes
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
Rational classification of proteins encoded in sequenced genomes is critical for making the genome sequences maximally useful for functional and evolutionary studies. The database of Clusters of Orthologous Groups of proteins (COGs) is an attempt at a phylogenetic classification of the proteins encoded in 21 complete genomes of bacteria, archaea and eukaryotes (http://www.ncbi.nlm.nih.gov/COG). The COGs were constructed by applying the criterion of consistency of genome-specific best hits to the results of an exhaustive comparison of all protein sequences from these genomes. The database comprises 2091 COGs that include 56-83% of the gene products from each of the complete bacterial and archaeal genomes and ~35% of those from the yeast Saccharomyces cerevisiae genome. The COG database is accompanied by the COGNITOR program, which is used to fit new proteins into the COGs and can be applied to functional and phylogenetic annotation of newly sequenced genomes.
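The "consistency of genome-specific best hits" criterion mentioned in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes a simplified reading in which three proteins from three different genomes form a triangle when each is the reciprocal best hit of the other two, and in which triangles sharing a side (two proteins) are merged into one cluster. The function names, the `genome_of` mapping and the `best_hit` table are hypothetical, and the best hits are taken as precomputed input rather than derived from a real sequence comparison.

```python
from itertools import combinations


def find_triangles(genome_of, best_hit):
    """Find triples of proteins from three genomes that are mutual best hits.

    genome_of: dict mapping protein id -> genome id (hypothetical input)
    best_hit:  dict mapping (protein id, target genome id) -> best-hit protein id
    """
    def mutual(a, b):
        # Simplifying assumption: require reciprocal genome-specific best hits.
        return (best_hit.get((a, genome_of[b])) == b
                and best_hit.get((b, genome_of[a])) == a)

    triangles = []
    for a, b, c in combinations(sorted(genome_of), 3):
        # All three proteins must come from different genomes.
        if len({genome_of[a], genome_of[b], genome_of[c]}) < 3:
            continue
        if mutual(a, b) and mutual(b, c) and mutual(a, c):
            triangles.append(frozenset((a, b, c)))
    return triangles


def merge_triangles(triangles):
    """Merge triangles that share a side (two proteins) using union-find."""
    parent = list(range(len(triangles)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(triangles)), 2):
        if len(triangles[i] & triangles[j]) >= 2:
            parent[find(i)] = find(j)

    clusters = {}
    for i, tri in enumerate(triangles):
        clusters.setdefault(find(i), set()).update(tri)
    return list(clusters.values())


if __name__ == "__main__":
    # Toy data: four genomes, one protein family plus an extra paralog a2.
    genome_of = {"a1": "A", "a2": "A", "b1": "B", "c1": "C", "d1": "D"}
    family = ["a1", "b1", "c1", "d1"]
    best_hit = {(x, genome_of[y]): y for x in family for y in family if x != y}
    best_hit[("a2", "B")] = "b1"  # one-way hit only, so a2 joins no triangle
    print(merge_triangles(find_triangles(genome_of, best_hit)))
```

In this toy input the four family members end up in a single cluster, while `a2` is left out because its hit to `b1` is not reciprocated.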
Pages: 33-36
Page count: 4
Related papers
13 items in total
  • [1] Gapped BLAST and PSI-BLAST: a new generation of protein database search programs
    Altschul, SF
    Madden, TL
    Schaffer, AA
    Zhang, JH
    Zhang, Z
    Miller, W
    Lipman, DJ
    [J]. NUCLEIC ACIDS RESEARCH, 1997, 25 (17) : 3389 - 3402
  • [2] Phylogenetic classification and the universal tree
    Doolittle, WF
    [J]. SCIENCE, 1999, 284 (5423) : 2124 - 2128
  • [3] Uses for evolutionary trees
    Fitch, WM
    [J]. PHILOSOPHICAL TRANSACTIONS OF THE ROYAL SOCIETY OF LONDON SERIES B-BIOLOGICAL SCIENCES, 1995, 349 (1327) : 93 - 102
  • [4] Distinguishing homologous from analogous proteins
    Fitch, WM
    [J]. SYSTEMATIC ZOOLOGY, 1970, 19 (02) : 99 - &
  • [5] Gene families: The taxonomy of protein paralogs and chimeras
    Henikoff, S
    Greene, EA
    Pietrokovski, S
    Bork, P
    Attwood, TK
    Hood, L
    [J]. SCIENCE, 1997, 278 (5338) : 609 - 614
  • [6] Huynen MA, 1997, TRENDS GENET, V13, P389
  • [7] Genome sequences: Genome sequence of a model prokaryote
    Koonin, EV
    [J]. CURRENT BIOLOGY, 1997, 7 (10) : R656 - R659
  • [8] Beyond complete genomes: from sequence to structure and function
    Koonin, EV
    Tatusov, RL
    Galperin, MY
    [J]. CURRENT OPINION IN STRUCTURAL BIOLOGY, 1998, 8 (03) : 355 - 363
  • [9] Comparison of archaeal and bacterial genomes: computer analysis of protein sequences predicts novel functions and suggests a chimeric origin for the archaea
    Koonin, EV
    Mushegian, AR
    Galperin, MY
    Walker, DR
    [J]. MOLECULAR MICROBIOLOGY, 1997, 25 (04) : 619 - 637
  • [10] Neidhardt FC, 1996, ESCHERICHIA COLI SALM