The CATH database provides insights into protein structure/function relationships

Cited by: 126
Authors
Orengo, CA [1]
Pearl, FMG [1]
Bray, JE [1]
Todd, AE [1]
Martin, AC [1]
Lo Conte, L [1]
Thornton, JM [1]
Affiliations
[1] UCL, Dept Biochem & Mol Biol, London WC1E 6BT, England
DOI
10.1093/nar/27.1.275
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification codes
071010 ; 081704 ;
Abstract
We report the latest release (version 1.4) of the CATH protein domains database (http://www.biochem.ucl.ac.uk/bsm/cath). This is a hierarchical classification of 13 359 protein domain structures into evolutionary families and structural groupings. We currently identify 827 homologous families in which the proteins have both structural similarity and sequence and/or functional similarity. These can be further clustered into 593 fold groups and 32 distinct architectures. Using our structural classification and associated data on protein functions, stored in the database (EC identifiers, SWISS-PROT keywords and information from the Enzyme database and literature), we have been able to analyse the correlation between the 3D structure and function. More than 96% of folds in the PDB are associated with a single homologous family. However, within the superfolds, three or more different functions are observed. Considering enzyme functions, more than 95% of clearly homologous families exhibit either single or closely related functions, as demonstrated by the EC identifiers of their relatives. Our analysis supports the view that determining structures, for example as part of a 'structural genomics' initiative, will make a major contribution to interpreting genome data.
Pages: 275-279
Number of pages: 5
Related papers
20 in total
  • [1] ABOLA EE, 1987, CRYSTALLOGRAPHIC DAT, P107
  • [2] Gapped BLAST and PSI-BLAST: a new generation of protein database search programs
    Altschul, SF
    Madden, TL
    Schaffer, AA
    Zhang, JH
    Zhang, Z
    Miller, W
    Lipman, DJ
    [J]. NUCLEIC ACIDS RESEARCH, 1997, 25 (17) : 3389 - 3402
  • [3] PROTEINS - 1000 FAMILIES FOR THE MOLECULAR BIOLOGIST
    CHOTHIA, C
    [J]. NATURE, 1992, 357 (6379) : 543 - 544
  • [4] Homology-based fold predictions for Mycoplasma genitalium proteins
    Huynen, M
    Doerks, T
    Eisenhaber, F
    Orengo, C
    Sunyaev, S
    Yuan, YP
    Bork, P
    [J]. JOURNAL OF MOLECULAR BIOLOGY, 1998, 280 (03) : 323 - 326
  • [5] JONES DT, 1998, IN PRESS J MOL BIOL
  • [6] Jones S, 1998, PROTEIN SCI, V7, P233
  • [7] Prediction of protein-protein interaction sites using patch analysis
    Jones, S
    Thornton, JM
    [J]. JOURNAL OF MOLECULAR BIOLOGY, 1997, 272 (01) : 133 - 143
  • [8] Laskowski RA, 1996, PROTEIN SCI, V5, P2438
  • [9] PDBsum: a Web-based database of summaries and analyses of all PDB structures
    Laskowski, RA
    Hutchinson, EG
    Michie, AD
    Wallace, AC
    Jones, ML
    Thornton, JM
    [J]. TRENDS IN BIOCHEMICAL SCIENCES, 1997, 22 (12) : 488 - 490
  • [10] STRUCTURAL PATTERNS IN GLOBULAR PROTEINS
    LEVITT, M
    CHOTHIA, C
    [J]. NATURE, 1976, 261 (5561) : 552 - 558