Automated assignment of SCOP and CATH protein structure classifications from FSSP scores

Cited by: 30
Authors
Getz, G
Vendruscolo, M
Sachs, D
Domany, E [1]
Affiliations
[1] Weizmann Inst Sci, Dept Phys & Complex Syst, IL-76100 Rehovot, Israel
[2] Oxford Ctr Mol Sci, New Chem Lab, Oxford OX1 3QY, England
[3] Princeton Univ, Dept Phys, Princeton, NJ 08544 USA
Keywords
protein structure; protein databases; CATH; FSSP; SCOP; classification; clustering
DOI
10.1002/prot.1176
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification code
071010; 081704
Abstract
We present an automated procedure to assign CATH and SCOP classifications to proteins whose FSSP score is available. CATH classification is assigned down to the topology level, and SCOP classification is assigned to the fold level. Because the FSSP database is updated weekly, this method makes it possible to update CATH and SCOP with the same frequency. Our predictions have a nearly perfect success rate when ambiguous cases are discarded. These ambiguous cases are intrinsic to any protein structure classification that relies on structural information alone. Hence, we introduce the "twilight zone for structure classification." We further suggest that resolving these ambiguous cases requires additional classification criteria based on sequence and function information. © 2002 Wiley-Liss, Inc.
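The abstract does not spell out the assignment algorithm, so the following is only an illustrative sketch of the general idea it describes: transfer a known classification to a query protein from its structural-similarity scores, and discard cases where two competing classes score too closely. The function name `assign_fold`, the nearest-neighbor rule, and the thresholds `min_z` and `min_margin` are all assumptions made for illustration; they are not taken from the paper.

```python
def assign_fold(z_scores, known_labels, min_z=2.0, min_margin=1.0):
    """Assign a classification label to a query protein, or None if ambiguous.

    z_scores:     dict of neighbor protein id -> similarity Z-score to the query
    known_labels: dict of protein id -> already-known label (e.g. a fold id)
    min_z, min_margin: hypothetical thresholds, chosen here only for illustration
    """
    # Keep the best score seen for each candidate label.
    best = {}
    for pid, z in z_scores.items():
        label = known_labels.get(pid)
        if label is None or z < min_z:
            continue  # unlabeled neighbor, or similarity too weak to use
        if label not in best or z > best[label]:
            best[label] = z

    if not best:
        return None  # no usable labeled neighbor

    ranked = sorted(best.items(), key=lambda kv: kv[1], reverse=True)
    # Treat the case as ambiguous when the runner-up label scores nearly as well.
    if len(ranked) > 1 and ranked[0][1] - ranked[1][1] < min_margin:
        return None
    return ranked[0][0]


known = {"a": "3.40.50", "b": "3.40.50", "c": "1.10.8"}
clear_case = assign_fold({"a": 12.0, "b": 9.5, "c": 3.0}, known)
ambiguous_case = assign_fold({"a": 5.0, "c": 4.6}, known)
```

In this toy example the first query has one dominant class and receives its label, while the second has two classes with nearly equal scores and is left unassigned, which mirrors the abstract's point that some cases cannot be resolved from structural scores alone.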
Pages: 405-415
Page count: 11