THREADING A DATABASE OF PROTEIN CORES

Cited by: 331
Authors
MADEJ, T [1]
GIBRAT, JF [1]
BRYANT, SH [1]
Affiliations
[1] NATL LIB MED, NATL CTR BIOTECHNOL INFORMAT, COMPUTAT BIOL BRANCH, BETHESDA, MD 20894 USA
Keywords
STRUCTURE PREDICTION; FOLD RECOGNITION; PROTEIN THREADING
DOI
10.1002/prot.340230309
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
We present an analysis of 10 blind predictions prepared for a recent conference, "Critical Assessment of Techniques for Protein Structure Prediction." The sequences of these proteins are not detectably similar to those of any protein in the structure database then available, but we attempted, by a threading method, to recognize similarity to known domain folds. Four of the 10 proteins, as we subsequently learned, do indeed show significant similarity to then-known structures. For 2 of these proteins the predictions were accurate, in the sense that a similar structure was at or near the top of the list of threading scores, and the threading alignment agreed well with the corresponding structural alignment. For the best predicted model, the mean alignment error relative to the optimal structural alignment was 2.7 residues, arising entirely from small "register shifts" of strands or helices. In the analysis we attempt to identify factors responsible for these successes and failures. Since our threading method does not use gap penalties, we may readily distinguish between errors arising from our prior definition of the "cores" of known structures and errors arising from inherent limitations in the threading potential. It would appear from the results that successful substructure recognition depends most critically on accurate definition of the "fold" of a database protein. This definition must correctly delineate substructures that are, and are not, likely to be conserved during protein evolution. © 1995 Wiley-Liss, Inc.
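The abstract reports a mean alignment error of 2.7 residues relative to the structural alignment. As a rough illustration of how such a figure could be computed, the sketch below assumes the error is the mean absolute register shift over template core positions aligned in both the threading and the structural alignment. That definition, the function name `mean_alignment_shift`, and the toy data are assumptions made for illustration and are not taken from the paper.

```python
from typing import Dict


def mean_alignment_shift(threading: Dict[int, int],
                         structural: Dict[int, int]) -> float:
    """Mean absolute register shift, in residues.

    Each alignment maps a template core position to the query residue
    index placed on it. Only positions present in both alignments count.
    This definition is an assumption, not the paper's stated formula.
    """
    shared = sorted(set(threading) & set(structural))
    if not shared:
        raise ValueError("no core positions aligned in both alignments")
    total = sum(abs(threading[p] - structural[p]) for p in shared)
    return total / len(shared)


# Hypothetical toy data: one helix aligned exactly, one strand shifted by 2.
structural = {10: 3, 11: 4, 12: 5, 13: 6, 30: 20, 31: 21, 32: 22, 33: 23}
threading = {10: 3, 11: 4, 12: 5, 13: 6, 30: 22, 31: 23, 32: 24, 33: 25}

print(mean_alignment_shift(threading, structural))  # prints 1.0
```

In this toy case four positions agree and four are shifted by two residues, so the mean is 1.0. A uniform offset within one strand or helix, as here, is the kind of "register shift" the abstract refers to.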
Pages: 356-369
Page count: 14
Related papers
42 in total
  • [1] RECOGNITION OF DISTANTLY RELATED PROTEINS THROUGH ENERGY CALCULATIONS
    ABAGYAN, R
    FRISHMAN, D
    ARGOS, P
    [J]. PROTEINS-STRUCTURE FUNCTION AND GENETICS, 1994, 19 (02) : 132 - 140
  • [2] ABOLA EE, 1987, BONN DATA COMMISSION, P107
  • [3] ISSUES IN SEARCHING MOLECULAR SEQUENCE DATABASES
    ALTSCHUL, SF
    BOGUSKI, MS
    GISH, W
    WOOTTON, JC
    [J]. NATURE GENETICS, 1994, 6 (02) : 119 - 129
  • [4] BARBER MJ, 1992, J BIOL CHEM, V267, P6611
  • [5] STATISTICS OF SEQUENCE-STRUCTURE THREADING
    BRYANT, SH
    ALTSCHUL, SF
    [J]. CURRENT OPINION IN STRUCTURAL BIOLOGY, 1995, 5 (02) : 236 - 244
  • [6] AN EMPIRICAL ENERGY FUNCTION FOR THREADING PROTEIN-SEQUENCE THROUGH THE FOLDING MOTIF
    BRYANT, SH
    LAWRENCE, CE
    [J]. PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS, 1993, 16 (01) : 92 - 112
  • [7] PKB - A PROGRAM SYSTEM AND DATA-BASE FOR ANALYSIS OF PROTEIN-STRUCTURE
    BRYANT, SH
    [J]. PROTEINS-STRUCTURE FUNCTION AND GENETICS, 1989, 5 (03) : 233 - 247
  • [8] BRYANT SJ, UNPUB
  • [9] THE RELATION BETWEEN THE DIVERGENCE OF SEQUENCE AND STRUCTURE IN PROTEINS
    CHOTHIA, C
    LESK, AM
    [J]. EMBO JOURNAL, 1986, 5 (04) : 823 - 826
  • [10] NEW PROGRAMS FOR PROTEIN TERTIARY STRUCTURE PREDICTION
    FETROW, JS
    BRYANT, SH
    [J]. BIO-TECHNOLOGY, 1993, 11 (04) : 479 - 484