PREDICTION OF PROTEIN-STRUCTURE BY EVALUATION OF SEQUENCE-STRUCTURE FITNESS - ALIGNING SEQUENCES TO CONTACT PROFILES DERIVED FROM 3-DIMENSIONAL STRUCTURES

Cited by: 130
Authors
OUZOUNIS, C
SANDER, C
SCHARF, M
SCHNEIDER, R
Affiliation
[1] Protein Design Group, EMBL
Keywords
PROTEIN STRUCTURE PREDICTION; SEQUENCE-STRUCTURE ALIGNMENT; COMPUTER ALGORITHM; DATABASE; EVOLUTIONARY INFORMATION;
DOI
10.1006/jmbi.1993.1433
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
The problem of protein structure prediction is formulated here as that of evaluating how well an amino acid sequence fits a hypothetical structure. The simplest and most complicated approaches, secondary structure prediction and all-atom free energy calculations, can be viewed as sequence-structure fitness problems. Here, an approach of intermediate complexity is described, which involves: (1) description of a protein structure in terms of contact interface vectors, with both intra-protein and protein-solvent contacts counted; (2) derivation of sequence preferences for 2 to 29 contact interface types; (3) generation of numerous hypothetical model structures by placing the input sequence into a large set of known three-dimensional structures in all possible alignments; (4) evaluation of these models by summing the sequence preferences over all structural positions; and (5) choice of the predicted three-dimensional structure as that with the best sequence-structure fitness. Evolutionary information is incorporated by using position-dependent core weights derived from multiple sequence alignments. A number of tests of the method are performed: (1) evaluation of cyclic shifts of a sequence in its native structure; (2) alignment of a sequence in its native structure, allowing gaps; (3) alignment search with a sequence or sequence fragment in a database of structures; and (4) alignment search with a structure in a database of sequences.
The main results are: (1) a native sequence can very well find its native structure among a large number of alternatives, in correct alignment; (2) substructures, such as (βα)n units, can be detected in spite of very low sequence similarity; (3) remote homologues can be detected, with some dependence on the set of parameters used; (4) contact interface parameters are clearly superior to classical secondary structure parameters; (5) a simple interface description in terms of just two states, protein-protein and protein-water contacts, performs surprisingly well; (6) the use of core weights considerably improves accuracy in detection of remote homologues; (7) based on a sequence database search with a myoglobin contact profile, the C-terminal domain of a viral origin of replication binding protein is predicted to have an all-helical fold. The sequence-structure fitness concept is sufficiently general to accommodate a large variety of protein structure prediction methods, including new models of intermediate complexity currently being developed.
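The scoring idea in the abstract (sum the sequence preferences over all structural positions, weight each position, keep the best-fitting structure) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the two-state labels "buried" and "exposed", the numbers in PREFERENCE, and the function names are hypothetical and are not the paper's parameters or code. The sketch also uses ungapped placements only, whereas the method described above allows gaps.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# Hypothetical two-state preference table (illustrative values only):
# score for finding each residue type in each contact state.
PREFERENCE: Dict[str, Dict[str, float]] = {
    "buried":  {"L": 1.0, "V": 0.8, "A": 0.3, "K": -1.0, "E": -0.8},
    "exposed": {"L": -0.9, "V": -0.6, "A": 0.0, "K": 0.9, "E": 0.8},
}


def fitness(sequence: str, profile: Sequence[str],
            weights: Optional[Sequence[float]] = None) -> float:
    """Sum residue preferences over all positions of one placement.

    `weights` plays the role of position-dependent core weights;
    it defaults to 1.0 at every position.
    """
    if weights is None:
        weights = [1.0] * len(profile)
    if not (len(sequence) == len(profile) == len(weights)):
        raise ValueError("sequence, profile and weights must have equal length")
    return sum(w * PREFERENCE[state].get(res, 0.0)
               for res, state, w in zip(sequence, profile, weights))


def cyclic_shift_scores(sequence: str, profile: Sequence[str]) -> List[float]:
    """Score every cyclic shift of the sequence in a fixed profile."""
    return [fitness(sequence[k:] + sequence[:k], profile)
            for k in range(len(sequence))]


def best_ungapped_placement(sequence: str,
                            profile: Sequence[str]) -> Tuple[float, int]:
    """Slide the sequence along a profile; return (best score, offset)."""
    n = len(sequence)
    if n > len(profile):
        raise ValueError("sequence is longer than the profile")
    scored = [(fitness(sequence, profile[o:o + n]), o)
              for o in range(len(profile) - n + 1)]
    # Highest score wins; ties go to the smallest offset.
    return max(scored, key=lambda t: (t[0], -t[1]))


def predict_fold(sequence: str,
                 library: Dict[str, Sequence[str]]) -> Tuple[str, float, int]:
    """Return (name, score, offset) of the best-fitting library profile."""
    best: Optional[Tuple[str, float, int]] = None
    for name, profile in library.items():
        if len(profile) < len(sequence):
            continue  # skip structures too short for an ungapped placement
        score, offset = best_ungapped_placement(sequence, profile)
        if best is None or score > best[1]:
            best = (name, score, offset)
    if best is None:
        raise ValueError("no profile in the library is long enough")
    return best


if __name__ == "__main__":
    b, e = "buried", "exposed"
    toy_library = {"foldA": [b, b, e, e, b, e, e], "foldB": [e, e, e, b, b, b]}
    print(predict_fold("LVKEA", toy_library))
    print(cyclic_shift_scores("LVKEA", [b, b, e, e, b]))
```

In this toy setup the unshifted sequence scores highest among its cyclic shifts, which mirrors the first test listed in the abstract. A fuller treatment would replace the sliding window with a gapped dynamic-programming alignment and derive the preference table from a structure database.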
Pages: 805-825
Page count: 21