COMBINING EVOLUTIONARY INFORMATION AND NEURAL NETWORKS TO PREDICT PROTEIN SECONDARY STRUCTURE

Cited by: 1332
Authors
ROST, B
SANDER, C
Affiliation
[1] Protein Design Group, European Molecular Biology Laboratory, Heidelberg
Keywords
SECONDARY STRUCTURE PREDICTION; PREDICTION OF SECONDARY STRUCTURE CLASS; PREDICTION OF SECONDARY STRUCTURE CONTENT; EVOLUTIONARY INFORMATION; MULTIPLE ALIGNMENT PROFILES;
DOI
10.1002/prot.340190108
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Subject classification code
071010 ; 081704 ;
Abstract
Using evolutionary information contained in multiple sequence alignments as input to neural networks, secondary structure can be predicted at significantly increased accuracy. Here, we extend our previous three-level system of neural networks by using additional input information derived from multiple alignments. Using a position-specific conservation weight as part of the input increases performance. Using the number of insertions and deletions reduces the tendency for overprediction and increases overall accuracy. Addition of the global amino acid content yields a further improvement, mainly in predicting structural class. The final network system has a sustained overall accuracy of 71.6% in a multiple cross-validation test on 126 unique protein chains. A test on a new set of 124 recently solved protein structures that have no significant sequence similarity to the learning set confirms the high level of accuracy. The average cross-validated accuracy for all 250 sequence-unique chains is above 72%. Using various data sets, the method is compared to alternative prediction methods, some of which also use multiple alignments: the performance advantage of the network system is at least 6 percentage points in three-state accuracy. In addition, the network estimates secondary structure content from multiple sequence alignments about as well as circular dichroism spectroscopy on a single protein and classifies 75% of the 250 proteins correctly into one of four protein structural classes. Of particular practical importance is the definition of a position-specific reliability index. For 40% of all residues the method has a sustained three-state accuracy of 88%, as high as the overall average for homology modelling. A further strength of the method is greatly increased accuracy in predicting the placement of secondary structure segments. © 1994 Wiley-Liss, Inc.
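The abstract reports two kinds of figures: an overall three-state accuracy and an accuracy restricted to the residues with the most reliable predictions. The sketch below illustrates how such figures might be computed from per-residue predictions. It is not the authors' code: the function names, the state letters (H, E, L), the integer reliability scale, and the toy data are all assumptions made for illustration.

```python
def q3_accuracy(predicted, observed):
    """Fraction of residues whose predicted state matches the observed state."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have equal length")
    if not predicted:
        raise ValueError("sequences must be non-empty")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return correct / len(predicted)


def q3_above_reliability(predicted, observed, reliability, threshold):
    """Three-state accuracy on residues with reliability >= threshold.

    Returns (accuracy, coverage), where coverage is the fraction of all
    residues that pass the threshold. Returns (None, 0.0) if none pass.
    """
    kept = [(p, o) for p, o, r in zip(predicted, observed, reliability)
            if r >= threshold]
    if not kept:
        return None, 0.0
    accuracy = sum(p == o for p, o in kept) / len(kept)
    coverage = len(kept) / len(predicted)
    return accuracy, coverage


# Toy data, invented for illustration only (not taken from the paper).
# H = helix, E = strand, L = other; the reliability scale is hypothetical.
observed = "HHHHLLEEEL"
predicted = "HHHLLLEEHL"
reliability = [9, 8, 7, 2, 6, 5, 8, 7, 1, 4]

overall = q3_accuracy(predicted, observed)
accuracy, coverage = q3_above_reliability(predicted, observed, reliability, 5)
print(f"overall Q3 = {overall:.2f}")
print(f"Q3 at reliability >= 5: {accuracy:.2f} on {coverage:.0%} of residues")
```

Raising the threshold trades coverage for accuracy, which is the kind of trade-off the abstract describes when it reports 88% accuracy on 40% of all residues.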
Pages: 55-72
Number of pages: 18