Exploring the limits of nearest neighbour secondary structure prediction

Cited by: 72
Authors
Levin, JM
Affiliation
[1] Unité de Bioinformatique, Bat. de Biotechnologie, INRA
Source
PROTEIN ENGINEERING | 1997 / Vol. 10 / Issue 07
Keywords
amino acid; prediction; protein; secondary structure; sequence;
DOI
10.1093/protein/10.7.771
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Subject classification code
071010 ; 081704 ;
Abstract
This paper presents a simple and robust secondary structure prediction scheme (SIMPA96) based on an updated version of the nearest neighbour method. Using a larger database of known structures, the Blosum 62 substitution matrix and a regularization algorithm, the three-state prediction accuracy is increased by 4.7 percentage points to 67.7% for a single sequence and up to 72.8% when using multiple alignments. The increase in prediction accuracy with respect to the previous version can be almost entirely ascribed to the sevenfold increase in the size of the database. A more detailed analysis of the results shows that badly predicted regions of a protein sequence are randomly distributed throughout the database and that the goal of perfect secondary structure predictions by methods that use only local sequence information is illusory.
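As a rough illustration of the nearest neighbour idea named in the abstract, the sketch below predicts a three-state structure (H, E, C) for each residue by comparing a sequence window against windows from proteins of known structure and taking a majority vote among the best-scoring matches. It is not SIMPA96: the match/mismatch score stands in for the Blosum 62 matrix, the window width and number of neighbours are arbitrary, and the regularization step and multiple-alignment input are omitted. All function and parameter names are hypothetical.

```python
from collections import Counter


def window_score(a, b, match=2, mismatch=-1):
    # Toy similarity score (assumption): a real scheme would look up
    # each residue pair in a substitution matrix such as Blosum 62.
    return sum(match if x == y else mismatch for x, y in zip(a, b))


def predict(query, database, half_width=2, k=3):
    """Predict one state (H, E or C) per residue of `query`.

    `database` is a list of (sequence, structure) pairs of equal length.
    """
    width = 2 * half_width + 1

    # Every database window is labelled with the state of its central residue.
    windows = []
    for seq, ss in database:
        for i in range(len(seq) - width + 1):
            windows.append((seq[i:i + width], ss[i + half_width]))

    # Pad the query so that terminal residues also get a full window.
    pad = "-" * half_width
    padded = pad + query + pad

    prediction = []
    for i in range(len(query)):
        q = padded[i:i + width]
        # Keep the k most similar database windows (the nearest neighbours).
        nearest = sorted(windows, key=lambda w: window_score(q, w[0]),
                         reverse=True)[:k]
        votes = Counter(state for _, state in nearest)
        prediction.append(votes.most_common(1)[0][0])
    return "".join(prediction)


def q3(predicted, observed):
    # Three-state accuracy: percentage of residues predicted correctly.
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


if __name__ == "__main__":
    toy_db = [("AAAAAAA", "HHHHHHH"), ("VVVVVVV", "EEEEEEE")]
    pred = predict("AAAAA", toy_db)
    print(pred, q3(pred, "HHHHH"))
```

Because each prediction depends only on a short local window, a scheme of this kind cannot see long-range contacts, which is the limitation the abstract points to when it calls perfect prediction from local sequence information illusory.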
Pages: 771-776
Number of pages: 6