Improving the prediction of protein secondary structure in three and eight classes using recurrent neural networks and profiles

Cited by: 557
Authors
Pollastri, G
Przybylski, D
Rost, B
Baldi, P [1]
Affiliations
[1] Univ Calif Irvine, Dept Informat & Comp Sci, Inst Genom & Bioinformat, Irvine, CA 92697 USA
[2] Columbia Univ, Dept Biochem & Mol Biophys, CUBIC, New York, NY USA
Keywords
recurrent neural networks; profiles; evolutionary information; PSI-BLAST
DOI
10.1002/prot.10082
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Secondary structure predictions are increasingly becoming the workhorse for several methods aiming at predicting protein structure and function. Here we use ensembles of bidirectional recurrent neural network architectures, PSI-BLAST-derived profiles, and a large nonredundant training set to derive two new predictors: (a) the second version of the SSpro program for secondary structure classification into three categories and (b) the first version of the SSpro8 program for secondary structure classification into the eight classes produced by the DSSP program. We describe the results of three different test sets on which SSpro achieved a sustained performance of about 78% correct prediction. We report confusion matrices, compare PSI-BLAST to BLAST-derived profiles, and assess the corresponding performance improvements. SSpro and SSpro8 are implemented as web servers, available together with other structural feature predictors at: http://promoter.ics.uci.edu/BRNN-PRED/.
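
The abstract refers to three-class and eight-class (DSSP) labels, per-residue accuracy, and confusion matrices. A minimal Python sketch of how such quantities are commonly computed follows. The eight-to-three mapping below is one widely used convention and is an assumption here, since this record does not state the reduction used by SSpro; the function names and the example strings are hypothetical.

```python
from collections import Counter

# Assumed reduction of the eight DSSP states to three classes
# (H = helix, E = strand, C = coil). Other conventions exist.
DSSP8_TO_3 = {
    "H": "H", "G": "H", "I": "H",
    "E": "E", "B": "E",
    "T": "C", "S": "C", "-": "C",
}


def reduce_to_three(ss8):
    """Map a string of eight-state DSSP labels to three-state labels."""
    return "".join(DSSP8_TO_3[label] for label in ss8)


def confusion_matrix(observed, predicted, classes="HEC"):
    """Count residues by (observed class, predicted class)."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have equal length")
    counts = Counter(zip(observed, predicted))
    return {o: {p: counts[(o, p)] for p in classes} for o in classes}


def per_residue_accuracy(observed, predicted):
    """Fraction of residues whose predicted class equals the observed one."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have equal length")
    matches = sum(o == p for o, p in zip(observed, predicted))
    return matches / len(observed)


# Hypothetical example data, not taken from the paper.
observed3 = reduce_to_three("HHHGGT-EEB")
predicted3 = "HHHHCCCEEC"
matrix = confusion_matrix(observed3, predicted3)
accuracy = per_residue_accuracy(observed3, predicted3)
```

With three classes, `per_residue_accuracy` corresponds to the measure usually called Q3; the "about 78%" figure in the abstract is a percentage of correctly predicted residues of this kind.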
Pages: 228-235
Number of pages: 8