Secondary structure prediction with support vector machines

Cited by: 164
Authors
Ward, JJ [1]
McGuffin, LJ [1]
Buxton, BF [1]
Jones, DT [1]
Affiliations
[1] UCL, Dept Comp Sci, Bioinformat Grp, London WC1E 6BT, England
Funding
UK Medical Research Council; UK Biotechnology and Biological Sciences Research Council
DOI
10.1093/bioinformatics/btg223
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: A new method that uses support vector machines (SVMs) to predict protein secondary structure is described and evaluated. The study is designed to develop a reliable prediction method using an alternative technique and to investigate the applicability of SVMs to this type of bioinformatics problem.
Methods: Binary SVMs are trained to discriminate between two structural classes. The binary classifiers are combined in several ways to predict multi-class secondary structure.
Results: The average three-state prediction accuracy per protein (Q3) is estimated by cross-validation to be 77.07 ± 0.26%, with a segment overlap (Sov) score of 73.32 ± 0.39%. The SVM performs similarly to the 'state-of-the-art' PSIPRED prediction method on a non-homologous test set of 121 proteins, despite being trained on substantially fewer examples. A simple consensus of the SVM, PSIPRED and PROFsec achieves significantly higher prediction accuracy than the individual methods.
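Illustrative note: the abstract reports a per-protein three-state accuracy and a "simple consensus" of three predictors, but does not spell out either computation. The sketch below shows one plausible reading, assuming that predictions are strings over the states H (helix), E (strand) and C (coil), that the per-protein score is the percentage of residues predicted correctly, and that the consensus is a per-residue majority vote. The function names and the tie-breaking rule are hypothetical and are not taken from the paper.

```python
from collections import Counter


def q3(predicted: str, observed: str) -> float:
    """Percentage of residues whose predicted state matches the observed one."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


def mean_q3(pairs: list[tuple[str, str]]) -> float:
    """Average the per-protein score over (predicted, observed) pairs."""
    scores = [q3(pred, obs) for pred, obs in pairs]
    return sum(scores) / len(scores)


def consensus(predictions: list[str]) -> str:
    """Per-residue majority vote across methods.

    Assumption: ties go to the earliest-listed method whose state
    reaches the top vote count.
    """
    if len({len(p) for p in predictions}) != 1:
        raise ValueError("all predictions must have the same length")
    result = []
    for column in zip(*predictions):
        counts = Counter(column)
        top = max(counts.values())
        result.append(next(s for s in column if counts[s] == top))
    return "".join(result)
```

For example, `consensus(["HHEC", "HHCC", "HECC"])` combines three hypothetical per-residue predictions into one string, which can then be scored against the observed structure with `q3`.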
Pages: 1650-1655
Page count: 6