Improved residue contact prediction using support vector machines and a large feature set

Cited by: 215
Authors
Cheng, Jianlin [1]
Baldi, Pierre [2]
Affiliations
[1] Univ Cent Florida, Sch Elect Engn & Comp Sci, Orlando, FL 32816 USA
[2] Univ Calif Irvine, Sch Informat & Comp Sci, Irvine, CA 92617 USA
DOI
10.1186/1471-2105-8-113
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Background: Predicting protein residue-residue contacts is an important 2D prediction task. It is useful for ab initio structure prediction and for understanding protein folding. Despite steady progress over the past decade, contact prediction remains largely unsolved.
Results: Here we develop a new contact map predictor (SVMcon) that uses support vector machines to predict medium- and long-range contacts. SVMcon integrates profiles, secondary structure, relative solvent accessibility, contact potentials, and other useful features. On the same test data set, SVMcon's accuracy is 4% higher than that of the latest version of the CMAPpro contact map predictor. SVMcon recently participated in the seventh edition of the Critical Assessment of Techniques for Protein Structure Prediction (CASP7) experiment and was evaluated along with seven other contact map predictors. SVMcon was ranked as one of the top predictors, yielding the second best coverage and accuracy for contacts with sequence separation >= 12 on 13 de novo domains.
Conclusion: We describe SVMcon, a new contact map predictor that uses SVMs and a large set of informative features. SVMcon yields good performance on medium- to long-range contact predictions and can be modularly incorporated into a structure prediction pipeline.
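The evaluation terms used in the abstract (contacts at sequence separation >= 12, accuracy, coverage) can be illustrated with a minimal sketch. It is not SVMcon's code. It assumes a contact means two residues whose representative atoms lie closer than 8 Å (the threshold is an assumption, not stated above), that accuracy is the fraction of predicted contacts that are true, and that coverage is the fraction of true contacts that were predicted. The function names and toy coordinates are hypothetical.

```python
import math


def contact_pairs(coords, threshold=8.0, min_sep=12):
    """Return the set of residue pairs (i, j), i < j, that are in contact.

    coords: one (x, y, z) tuple per residue (assumed representative atom).
    threshold: distance cutoff in angstroms (assumed value, not from the abstract).
    min_sep: minimum sequence separation j - i (>= 12 as in the abstract).
    """
    pairs = set()
    n = len(coords)
    for i in range(n):
        for j in range(i + min_sep, n):
            if math.dist(coords[i], coords[j]) < threshold:
                pairs.add((i, j))
    return pairs


def accuracy_and_coverage(predicted, true):
    """Accuracy = correct / predicted; coverage = correct / true (assumed definitions)."""
    correct = len(predicted & true)
    accuracy = correct / len(predicted) if predicted else 0.0
    coverage = correct / len(true) if true else 0.0
    return accuracy, coverage


# Toy chain: 13 residues on a straight line, plus one residue folded back near the start.
coords = [(3.8 * i, 0.0, 0.0) for i in range(13)] + [(0.0, 5.0, 0.0)]
true_contacts = contact_pairs(coords)

# Hypothetical predictor output: one correct pair and one incorrect pair.
predicted_contacts = {(0, 13), (0, 12)}
accuracy, coverage = accuracy_and_coverage(predicted_contacts, true_contacts)
print(sorted(true_contacts), accuracy, coverage)
```

In this toy case only residues 0 and 1 come within the cutoff of residue 13 at the required separation, so one of the two predicted pairs is correct and one of the two true contacts is recovered.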
Pages: 9