Prediction of MHC class I binding peptides, using SVMHC (art. no. 25)

Cited by: 205
Authors
Dönnes, P
Elofsson, A [1]
Affiliations
[1] Stockholm Univ, SCFAB, Stockholm Bioinformat Ctr, SE-10691 Stockholm, Sweden
[2] Univ Saarland, Ctr Bioinformat Saar, D-66041 Saarbrucken, Germany
Keywords
MHC class I; peptide prediction; machine learning; support vector machines
DOI
10.1186/1471-2105-3-25
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Background: T-cells are key players in regulating a specific immune response. Activation of cytotoxic T-cells requires recognition of specific peptides bound to Major Histocompatibility Complex (MHC) class I molecules. MHC-peptide complexes are potential tools for the diagnosis and treatment of pathogens and cancer, as well as for the development of peptide vaccines. Only one in 100 to 200 potential binders actually binds to a given MHC molecule; a good prediction method for MHC class I binding peptides can therefore reduce the number of candidate binders that need to be synthesized and tested.
Results: Here, we present a novel approach, SVMHC, based on support vector machines, to predict the binding of peptides to MHC class I molecules. This method seems to perform slightly better than two profile-based methods, SYFPEITHI and HLA_BIND. The implementation of SVMHC is quite simple and does not involve any manual steps, so as more data become available it is trivial to provide predictions for more MHC types. SVMHC currently contains predictions for 26 MHC class I types from the MHCPEP database or, alternatively, 6 MHC class I types from the higher-quality SYFPEITHI database. The prediction models for these MHC types are implemented in a public web service available at [http://www.sbc.su.se/svmhc/].
Conclusions: Prediction of MHC class I binding peptides using support vector machines shows high performance and is easy to apply to a large number of MHC class I types. As more peptide data are added to MHC databases, SVMHC can easily be updated to give predictions for additional MHC class I types. We suggest that at least 20 binding peptide sequences are needed for SVM training.
Pages: 8