Learning a peptide-protein binding affinity predictor with kernel ridge regression

Cited by: 32
Authors
Giguere, Sebastien [1]
Marchand, Mario [1]
Laviolette, Francois [1]
Drouin, Alexandre [1]
Corbeil, Jacques [2]
Affiliations
[1] Univ Laval, Dept Comp Sci & Software Engn, Quebec City, PQ, Canada
[2] Univ Laval, Dept Mol Med, Quebec City, PQ, Canada
Source
BMC BIOINFORMATICS | 2013 / Vol. 14
Funding
Natural Sciences and Engineering Research Council of Canada; Canada Foundation for Innovation;
Keywords
STRING KERNELS; SYSTEMS; MODEL;
DOI
10.1186/1471-2105-14-82
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
Background: The cellular function of the vast majority of proteins is performed through physical interactions with other biomolecules, which, most of the time, are other proteins. Peptides represent templates of choice for mimicking a secondary structure in order to modulate protein-protein interactions. They are thus an interesting class of therapeutics, since they also display strong activity, high selectivity, low toxicity and few drug-drug interactions. Furthermore, predicting peptides that would bind to specific MHC alleles would be of tremendous benefit for improving vaccine-based therapy and possibly generating antibodies with greater affinity. Modern computational methods have the potential to accelerate and lower the cost of drug and vaccine discovery by selecting potential compounds for testing in silico prior to biological validation.
Results: We propose a specialized string kernel for small bio-molecules, peptides and pseudo-sequences of binding interfaces. The kernel incorporates physico-chemical properties of amino acids and elegantly generalizes eight kernels, including the Oligo, the Weighted Degree, the Blended Spectrum, and the Radial Basis Function kernels. We provide a low-complexity dynamic programming algorithm for the exact computation of the kernel and a linear-time algorithm for its approximation. Combined with kernel ridge regression and SupCK, a novel binding pocket kernel, the proposed kernel yields biologically relevant predictions with good accuracy on the PepX database. For the first time, a machine learning predictor is capable of predicting the binding affinity of any peptide to any protein with reasonable accuracy. The method was also applied to both single-target and pan-specific Major Histocompatibility Complex class II benchmark datasets and three Quantitative Structure Affinity Model benchmark datasets.
Conclusion: On all benchmarks, our method significantly (p-value <= 0.057) outperforms the current state-of-the-art methods at predicting peptide-protein binding affinities. The proposed approach is flexible and can be applied to predict any quantitative biological activity. Moreover, generating reliable peptide-protein binding affinities will also improve systems biology modelling of interaction pathways. Lastly, the method should be of value to a large segment of the research community, with the potential to accelerate the discovery of peptide-based drugs and facilitate vaccine development. The proposed kernel is freely available at http://graal.ift.ulaval.ca/downloads/gs-kernel/.
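Illustrative note: the abstract pairs a string kernel with kernel ridge regression. The sketch below shows that general combination only; it is not the authors' kernel or their released code. The function `toy_string_kernel` is a hypothetical, simplified position-weighted substring kernel (exact substring matches, no physico-chemical properties), and the peptides, affinity values, and parameters `max_len`, `sigma_p`, and `lam` are made up for the example.

```python
import numpy as np

def toy_string_kernel(x, y, max_len=2, sigma_p=1.0):
    """Hypothetical toy kernel: sums over all pairs of identical substrings
    (length 1..max_len), down-weighted by how far apart their positions are."""
    total = 0.0
    for l in range(1, max_len + 1):
        for i in range(len(x) - l + 1):
            for j in range(len(y) - l + 1):
                if x[i:i + l] == y[j:j + l]:
                    total += np.exp(-((i - j) ** 2) / (2.0 * sigma_p ** 2))
    return total

def gram_matrix(seqs_a, seqs_b, **kernel_args):
    """Kernel values between every sequence in seqs_a and every one in seqs_b."""
    return np.array([[toy_string_kernel(a, b, **kernel_args) for b in seqs_b]
                     for a in seqs_a])

def fit_krr(K, y, lam=1.0):
    """Kernel ridge regression: solve (K + lam * I) alpha = y."""
    return np.linalg.solve(K + lam * np.eye(len(y)), y)

def predict_krr(K_test_train, alpha):
    """Prediction f(x) = sum_i alpha_i * k(x_i, x)."""
    return K_test_train @ alpha

# Made-up training data (not from the paper or any database).
train_peptides = ["ACDK", "ACDR", "WWFY", "WYFY"]
train_affinity = np.array([1.0, 0.9, -1.0, -0.8])

K_train = gram_matrix(train_peptides, train_peptides)
alpha = fit_krr(K_train, train_affinity, lam=0.1)
pred = predict_krr(gram_matrix(["ACDH"], train_peptides), alpha)
```

In this toy setting the query `"ACDH"` shares substrings only with the first two training peptides, so its predicted value is pulled toward their (positive) labels. A kernel that also compares amino-acid properties, as the abstract describes, would additionally give credit to similar but non-identical residues.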
Pages: 16