Isoelectric point optimization using peptide descriptors and support vector machines

Cited by: 30
Authors
Perez-Riverol, Yasset [1 ,4 ]
Audain, Enrique [2 ]
Millan, Aleli [1 ]
Ramos, Yassel [1 ]
Sanchez, Aniel [1 ]
Vizcaino, Juan Antonio [4 ]
Wang, Rui [4 ]
Mueller, Markus [3 ]
Machado, Yoan J. [2 ]
Betancourt, Lazaro H. [1 ]
Gonzalez, Luis J. [1 ]
Padron, Gabriel [1 ]
Besada, Vladimir [1 ]
Affiliations
[1] Ctr Genet Engn & Biotechnol, Dept Prote, Havana, Cuba
[2] Ctr Mol Immunol, Dept Prote, Havana, Cuba
[3] Swiss Inst Bioinformat, Proteome Informat Grp, CH-1211 Geneva, Switzerland
[4] European Bioinformat Inst, EMBL Outstn, Cambridge, England
Keywords
Isoelectric point; Support vector machine; Peptide descriptors; TANDEM MASS-SPECTROMETRY; IMMOBILIZED PH GRADIENTS; AMINO-ACID-SEQUENCES; SHOTGUN PROTEOMICS; PREDICTION; ACCURACY; PROTEINS; IDENTIFICATION; DATABASE;
DOI
10.1016/j.jprot.2012.01.029
Chinese Library Classification
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
IPG (Immobilized pH Gradient) based separations are frequently used as the first step in shotgun proteomics methods; they yield an increase in both the dynamic range and resolution of peptide separation prior to the LC-MS analysis. Experimental isoelectric point (pI) values can improve peptide identifications in conjunction with MS/MS information. Thus, accurate estimation of the pI value based on the amino acid sequence becomes critical to perform these kinds of experiments. Nowadays, pI is commonly predicted using the charge-state model [1] and/or the cofactor algorithm [2]. However, none of these methods is capable of calculating the pI value for basic peptides accurately. In this manuscript, we present a new approach that can significantly improve the pI estimation by using Support Vector Machines (SVM) [3], an experimental amino acid descriptor taken from the AAIndex database [4], and the isoelectric point predicted by the charge-state model. Our results have shown a strong correlation (R² = 0.98) between the predicted and observed values, with a standard deviation of 0.32 pH units across the complete pH range. © 2012 Elsevier B.V. All rights reserved.
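The charge-state model mentioned in the abstract can be illustrated with a minimal sketch: the net charge of a peptide is summed from Henderson-Hasselbalch terms for each ionizable group, and the pI is the pH at which that sum crosses zero. The pKa values and the function names `net_charge` and `charge_state_pi` below are assumptions chosen for illustration; they are not the parameter set or code used in the paper.

```python
# Illustrative pKa values (assumed for this sketch, not the paper's parameters).
PKA_POSITIVE = {"K": 10.8, "R": 12.5, "H": 6.5}            # basic side chains
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}   # acidic side chains
PKA_N_TERM = 8.6
PKA_C_TERM = 3.6


def net_charge(sequence: str, ph: float) -> float:
    """Net charge of a peptide at a given pH (Henderson-Hasselbalch)."""
    # Terminal groups: the N-terminus carries +1 when protonated,
    # the C-terminus carries -1 when deprotonated.
    charge = 1.0 / (1.0 + 10 ** (ph - PKA_N_TERM))
    charge -= 1.0 / (1.0 + 10 ** (PKA_C_TERM - ph))
    for residue in sequence:
        if residue in PKA_POSITIVE:
            charge += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[residue]))
        elif residue in PKA_NEGATIVE:
            charge -= 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[residue] - ph))
    return charge


def charge_state_pi(sequence: str, low: float = 0.0, high: float = 14.0,
                    tol: float = 1e-4) -> float:
    """Estimate pI by bisection on the net charge, which decreases with pH."""
    while high - low > tol:
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0:
            low = mid    # still positively charged: pI lies at higher pH
        else:
            high = mid   # neutral or negative: pI lies at lower pH
    return (low + high) / 2.0
```

In the approach described in the abstract, an estimate of this kind is one input feature, alongside an AAIndex amino acid descriptor, to a support vector machine that predicts the final pI; the regression step is not sketched here.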
Pages: 2269-2274
Page count: 6
Related papers
26 in total
[11]   Efficient fractionation and improved protein identification by peptide OFFGEL electrophoresis [J].
Hoerth, Patric ;
Miller, Christine A. ;
Preckel, Tobias ;
Wenz, Christian .
MOLECULAR & CELLULAR PROTEOMICS, 2006, 5 (10) :1968-1974
[12]  
Karatzoglou A, 2009, COMPUT STAT DATA ANA
[13]   AAindex: amino acid index database, progress report 2008 [J].
Kawashima, Shuichi ;
Pokarowski, Piotr ;
Pokarowska, Maria ;
Kolinski, Andrzej ;
Katayama, Toshiaki ;
Kanehisa, Minoru .
NUCLEIC ACIDS RESEARCH, 2008, 36 :D202-D205
[14]   Empirical statistical model to estimate the accuracy of peptide identifications made by MS/MS and database search [J].
Keller, A ;
Nesvizhskii, AI ;
Kolker, E ;
Aebersold, R .
ANALYTICAL CHEMISTRY, 2002, 74 (20) :5383-5392
[15]   Building Predictive Models in R Using the caret Package [J].
Kuhn, Max .
JOURNAL OF STATISTICAL SOFTWARE, 2008, 28 (05) :1-26
[16]   Machine learning in bioinformatics [J].
Larranaga, Pedro ;
Calvo, Borja ;
Santana, Roberto ;
Bielza, Concha ;
Galdiano, Josu ;
Inza, Inaki ;
Lozano, Jose A. ;
Armananzas, Ruben ;
Santafe, Guzman ;
Perez, Aritz ;
Robles, Victor .
BRIEFINGS IN BIOINFORMATICS, 2006, 7 (01) :86-112
[17]   Prediction of the isoelectric point of an amino acid based on GA-PLS and SVMs [J].
Liu, HX ;
Zhang, RS ;
Yao, XJ ;
Liu, MC ;
Hu, ZD ;
Fan, BT .
JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 2004, 44 (01) :161-167
[18]   Effect of training datasets on support vector machine prediction of protein-protein interactions [J].
Lo, SL ;
Cai, CZ ;
Chen, YZ ;
Chung, MCM .
PROTEOMICS, 2005, 5 (04) :876-884
[19]   In silico analysis of accurate proteomics, complemented by selective isolation of peptides [J].
Perez-Riverol, Yasset ;
Sanchez, Aniel ;
Ramos, Yassel ;
Schmidt, Alex ;
Mueller, Markus ;
Betancourt, Lazaro ;
Gonzalez, Luis J. ;
Vera, Roberto ;
Padron, Gabriel ;
Besada, Vladimir .
JOURNAL OF PROTEOMICS, 2011, 74 (10) :2071-2082
[20]   Determination of the isoelectric point of proteins by capillary isoelectric focusing [J].
Righetti, PG .
JOURNAL OF CHROMATOGRAPHY A, 2004, 1037 (1-2) :491-499