Prediction of mitochondrial proteins based on genetic algorithm - partial least squares and support vector machine

Cited by: 41
Authors
Tan, F. [1]
Feng, X. [1]
Fang, Z. [1]
Li, M. [1]
Guo, Y. [1]
Jiang, L. [1]
Affiliations
[1] Sichuan Univ, Coll Chem, Chengdu 610064, Peoples R China
Keywords
mitochondrial proteins; dipeptide composition; genetic algorithm-partial least square; support vector machine;
DOI
10.1007/s00726-006-0465-0
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010; 081704;
Abstract
Mitochondria are essential cell organelles of eukaryotes. Hence, it is vitally important to develop an automated and reliable method for timely identification of novel mitochondrial proteins. In this study, mitochondrial proteins were encoded by dipeptide composition; then, the genetic algorithm-partial least squares (GA-PLS) method was used to identify the dipeptide composition elements that are more important in recognizing mitochondrial proteins; finally, the selected elements were passed to support vector machine (SVM)-based classifiers to predict mitochondrial proteins. All models were trained and validated with the jackknife cross-validation test. The prediction accuracy is 85%, suggesting that the method performs reasonably well in predicting mitochondrial proteins. Our results strongly imply that not all dipeptide compositions are informative and indispensable for predicting mitochondrial proteins. The MATLAB source code and the dataset are available on request from liml@scu.edu.cn.
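The dipeptide composition encoding mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' MATLAB code: it assumes the common definition in which each of the 400 ordered residue pairs is counted along the sequence and divided by the number of counted pairs. The names `dipeptide_composition` and `select_features` are hypothetical, and the indices passed to `select_features` stand in for whatever subset a GA-PLS step would return.

```python
from itertools import product

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered residue pairs, in a fixed order ("AA", "AC", ..., "YY").
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence):
    """Return a 400-element list of dipeptide frequencies for one sequence.

    Assumption: pairs containing a non-standard residue (e.g. 'X') are
    skipped and do not contribute to the normalising total.
    """
    seq = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        # Sequence too short or made only of non-standard residues.
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


def select_features(vector, selected_indices):
    """Keep only the elements at the given indices (hypothetical helper).

    In the pipeline described in the abstract, the indices would come from
    a feature-selection step such as GA-PLS; here they are supplied by hand.
    """
    return [vector[i] for i in selected_indices]


if __name__ == "__main__":
    features = dipeptide_composition("MLSRAVCGTSRQ")
    print(len(features))                       # number of features
    print(select_features(features, [0, 1, 20]))
```

The reduced vectors produced by `select_features` are what a classifier such as an SVM would then be trained on, with each protein left out once in turn for a jackknife (leave-one-out) test.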
Pages: 669-675
Number of pages: 7