Reduced bio basis function neural network for identification of protein phosphorylation sites: comparison with pattern recognition algorithms

Cited by: 42
Authors
Berry, EA
Dalby, AR
Yang, ZR
Affiliations
[1] Univ Exeter, Sch Engn Comp Sci & Math, Dept Comp Sci, Exeter EX4 4PT, Devon, England
[2] Univ Exeter, Sch Biol & Chem Sci, Dept Biol Sci, Exeter EX4 4PT, Devon, England
Keywords
protein phosphorylation; neural networks; genetic algorithm; pattern recognition
DOI
10.1016/j.compbiolchem.2003.11.005
Chinese Library Classification
Q [Biological Sciences]
Subject classification codes
07; 0710; 09
Abstract
Protein phosphorylation is a post-translational modification performed by a group of enzymes known as the protein kinases or phosphotransferases (Enzyme Commission classification 2.7). It is essential to the correct functioning of both proteins and cells, being involved in enzyme control, cell signalling and apoptosis. The major problem when attempting to predict these sites is the broad substrate specificity of the enzymes. This study employs back-propagation neural networks (BPNNs), the decision tree algorithm C4.5 and the reduced bio-basis function neural network (rBBFNN) to predict phosphorylation sites. The aim is to compare the prediction efficiency of the three algorithms for this problem and to examine their knowledge extraction capability. All three algorithms are effective for phosphorylation site prediction. Results indicate that rBBFNN is the fastest and most sensitive of the algorithms, BPNN has the highest area under the ROC curve and is therefore the most robust, and C4.5 has the highest prediction accuracy. C4.5 also reveals that the amino acid two residues upstream from the phosphorylation site is important for serine/threonine phosphorylation, whilst the amino acid three residues upstream is important for tyrosine phosphorylation. (C) 2003 Elsevier Ltd. All rights reserved.
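The abstract ranks the three algorithms by sensitivity, area under the ROC curve and prediction accuracy. As a reminder of how these three measures differ, the following is a minimal sketch in plain Python. It is not the authors' code: the function names, the 0.5 decision threshold and the toy labels and scores are all assumptions made for illustration, and the AUC is computed with the standard pairwise-ranking (Mann-Whitney) formulation rather than whatever procedure the paper used.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def sensitivity(y_true, y_pred):
    """Fraction of true sites (label 1) that are predicted as sites."""
    tp, _, _, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn)


def accuracy(y_true, y_pred):
    """Fraction of all candidates that are classified correctly."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)


def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = phosphorylation site, 0 = non-site.
y_true = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]                    # made-up classifier outputs
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(sensitivity(y_true, y_pred), accuracy(y_true, y_pred), roc_auc(y_true, scores))
```

Sensitivity and accuracy depend on the chosen threshold, whereas the ROC area summarises ranking quality across all thresholds, which is why one classifier can lead on one measure and trail on another.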
Pages: 75-85
Number of pages: 11