N-Terminal myristoylation predictions by ensembles of neural networks

Cited by: 171
Authors
Bologna, G
Yvon, C
Duvaud, S
Veuthey, AL
Affiliations
[1] Swiss Inst Bioinformat, CH-1211 Geneva 4, Switzerland
[2] Univ Bern, Inst Physiol, Bern, Switzerland
Keywords
decision trees; myristoylation; neural networks
DOI
10.1002/pmic.200300783
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
N-terminal myristoylation is a post-translational modification in which a myristate group is added to a glycine at the N-terminal end of the amino acid chain. This work presents neural network (NN) models that learn to discriminate between myristoylated and nonmyristoylated proteins. Ensembles of 25 NNs and decision trees were trained on 390 positive sequences and 327 negative sequences. Experiments showed that NN ensembles were more accurate than decision tree ensembles. Our NN predictor, evaluated by the leave-one-out procedure, obtained a false positive error rate of 2.1%, better than the 22.3% false positive error rate of the PROSITE pattern for myristoylation. On a recent version of Swiss-Prot (41.2), the NN ensemble predicted 876 myristoylated proteins, whereas the PROSITE pattern for myristoylation predicted 1150. Finally, the NN predictor gave results similar to those of the well-known NMT predictor. Our tool is available at http://www.expasy.org/tools/myristoylator/myristoylator.html.
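The abstract names three ingredients: an ensemble of classifiers, leave-one-out evaluation, and a false positive error rate. The record does not describe how the paper implements any of them, so the following is only a minimal sketch under stated assumptions: ensemble members are combined by simple majority vote, the false positive error rate is taken as the fraction of negative sequences predicted positive, and the three toy rules are hypothetical stand-ins for trained networks, not the authors' models.

```python
from typing import Callable, List, Sequence, Tuple

# A classifier maps a protein sequence to True (predicted myristoylated) or False.
Classifier = Callable[[str], bool]


def majority_vote(members: Sequence[Classifier], sequence: str) -> bool:
    """Return True when more than half of the ensemble members vote positive.

    Assumption: plain majority voting; the paper may combine outputs differently.
    """
    positive_votes = sum(1 for member in members if member(sequence))
    return 2 * positive_votes > len(members)


def false_positive_rate(predict: Classifier, negatives: Sequence[str]) -> float:
    """Fraction of known negative sequences that are predicted positive."""
    if not negatives:
        raise ValueError("need at least one negative sequence")
    false_positives = sum(1 for seq in negatives if predict(seq))
    return false_positives / len(negatives)


def leave_one_out(
    train: Callable[[List[Tuple[str, bool]]], Classifier],
    data: Sequence[Tuple[str, bool]],
) -> List[bool]:
    """Predict each example with a model trained on all the other examples."""
    predictions = []
    for i, (sequence, _label) in enumerate(data):
        held_in = [example for j, example in enumerate(data) if j != i]
        predictions.append(train(held_in)(sequence))
    return predictions


# Hypothetical toy rules standing in for trained networks (illustration only).
toy_members: List[Classifier] = [
    lambda s: s.startswith("G"),
    lambda s: s.startswith("G") and len(s) > 5 and s[5] in "ST",
    lambda s: len(s) > 1 and s[1] != "P",
]


def toy_ensemble(sequence: str) -> bool:
    """Majority vote over the three toy rules above."""
    return majority_vote(toy_members, sequence)
```

In use, `false_positive_rate(toy_ensemble, negatives)` scores the ensemble on a list of known non-myristoylated sequences, and `leave_one_out` takes any training function that returns a classifier, so the same evaluation loop would apply to a neural network ensemble or a decision tree ensemble.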
Pages: 1626-1632
Number of pages: 7