Support vector machines for prediction of protein subcellular location by incorporating quasi-sequence-order effect

Cited by: 98
Authors
Cai, YD
Liu, XJ
Xu, XB
Chou, KC
Affiliations
[1] Chinese Acad Sci, Shanghai Res Ctr Biotechnol, Shanghai 200233, Peoples R China
[2] Univ Edinburgh, Inst Cell Anim & Populat Biol, Edinburgh EH9 3JT, Midlothian, Scotland
[3] Univ Wales Coll Cardiff, Dept Comp Sci, Cardiff CF2 3XF, S Glam, Wales
[4] Upjohn Co, Upjohn Labs, Comp Aided Drug Discovery, Kalamazoo, MI 49001 USA
Keywords
Support vector machines; protein subcellular location; quasi-sequence-order effect
DOI
10.1002/jcb.10030
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
A support vector machine (SVM), a type of learning machine, was applied to predict the subcellular location of proteins by incorporating the quasi-sequence-order effect (Chou [2000] Biochem. Biophys. Res. Commun. 278:477-483). In this study, proteins are classified into the following 12 groups: (1) chloroplast, (2) cytoplasm, (3) cytoskeleton, (4) endoplasmic reticulum, (5) extracellular, (6) Golgi apparatus, (7) lysosome, (8) mitochondria, (9) nucleus, (10) peroxisome, (11) plasma membrane, and (12) vacuole, which account for most organelles and subcellular compartments in an animal or plant cell. The SVM method was examined by self-consistency and jackknife tests on three sets consisting of 1,911, 2,044, and 2,191 proteins. The rates of correct prediction in the self-consistency and jackknife tests were 94 and 83% for the 1,911 proteins, 92 and 78% for the 2,044 proteins, and 89 and 75% for the 2,191 proteins, respectively. Furthermore, tests on three independent datasets containing 2,148, 2,417, and 2,494 proteins gave correct prediction rates of 84, 77, and 74%, respectively. © 2001 Wiley-Liss, Inc.
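The two evaluation protocols named in the abstract can be illustrated with a minimal sketch. In a self-consistency test the classifier is trained and scored on the same set; in a jackknife (leave-one-out) test each protein is held out in turn and predicted by a classifier trained on the rest. The sketch below is not the authors' method: it assumes a nearest-centroid classifier as a simple stand-in for the SVM, and the two-dimensional feature vectors in `TOY` are invented for illustration rather than taken from the paper. All function names are hypothetical.

```python
from collections import defaultdict


def fit_centroids(samples):
    """Return {label: mean feature vector} for a list of (features, label)."""
    sums, counts = {}, defaultdict(int)
    for x, y in samples:
        if y not in sums:
            sums[y] = [0.0] * len(x)
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def predict(centroids, x):
    """Assign x to the label whose centroid is nearest (squared Euclidean)."""
    def sqdist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda y: sqdist(centroids[y]))


def self_consistency_rate(samples):
    """Train on all samples, then score on the same samples."""
    centroids = fit_centroids(samples)
    hits = sum(int(predict(centroids, x) == y) for x, y in samples)
    return hits / len(samples)


def jackknife_rate(samples):
    """Leave each sample out in turn, train on the rest, predict the held-out one."""
    hits = 0
    for i, (x, y) in enumerate(samples):
        training = samples[:i] + samples[i + 1:]
        centroids = fit_centroids(training)
        hits += int(predict(centroids, x) == y)
    return hits / len(samples)


# Invented toy data (assumption): two features per "protein", two locations.
TOY = [
    ([0.0, 0.0], "cytoplasm"),
    ([0.2, 0.1], "cytoplasm"),
    ([0.1, 0.3], "cytoplasm"),
    ([1.0, 1.0], "nucleus"),
    ([0.9, 1.2], "nucleus"),
    ([0.5, 0.5], "nucleus"),  # borderline point near the class boundary
]

if __name__ == "__main__":
    print("self-consistency:", self_consistency_rate(TOY))
    print("jackknife:", jackknife_rate(TOY))
```

On this toy set the borderline point is classified correctly only when it contributes to its own class centroid, so the jackknife rate comes out lower than the self-consistency rate. This is the same direction as the gap reported in the abstract, although the toy numbers themselves carry no meaning.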
Pages: 343-348
Page count: 6
Related papers
22 in total
  • [1] [Anonymous], 1999, INT C MACH LEARN ICM
  • [2] BURBIDGE R, 2000, P AISB 00 S ART INT, P1
  • [3] Is it a paradox or misinterpretation?
    Cai, YD
    [J]. PROTEINS-STRUCTURE FUNCTION AND GENETICS, 2001, 43 (03) : 336 - 338
  • [4] Relation between amino acid composition and cellular location of proteins
    Cedano, J
    Aloy, P
    PerezPons, JA
    Querol, E
    [J]. JOURNAL OF MOLECULAR BIOLOGY, 1997, 266 (03) : 594 - 600
  • [5] Using discriminant function for prediction of subcellular location of prokaryotic proteins
    Chou, KC
    Elrod, DW
    [J]. BIOCHEMICAL AND BIOPHYSICAL RESEARCH COMMUNICATIONS, 1998, 252 (01) : 63 - 68
  • [6] Prediction of protein subcellular locations by incorporating quasi-sequence-order effect
    Chou, KC
    [J]. BIOCHEMICAL AND BIOPHYSICAL RESEARCH COMMUNICATIONS, 2000, 278 (02) : 477 - 483
  • [7] Chou KC, 1999, PROTEINS, V34, P137, DOI 10.1002/(SICI)1097-0134(19990101)34:1<137::AID-PROT11>3.0.CO;2-O
  • [8] Protein subcellular location prediction
    Chou, KC
    Elrod, DW
    [J]. PROTEIN ENGINEERING, 1999, 12 (02) : 107 - 118
  • [9] PREDICTION OF PROTEIN STRUCTURAL CLASSES
    CHOU, KC
    ZHANG, CT
    [J]. CRITICAL REVIEWS IN BIOCHEMISTRY AND MOLECULAR BIOLOGY, 1995, 30 (04) : 275 - 349