Pseudo amino acid composition and multi-class support vector machines approach for conotoxin superfamily classification

Cited by: 129
Authors
Mondal, Sukanta
Bhavna, Rajasekaran
Babu, Rajasekaran Mohan
Ramakumar, Suryanarayanarao [1]
Affiliations
[1] Indian Inst Sci, Dept Phys, Bangalore 560012, Karnataka, India
[2] Indian Inst Sci, Bioinformat Ctr, Bangalore 560012, Karnataka, India
Keywords
hypermutable mature conotoxin; superfamily classification; pseudo-amino acid composition; polarity index; support vector machines (SVMs)
DOI
10.1016/j.jtbi.2006.06.014
Chinese Library Classification
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
Conotoxins are small, disulfide-rich peptides that target a broad spectrum of ion channels and neuronal receptors. They offer promising avenues for the treatment of chronic pain, epilepsy and cardiovascular diseases. Assigning newly sequenced mature conotoxins to the appropriate superfamilies with a computational approach could provide valuable preliminary information on the biological and pharmacological functions of the toxins. However, protein sequence patterns may not be effective for the reliable identification and classification of new conotoxin sequences because mature toxins are hypervariable. With the aim of formulating an in silico approach for classifying conotoxins into superfamilies, we have incorporated the concept of pseudo-amino acid composition to represent a peptide in a mathematical framework that includes the sequence-order effect along with the conventional amino acid composition. The polarity index attribute, which encodes information such as residue surface buriability, polarity, and hydropathy, was used to store the sequence-order effect. Several methods were explored for superfamily identification: BLAST, the ISort (Intimate Sorting) predictor, the least Hamming distance algorithm, the least Euclidean distance algorithm, and multi-class support vector machines (SVMs). The SVMs outperformed the other methods, providing an overall accuracy of 88.1% for all correct predictions with a generalized squared correlation of 0.75 in a jackknife cross-validation test on the A, M, O and T superfamilies and a negative set consisting of short cysteine-rich sequences from different eukaryotes with diverse functions. The computed sensitivity and specificity for the superfamilies were in the ranges 84.0-94.1% and 80.0-95.5%, respectively, attesting to the efficacy of multi-class SVMs for the in silico classification of conotoxins into their superfamilies. (c) 2006 Elsevier Ltd. All rights reserved.
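The feature representation described in the abstract can be illustrated with a minimal sketch. It assumes the commonly used formulation of pseudo-amino acid composition (20 composition terms plus lambda sequence-order correlation terms, combined with a weight factor) and pairs it with a least-Euclidean-distance assignment. The record does not give the polarity index values, so the Kyte-Doolittle hydropathy scale is swapped in as a stand-in property; the function names, the defaults `lam=3` and `weight=0.05`, and the toy peptide strings in the note below are all assumptions made for illustration, not the authors' settings or data.

```python
from collections import Counter
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# ASSUMPTION: Kyte-Doolittle hydropathy stands in for the paper's polarity
# index, whose values are not given in this record.
HYDROPATHY = {
    "A": 1.8, "C": 2.5, "D": -3.5, "E": -3.5, "F": 2.8,
    "G": -0.4, "H": -3.2, "I": 4.5, "K": -3.9, "L": 3.8,
    "M": 1.9, "N": -3.5, "P": -1.6, "Q": -3.5, "R": -4.5,
    "S": -0.8, "T": -0.7, "V": 4.2, "W": -0.9, "Y": -1.3,
}


def normalize_scale(scale):
    """Rescale a property to zero mean and unit variance over the 20 residues."""
    values = [scale[a] for a in AMINO_ACIDS]
    mean = sum(values) / len(values)
    sd = sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return {a: (scale[a] - mean) / sd for a in AMINO_ACIDS}


def pseudo_aac(seq, lam=3, weight=0.05, scale=HYDROPATHY):
    """Return a (20 + lam)-dimensional pseudo-amino acid composition vector.

    lam and weight are hypothetical defaults, not values from the paper.
    """
    seq = seq.upper()
    if any(a not in scale for a in seq):
        raise ValueError("sequence contains a non-standard residue")
    if len(seq) <= lam:
        raise ValueError("sequence must be longer than lam")
    h = normalize_scale(scale)
    counts = Counter(seq)
    # Conventional amino acid composition (frequencies sum to 1).
    freqs = [counts[a] / len(seq) for a in AMINO_ACIDS]
    # Sequence-order terms: mean squared property difference between
    # residues that are k positions apart, for k = 1..lam.
    thetas = []
    for k in range(1, lam + 1):
        pairs = len(seq) - k
        total = sum((h[seq[i]] - h[seq[i + k]]) ** 2 for i in range(pairs))
        thetas.append(total / pairs)
    denom = sum(freqs) + weight * sum(thetas)
    return [f / denom for f in freqs] + [weight * t / denom for t in thetas]


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def nearest_label(query, labelled):
    """Least-Euclidean-distance rule: label of the closest training vector.

    labelled is a list of (vector, label) pairs.
    """
    return min(labelled, key=lambda item: euclidean(query, item[0]))[1]
```

For example, `nearest_label(pseudo_aac("GCCSNPVCHLEHSNLC"), training)` returns the label of the closest entry in `training`, a list of `(pseudo_aac(sequence), label)` pairs built from whatever labelled sequences are available. A jackknife test in this setting would hold out each sequence in turn, classify it against the remaining ones, and tally the correct assignments. The multi-class SVM that the abstract reports as the best performer would consume the same vectors but is not reproduced here.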
Pages: 252-260
Page count: 9