Disulfide connectivity prediction with 70% accuracy using two-level models

Cited by: 19
Authors
Chen, Bo-Juen
Tsai, Chi-Hung
Chan, Chen-hsiung
Kao, Cheng-Yan [1]
Affiliations
[1] Natl Taiwan Univ, Dept Comp Sci & Informat Engn, Taipei 106, Taiwan
[2] Inst Informat Ind, Taipei, Taiwan
Keywords
disulfide connectivity; support vector machine; disulfide bonding pattern; hierarchical model
DOI
10.1002/prot.20972
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
Disulfide bridges stabilize protein structures covalently and play an important role in protein folding. Accurate prediction of disulfide connectivity is therefore a useful step toward protein structure prediction. Previous methods for disulfide connectivity prediction either infer the bonding potential of cysteine pairs or rank alternative disulfide bonding patterns, and they encode data accordingly: by cysteine pair (pair-wise) or by disulfide bonding pattern (pattern-wise). However, neither encoding scheme alone fully exploits both the local and the global information in a protein, which limits the accuracy of these methods. In this work, we propose a novel two-level framework for predicting disulfide connectivity that uses both the pair-wise and the pattern-wise encoding schemes. Our models were validated on datasets derived from SWISS-PROT 39 and 43, and the results demonstrate that they combine local and global information. Compared with previous methods, our models achieve significant improvements. Our work may also provide insights into further improvements in disulfide connectivity prediction and increase its applicability in protein structure analysis and prediction.
Pages: 246-252
Number of pages: 7