Prediction of α-turns in proteins using PSI-BLAST profiles and secondary structure information

Cited by: 42
Authors
Kaur, H [1]
Raghava, GPS [1]
Affiliation
[1] Inst Microbial Technol, Bioinformat Ctr, Chandigarh, India
Keywords
neural networks; multiple alignment; tight turns; web server; Weka; PEBLS
DOI
10.1002/prot.10569
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification code
071010; 081704
Abstract
In this paper, a systematic attempt has been made to develop a better method for predicting α-turns in proteins. Most of the commonly used approaches in the field of protein structure prediction have been tried in this study, including a statistical approach, the "Sequence Coupled Model", and three machine learning approaches: (i) artificial neural networks (ANN); (ii) Weka (Waikato Environment for Knowledge Analysis) classifiers; and (iii) Parallel Exemplar Based Learning (PEBLS). We have also used multiple sequence alignments obtained from PSI-BLAST and secondary structure information predicted by PSIPRED. All methods were trained and tested on a data set of 193 non-homologous protein X-ray structures using five-fold cross-validation. The ANN with multiple sequence alignment and predicted secondary structure information outperforms the other methods. Based on these observations, we have developed an ANN-based method for predicting α-turns in proteins. The main components of the method are two feed-forward backpropagation networks with a single hidden layer. The first, a sequence-to-structure network, is trained on the multiple sequence alignment in the form of PSI-BLAST-generated position-specific scoring matrices. The initial predictions from the first network, together with the PSIPRED-predicted secondary structure, are used as input to the second, a structure-to-structure network, which refines the predictions of the first. The final network yields an overall prediction accuracy of 78.0% and an MCC of 0.16. A web server, AlphaPred (http://www.imtech.res.in/raghava/alphapred/), has been developed based on this approach. © 2004 Wiley-Liss, Inc.
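The abstract reports both an overall accuracy and a Matthews correlation coefficient (MCC). The two can diverge sharply when one class is much rarer than the other. The sketch below computes both measures from a confusion matrix using the standard MCC definition. The function name `accuracy_and_mcc` and the counts are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import math


def accuracy_and_mcc(tp, tn, fp, fn):
    """Return (accuracy, MCC) for a binary confusion matrix.

    tp/tn/fp/fn are counts of true positives, true negatives,
    false positives and false negatives.
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    # Standard MCC denominator; it is zero when any row or column
    # of the confusion matrix is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, mcc


# Hypothetical, imbalanced example: 100 positive residues out of 1000.
# These counts are illustrative assumptions, not results from the paper.
acc, mcc = accuracy_and_mcc(tp=40, tn=760, fp=140, fn=60)
print(f"accuracy = {acc:.3f}, MCC = {mcc:.3f}")
```

With counts like these, accuracy is high mainly because the majority class is predicted well, while MCC stays low because it also penalizes errors on the rare class.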
Pages: 83-90
Number of pages: 8