Prediction of the conformation and geometry of loops in globular proteins: Testing ArchDB, a structural classification of loops

Cited by: 15
Authors
Fernandez-Fuentes, N
Querol, E
Aviles, FX
Sternberg, MJE
Oliva, B
Affiliations
[1] Univ Pompeu Fabra, Struct Bioinformat Grp, GRIB, IMIM, Catalonia 08003, Spain
[2] Univ Autonoma Barcelona, Inst Biomed & Biotechnol, Bellaterra, Spain
[3] Univ London Imperial Coll Sci & Technol, Struct Bioinformat Grp, Dept Biol Sci, London, England
Keywords
loop structure prediction; fold recognition; comparative modeling; sequence profiles
DOI
10.1002/prot.20516
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular biology]
Subject classification code
071010; 081704
Abstract
In protein structure prediction, a central problem is defining the structure of a loop connecting two secondary structures. This problem arises frequently in homology modeling, in fold recognition, and in several strategies for ab initio structure prediction. In our previous work, we developed ArchDB, a classification database of structural motifs. The database contains 12,665 clustered loops in 451 structural classes, with information about the angles in the loops, and 1,492 structural subclasses, with the relative locations of the bracing secondary structures. Here we evaluate the extent to which sequence information in the loop database can be used to predict loop structure. Two sequence profiles were used: an HMM profile and a PSSM derived from PSI-BLAST. A jack-knife test was performed by removing homologous loops according to the SCOP superfamily definition and then predicting against recalculated profiles that take into account only the sequence information. Two scenarios were considered: (1) prediction of structural class, with application in comparative modeling, and (2) prediction of structural subclass, with application in fold recognition and ab initio prediction. In the first scenario, structural class prediction was made directly over loops with X-ray secondary structure assignment; if we consider the top 20 classes out of 451 possible classes, the best prediction accuracy is 78.5%. In the second scenario, structural subclass prediction was made over loops whose boundaries were defined by PSI-PRED (Jones, J Mol Biol 1999;292:195-202) secondary structure prediction; if we take into account the top 20 subclasses out of 1,492, the best accuracy is 46.7%. The accuracy of loop prediction was also evaluated by means of RMSD calculations.
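The "top 20" accuracies quoted in the abstract can be read as top-k accuracy: a loop counts as correctly predicted when its true class appears among the k best-scoring candidate classes. The following is a minimal sketch of that metric under this reading, not the authors' evaluation code. The function name `top_k_accuracy`, its arguments, and the toy class labels are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def top_k_accuracy(
    ranked_predictions: Sequence[Sequence[str]],
    true_classes: Sequence[str],
    k: int = 20,
) -> float:
    """Fraction of loops whose true class is among the first k ranked classes.

    ranked_predictions[i] is an assumed, best-first ranking of candidate
    class labels for loop i; true_classes[i] is its known class label.
    """
    if len(ranked_predictions) != len(true_classes):
        raise ValueError("one ranking is required per true class label")
    if not true_classes:
        raise ValueError("at least one loop is required")
    if k < 1:
        raise ValueError("k must be a positive integer")

    # Count a hit when the true label occurs within the top-k slice.
    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, true_classes)
        if truth in ranked[:k]
    )
    return hits / len(true_classes)


if __name__ == "__main__":
    # Toy labels only; these are not real ArchDB class identifiers.
    rankings = [["c1", "c2", "c3"], ["c2", "c3", "c1"], ["c3", "c1", "c2"]]
    truths = ["c1", "c1", "c2"]
    print(top_k_accuracy(rankings, truths, k=2))
```

With k=20 and 451 candidate classes, a score of 0.785 under this definition would correspond to the 78.5% reported for the first scenario, assuming that is how the ranking was scored.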
Pages: 746-757
Page count: 12