Benchmarking template selection and model quality assessment for high-resolution comparative modelling

Cited by: 24
Authors
Sadowski, M. I. [1]
Jones, D. T. [1]
Affiliations
[1] UCL, Dept Comp Sci, Bioinformat Unit, London WC1E 6BT, England
Keywords
protein structure prediction; homology modeling; bioinformatics; profile-profile alignment; high-resolution modeling; MQAP
DOI
10.1002/prot.21531
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Comparative modeling is presently the most accurate method of protein structure prediction. Previous experiments have shown the selection of the correct template to be of paramount importance to the quality of the final model. We have derived a set of 732 targets for which a choice of ten or more templates exists with 30-80% sequence identity and used this set to compare a number of possible methods for template selection: BLAST, PSI-BLAST, profile-profile alignment, HHpred HMM-HMM comparison, global sequence alignment, and the use of a model quality assessment program (MQAP). In addition, we have investigated whether any structurally defined subset of the sequence could be used to predict template quality better than overall sequence similarity. We find that template selection by BLAST is sufficient in 75% of cases but that there are examples in which improvement (global RMSD of 0.5 Å or more) could be made. No significant improvement is found for any of the more sophisticated sequence-based methods of template selection at high sequence identities. A subset of 118 targets extending to the lowest levels of sequence similarity was examined, and the HHpred and MQAP methods were found to improve ranking when available templates had 35-40% maximum sequence identity. Structurally defined subsets in general are found to be less discriminative than overall sequence similarity, with the coil residue subset performing equivalently to sequence similarity. Finally, we demonstrate that if models are built and model quality is assessed in combination with the sequence-template sequence similarity, an extra 7% of "best" models can be found. © 2007 Wiley-Liss, Inc.
Pages: 476-485
Page count: 10