Improving the quality of protein structure models by selecting from alignment alternatives

Cited by: 16
Authors
Sommer, Ingolf
Toppo, Stefano
Sander, Oliver
Lengauer, Thomas
Tosatto, Silvio C. E.
Affiliations
[1] Max Planck Inst Informat, Dept Computat Biol & Appl Algorithm, D-66123 Saarbrucken, Germany
[2] Univ Padua, Dept Biol Chem, I-35121 Padua, Italy
[3] Univ Padua, Dept Biol, I-35131 Padua, Italy
[4] Univ Padua, CRIBI Biotechnol Ctr, I-35131 Padua, Italy
DOI
10.1186/1471-2105-7-364
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Subject classification codes
071010 ; 081704 ;
Abstract
Background: In protein structure prediction, considerable recent effort has gone into developing Model Quality Assessment Programs (MQAPs), which distinguish high-quality protein structure models from inferior ones. Here, we propose a new method that uses an MQAP to improve the quality of models. Given a target sequence and a template structure, we construct a number of different alignments and the corresponding models for the sequence. The quality of these models is scored with an MQAP, and the score is used to choose the most promising model. An SVM-based selection scheme is suggested for combining MQAP partial potentials in order to optimize model selection.
Results: The approach has been tested on a representative set of proteins. The ability of the method to improve models was validated by comparing the MQAP-selected structures to the native structures with the model quality evaluation program TM-score. Using the SVM-based model selection, a significant increase in model quality is obtained (a Wilcoxon signed rank test yields p-values below 10^-15). The average increase in TM-score is 0.016; the maximum observed increase in TM-score is 0.29.
Conclusion: In template-based protein structure prediction, alignment is known to be a bottleneck limiting the overall model quality. Here we show that a combination of systematic alignment variation and modern model scoring functions can significantly improve the quality of alignment-based models.
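The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the paper combines MQAP partial potentials with an SVM, whereas this sketch substitutes a fixed linear weighting as a stand-in. The names `CandidateModel`, `combined_score`, `select_model` and `mean_gain`, the potential names, and the convention that lower scores are better are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass
class CandidateModel:
    """One structure model built from one alternative alignment (hypothetical type)."""
    alignment_id: str
    # Hypothetical MQAP partial potentials; assumed energy-like (lower is better).
    partial_scores: Dict[str, float]


def combined_score(model: CandidateModel, weights: Dict[str, float]) -> float:
    """Weighted sum of partial potentials; a linear stand-in for the SVM combination."""
    return sum(weights[name] * value for name, value in model.partial_scores.items())


def select_model(candidates: List[CandidateModel],
                 weights: Dict[str, float]) -> CandidateModel:
    """Return the candidate with the lowest (assumed best) combined score."""
    if not candidates:
        raise ValueError("at least one candidate model is required")
    return min(candidates, key=lambda m: combined_score(m, weights))


def mean_gain(selected_tm: Sequence[float], baseline_tm: Sequence[float]) -> float:
    """Average TM-score difference between selected and baseline models per target."""
    if len(selected_tm) != len(baseline_tm) or not selected_tm:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(s - b for s, b in zip(selected_tm, baseline_tm)) / len(selected_tm)


# Illustrative values only; they are not taken from the paper.
candidates = [
    CandidateModel("baseline", {"pairwise": -10.0, "solvation": -2.0}),
    CandidateModel("shifted_gap", {"pairwise": -12.0, "solvation": -1.5}),
    CandidateModel("alt_profile", {"pairwise": -8.0, "solvation": -2.5}),
]
weights = {"pairwise": 1.0, "solvation": 2.0}
best = select_model(candidates, weights)
print(best.alignment_id)
```

In a fuller pipeline, the per-target TM-scores of the selected and baseline models would then be compared with a paired test such as the Wilcoxon signed rank test mentioned in the abstract; `mean_gain` only reports the average difference.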
Pages: 11
Related papers
33 in total
[1]   SCOP database in 2004: refinements integrate structure and sequence family data [J].
Andreeva, A ;
Howorth, D ;
Brenner, SE ;
Hubbard, TJP ;
Chothia, C ;
Murzin, AG .
NUCLEIC ACIDS RESEARCH, 2004, 32 :D226-D229
[2]   The ASTRAL compendium for protein structure and sequence analysis [J].
Brenner, SE ;
Koehl, P ;
Levitt, M .
NUCLEIC ACIDS RESEARCH, 2000, 28 (01) :254-256
[3]   A graph-theory algorithm for rapid protein side-chain prediction [J].
Canutescu, AA ;
Shelenkov, AA ;
Dunbrack, RL .
PROTEIN SCIENCE, 2003, 12 (09) :2001-2014
[4]   The ASTRAL Compendium in 2004 [J].
Chandonia, JM ;
Hon, G ;
Walker, NS ;
Lo Conte, L ;
Koehl, P ;
Levitt, M ;
Brenner, SE .
NUCLEIC ACIDS RESEARCH, 2004, 32 :D189-D192
[5]   LIBSVM: A Library for Support Vector Machines [J].
Chang, Chih-Chung ;
Lin, Chih-Jen .
ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2011, 2 (03)
[6]   Prediction of CASP6 structures using automated Robetta protocols [J].
Chivian, D ;
Kim, DE ;
Malmström, L ;
Schonbrun, J ;
Rohl, CA ;
Baker, D .
PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS, 2005, 61 :157-166
[7]   In silico protein recombination: Enhancing template and sequence alignment selection for comparative protein modelling [J].
Contreras-Moreira, B ;
Fitzjohn, PW ;
Bates, PA .
JOURNAL OF MOLECULAR BIOLOGY, 2003, 328 (03) :593-608
[8]  
Dimitriadou E., 2005, E1071 PACKAGE
[9]   Servers for protein structure prediction [J].
Fischer, D .
CURRENT OPINION IN STRUCTURAL BIOLOGY, 2006, 16 (02) :178-182
[10]   In search for more accurate alignments in the twilight zone [J].
Jaroszewski, L ;
Li, WZ ;
Godzik, A .
PROTEIN SCIENCE, 2002, 11 (07) :1702-1713