Benchmarking consensus model quality assessment for protein fold recognition

Cited by: 55
Authors
McGuffin, Liam J. [1]
Affiliations
[1] Univ Reading, Sch Biol Sci, Reading RG6 6AS, Berks, England
Keywords
DOI
10.1186/1471-2105-8-345
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Background: Selecting the highest quality 3D model of a protein structure from a number of alternatives remains an important challenge in the field of structural bioinformatics. Many Model Quality Assessment Programs (MQAPs) have been developed which adopt various strategies in order to tackle this problem, ranging from the so-called "true" MQAPs, capable of producing a single energy score based on a single model, to methods which rely on structural comparisons of multiple models or additional information from meta-servers. However, it is clear that no current method can consistently separate the highest accuracy models from the lowest. In this paper, a number of the top performing MQAP methods are benchmarked in the context of the potential value that they add to protein fold recognition. Two novel methods are also described: ModSSEA, which is based on the alignment of predicted secondary structure elements, and ModFOLD, which combines several true MQAP methods using an artificial neural network.

Results: The ModSSEA method is found to be an effective model quality assessment program for ranking multiple models from many servers; however, further accuracy can be gained by using the consensus approach of ModFOLD. The ModFOLD method is shown to significantly outperform the true MQAPs tested and is competitive with methods which make use of clustering or additional information from multiple servers. Several of the true MQAPs are also shown to add value to most individual fold recognition servers by improving model selection when applied as a post filter to re-rank models.

Conclusion: MQAPs should be benchmarked appropriately for the practical context in which they are intended to be used. Clustering based methods are the top performing MQAPs where many models are available from many servers; however, they often do not add value to individual fold recognition servers when limited models are available. Conversely, the true MQAP methods tested can often be used as effective post filters for re-ranking few models from individual fold recognition servers, and further improvements can be achieved using a consensus of these methods.
Pages: 15
Related papers
27 in total
[1]   BASIC LOCAL ALIGNMENT SEARCH TOOL [J].
ALTSCHUL, SF ;
GISH, W ;
MILLER, W ;
MYERS, EW ;
LIPMAN, DJ .
JOURNAL OF MOLECULAR BIOLOGY, 1990, 215 (03) :403-410
[2]   VERIFY3D: Assessment of protein models with three-dimensional profiles [J].
Eisenberg, D ;
Luthy, R ;
Bowie, JU .
MACROMOLECULAR CRYSTALLOGRAPHY, PT B, 1997, 277 :396-404
[3]   A composite score for predicting errors in protein structure models [J].
Eramian, David ;
Shen, Min-Yi ;
Devos, Damien ;
Melo, Francisco ;
Sali, Andrej ;
Marti-Renom, Marc A. .
PROTEIN SCIENCE, 2006, 15 (07) :1653-1666
[4]   Servers for protein structure prediction [J].
Fischer, D .
CURRENT OPINION IN STRUCTURAL BIOLOGY, 2006, 16 (02) :178-182
[5]   3D-Jury: a simple approach to improve protein structure predictions [J].
Ginalski, K ;
Elofsson, A ;
Fischer, D ;
Rychlewski, L .
BIOINFORMATICS, 2003, 19 (08) :1015-1018
[6]   Errors in protein structures [J].
Hooft, RWW ;
Vriend, G ;
Sander, C ;
Abola, EE .
NATURE, 1996, 381 (6580) :272-272
[7]   Prediction of novel and analogous folds using fragment assembly and fold recognition [J].
Jones, DT ;
Bryson, K ;
Coleman, A ;
McGuffin, LJ ;
Sadowski, MI ;
Sodhi, JS ;
Ward, JJ .
PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS, 2005, 61 :143-151
[8]  
JONES RG, 1992, HUMAN RESOURCE MANAG, V2, P195
[9]   DICTIONARY OF PROTEIN SECONDARY STRUCTURE - PATTERN-RECOGNITION OF HYDROGEN-BONDED AND GEOMETRICAL FEATURES [J].
KABSCH, W ;
SANDER, C .
BIOPOLYMERS, 1983, 22 (12) :2577-2637
[10]   A novel approach to decoy set generation: Designing a physical energy function having local minima with native structure characteristics [J].
Keasar, C ;
Levitt, M .
JOURNAL OF MOLECULAR BIOLOGY, 2003, 329 (01) :159-174