A study of quality measures for protein threading models

Cited by: 169
Authors
Cristobal, Susana [1 ]
Zemla, Adam [2 ]
Fischer, Daniel [3 ]
Rychlewski, Leszek [4 ]
Elofsson, Arne [5 ]
Affiliations
[1] BMC Uppsala Univ, Cell & Mol Biol Dept, SE-75124 Uppsala, Sweden
[2] Lawrence Livermore Natl Lab, Livermore, CA 94550 USA
[3] Ben Gurion Univ Negev, Dept Bioinformat Comp Sci, IL-84015 Beer Sheva, Israel
[4] Int Inst Mol & Cell Biol, PL-02109 Warsaw, Poland
[5] Stockholm Univ, Stockholm Bioinformat Ctr, SE-10691 Stockholm, Sweden
Funding
Swedish Research Council
Keywords
DOI
10.1186/1471-2105-2-5
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Background: Prediction of protein structures is one of the fundamental challenges in biology today. To fully understand how well different prediction methods perform, it is necessary to use measures that evaluate their performance. Every two years, starting in 1994, the CASP (Critical Assessment of protein Structure Prediction) process has been organized to evaluate the ability of different predictors to blindly predict the structure of proteins. To capture different features of the models, several measures have been developed during the CASP processes. However, these measures have not been examined in detail before. In an attempt to develop fully automatic measures that can be used in CASP, as well as in other types of benchmarking experiments, we have compared twenty-one measures. These include the measures used in CASP3 and CASP2 as well as measures introduced later. We have studied their ability to distinguish between the better and worse models submitted to CASP3 and the correlation between them. Results: Using a small set of 1340 models for 23 different targets, we show that most methods correlate with each other. Most pairs of measures show a correlation coefficient of about 0.5. The correlation is slightly higher for measures of similar types. We found that a significant problem when developing automatic measures is how to deal with proteins of different length. Comparison between different measures is also complicated, as many measures depend on the size of the target. We show that the manual assessment can be reproduced to about 70% using automatic measures. Alignment-independent measures detect slightly more of the models with the correct fold, while alignment-dependent measures agree better when selecting the best models for each target. Finally, we show that using automatic measures would, to a large extent, reproduce the assessors' ranking of the predictors at CASP3.
Conclusions: We show that, given a sufficient number of targets, the manual and automatic measures would have given almost identical results at CASP3. If the intent is to reproduce the type of scoring done by the manual assessor in CASP3, the best approach might be to use a combination of alignment-independent and alignment-dependent measures, as used in several recent studies.
Pages: 15