Designing and evaluating the MULTICOM protein local and global model quality prediction methods in the CASP10 experiment

Cited by: 35
Authors
Cao, Renzhi [1 ]
Wang, Zheng [4 ]
Cheng, Jianlin [1 ,2 ,3 ]
Affiliations
[1] Univ Missouri, Dept Comp Sci, Columbia, MO 65211 USA
[2] Univ Missouri, Inst Informat, Columbia, MO 65211 USA
[3] Univ Missouri, Christopher S Bond Life Sci Ctr, Columbia, MO 65211 USA
[4] Univ So Mississippi, Sch Comp, Hattiesburg, MS 39406 USA
Funding
US National Science Foundation;
Keywords
Protein model quality assessment; Protein model quality assurance program; Protein structure prediction; Support vector machine; Clustering; SINGLE; MUFOLD;
DOI
10.1186/1472-6807-14-13
Chinese Library Classification number
Q6 [Biophysics];
Subject classification code
071011 [Biophysics];
Abstract
Background: Protein model quality assessment is an essential component of generating and using protein structural models. During the Tenth Critical Assessment of Techniques for Protein Structure Prediction (CASP10), we developed and tested four automated methods (MULTICOM-REFINE, MULTICOM-CLUSTER, MULTICOM-NOVEL, and MULTICOM-CONSTRUCT) that predicted both the local and global quality of protein structural models.
Results: MULTICOM-REFINE was a clustering approach that used the average pairwise structural similarity between models to measure global quality, and the average Euclidean distance between a model and several top-ranked models to measure local quality. MULTICOM-CLUSTER and MULTICOM-NOVEL were two new support vector machine-based methods for predicting both the local and global quality of a single protein model. MULTICOM-CONSTRUCT was a new weighted pairwise model comparison (clustering) method that used the weighted average similarity between models in a pool to measure global model quality. Our experiments showed that the pairwise model assessment methods worked better when a large portion of the models in the pool were of good quality, whereas the single-model quality assessment methods performed better on some hard targets for which only a small portion of the models in the pool were of reasonable quality.
Conclusions: Because picking out a few good models from a large pool of low-quality models is a major challenge in protein structure prediction, single-model quality assessment methods appear poised to make important contributions to protein structure modeling. A further finding was that single-model quality assessment scores could be used to weight the models in the consensus pairwise model comparison method to improve its accuracy.
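The pairwise (consensus) scoring idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the MULTICOM implementation: the function name `consensus_scores`, the toy similarity matrix, and the weighting formula (a weighted mean over all other models in the pool) are assumptions made for illustration only. The sketch assumes a precomputed symmetric matrix of structural similarities between models, with larger values meaning more similar, and optionally one single-model quality score per model used as a weight.

```python
from typing import List, Optional, Sequence


def consensus_scores(
    similarity: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> List[float]:
    """Score each model by its (weighted) average similarity to the other models.

    similarity[i][j] is an assumed, precomputed structural similarity between
    models i and j. weights[j] is a hypothetical per-model weight, for example
    a single-model quality score; None means every model counts equally.
    """
    n = len(similarity)
    if n < 2:
        raise ValueError("at least two models are needed for a pairwise score")
    if weights is None:
        weights = [1.0] * n  # unweighted case: plain average pairwise similarity
    if len(weights) != n:
        raise ValueError("one weight per model is required")

    scores = []
    for i in range(n):
        # Exclude the model itself so its self-similarity does not inflate the score.
        others = [j for j in range(n) if j != i]
        total_weight = sum(weights[j] for j in others)
        if total_weight <= 0:
            raise ValueError("weights of the other models must sum to a positive value")
        weighted_sum = sum(weights[j] * similarity[i][j] for j in others)
        scores.append(weighted_sum / total_weight)
    return scores


# Toy pool of three models (values invented for illustration).
toy_similarity = [
    [1.0, 0.8, 0.2],
    [0.8, 1.0, 0.3],
    [0.2, 0.3, 1.0],
]
plain = consensus_scores(toy_similarity)
weighted = consensus_scores(toy_similarity, weights=[0.9, 0.8, 0.1])
```

In the weighted call, the third model has a low weight, so its disagreement with the first two models lowers their scores less than in the unweighted call. This mirrors, in a simplified form, the abstract's observation that single-model scores can be used to weight a pairwise comparison.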
Pages: 12