ProVal: A protein-scoring function for the selection of native and near-native folds

Cited by: 5
Authors
Berglund, A
Head, RD
Welsh, EA
Marshall, GR
Affiliations
[1] Washington Univ, Sch Med, Ctr Computat Biol, St Louis, MO 63110 USA
[2] Pfizer Corp, Computat Biol Grp, St Louis, MO USA
Keywords
protein folding; empirical scoring function; structure prediction; partial least squares; PLS;
DOI
10.1002/prot.10523
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
A low-resolution scoring function has been developed for selecting native and near-native structures from a set of predicted structures for a given protein sequence. The scoring function, ProVal (Protein Validate), uses several variables, each describing an aspect of protein structure for which proximity to the native structure can be assessed quantitatively. The parameters include a packing estimate, surface areas, and the contact order. Using these parameters as independent variables, a partial least squares for latent variables (PLS) model was built for each of the 28 decoy sets of candidate structures generated for 22 different proteins. The C-alpha RMS of each candidate structure versus the experimental structure was used as the dependent variable. The final generalized scoring function was an average of all the derived models, ensuring that the function was not optimized for a specific fold class or for a specific method of generating the candidate folds. The crystal structure was scored best in 64% of the 28 test sets and was clearly separated from the decoys in many examples. In every case where the crystal structure did not rank first, it still ranked within the top 10%. Thus, although the inherently low resolution of ProVal prevents it from distinguishing between predicted structures of similar overall fold quality, it can clearly be used as a primary filter to remove approximately 90% of the fold candidates generated by current prediction methods before all-atom modeling and further evaluation. The correlation between the predicted and actual C-alpha RMS values varies considerably between the candidate fold sets.
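The averaging and filtering steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each per-decoy-set model has already been reduced to a linear form (an intercept followed by one weight per descriptor), that "averaging the models" means averaging those coefficients, and that a lower predicted C-alpha RMS is better. The function names, the two-descriptor layout, and all numbers are hypothetical and chosen only for illustration; the PLS fitting itself is not shown.

```python
def average_models(models):
    """Average linear models coefficient by coefficient.

    Each model is a list [intercept, w1, ..., wk]. Treating the generalized
    scoring function as the mean of these coefficients is an assumption.
    """
    n_coef = len(models[0])
    if any(len(m) != n_coef for m in models):
        raise ValueError("all models must have the same number of coefficients")
    return [sum(m[i] for m in models) / len(models) for i in range(n_coef)]


def predict_rms(model, descriptors):
    """Predicted C-alpha RMS for one candidate from its descriptor values."""
    intercept, weights = model[0], model[1:]
    if len(weights) != len(descriptors):
        raise ValueError("descriptor count does not match the model")
    return intercept + sum(w * x for w, x in zip(weights, descriptors))


def keep_best_fraction(candidates, model, fraction=0.10):
    """Return the names of the best-scoring fraction of candidates.

    candidates maps a name to its descriptor list. Candidates are ranked by
    ascending predicted RMS, and at least one is always kept.
    """
    ranked = sorted(candidates, key=lambda name: predict_rms(model, candidates[name]))
    n_keep = max(1, round(fraction * len(ranked)))
    return ranked[:n_keep]


# Hypothetical per-set models: [intercept, weight_1, weight_2].
per_set_models = [[1.0, 2.0, -1.0], [3.0, 0.0, 1.0]]
generalized = average_models(per_set_models)

# Hypothetical candidates, each with two made-up descriptor values.
candidates = {f"decoy_{i}": [float(i), 0.0] for i in range(1, 10)}
candidates["native"] = [0.5, 3.0]

shortlist = keep_best_fraction(candidates, generalized, fraction=0.10)
```

With ten candidates and `fraction=0.10`, the shortlist holds a single structure; in practice the retained fraction would then go on to all-atom modeling, as the abstract suggests.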
Pages: 289-302
Number of pages: 14