Statistical potentials for fold assessment

Cited by: 284
Authors
Melo, F [1]
Sánchez, R [1]
Sali, A [1]
Affiliations
[1] Rockefeller Univ, Pels Family Ctr Biochem & Struct Biol, Labs Mol Biophys, New York, NY 10021 USA
Keywords
model evaluation; comparative modeling; fold assignment; fold assessment; statistical potentials; large scale protein structure modeling
DOI
10.1002/pro.110430
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
A protein structure model generally needs to be evaluated to assess whether or not it has the correct fold. To improve fold assessment, four types of residue-level statistical potential were optimized: distance-dependent, contact, Phi/Psi dihedral angle, and accessible surface statistical potentials. Approximately 10,000 test models with correct and incorrect folds were built by automated comparative modeling of protein sequences of known structure. The criterion used to discriminate between the correct and incorrect models was the Z-score of the model energy. The performance of a Z-score was determined as a function of many variables in the derivation and use of the corresponding statistical potential. The performance was measured by the fractions of correctly and incorrectly assessed test models. The most discriminating combination of the four tested potentials is the sum of the normalized distance-dependent and accessible surface potentials. The distance-dependent potential that is optimal for assessing models of all sizes uses both C-alpha and C-beta atoms as interaction centers, distinguishes between all 20 standard residue types, has a distance range of 30 Å, and is derived and used by taking into account the sequence separation of the interacting atom pairs. The terms for the sequentially local interactions are significantly less informative than those for the sequentially nonlocal interactions. The accessible surface potential that is optimal for assessing models of all sizes uses C-beta atoms as interaction centers and distinguishes between all 20 standard residue types. The performance of the tested statistical potentials is not likely to improve significantly with an increase in the number of known protein structures used in their derivation. The parameters of fold assessment whose optimal values vary significantly with model size include the size of the known protein structures used to derive the potential and the distance range of the accessible surface potential. Fold assessment by statistical potentials is most difficult for very small models. This difficulty presents a challenge to fold assessment in large-scale comparative modeling, which produces many small and incomplete models. The results described in this study provide a basis for the optimal use of statistical potentials in fold assessment.
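As a rough illustration of the Z-score criterion mentioned in the abstract, the sketch below normalizes a model energy against a set of reference energies and sums two such normalized scores. The abstract does not say how the reference energies are obtained, what cutoff is used, or how the sum is weighted, so the function names, the reference set, the unweighted sum, and the `cutoff` parameter are all assumptions made for illustration, not the authors' implementation.

```python
from statistics import mean, pstdev


def z_score(model_energy, reference_energies):
    """Express a model energy in standard deviations from the reference mean.

    Assumption: `reference_energies` is some background set of energies
    (its construction is not specified in the abstract).
    """
    mu = mean(reference_energies)
    sigma = pstdev(reference_energies)  # population standard deviation
    if sigma == 0:
        raise ValueError("reference energies have zero spread")
    return (model_energy - mu) / sigma


def combined_z(z_distance, z_surface):
    """Unweighted sum of two normalized scores (hypothetical combination)."""
    return z_distance + z_surface


def looks_correct(z, cutoff):
    """Assumes lower (more negative) scores indicate a more native-like model."""
    return z <= cutoff


# Toy numbers only; these are not data from the paper.
z_dist = z_score(-3.0, [-1.0, 1.0])
z_surf = z_score(-2.0, [-1.0, 1.0])
total = combined_z(z_dist, z_surf)
```

Normalizing each potential before summing puts the two terms on a common scale, which is one plausible reading of "the sum of the normalized distance-dependent and accessible surface potentials".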
Pages: 430-448
Number of pages: 19