PostDOCK: A structural, empirical approach to scoring protein ligand complexes

Cited by: 38
Authors
Springer, C [1]
Adalsteinsson, H [1]
Young, MM [1]
Kegelmeyer, PW [1]
Roe, DC [1]
Affiliations
[1] Sandia Natl Labs, Livermore, CA 94551 USA
DOI
10.1021/jm0493360
Chinese Library Classification number
R914 [Medicinal chemistry]
Discipline classification code
100701
Abstract
In this work we introduce a postprocessing filter (PostDOCK) that distinguishes true binding ligand-protein complexes from docking artifacts created by DOCK 4.0.1. PostDOCK is a pattern recognition system that relies on (1) a database of complexes, (2) biochemical descriptors of those complexes, and (3) machine learning tools. We use the Protein Data Bank (PDB) as the structural database of complexes and create diverse training and validation sets from it based on the "families of structurally similar proteins" (FSSP) hierarchy. For the biochemical descriptors, we consider terms from the DOCK score, empirical scoring, and buried solvent-accessible surface area. For the machine learners, we use a random forest classifier and logistic regression. Our results were obtained on a test set of 44 structurally diverse protein targets. Our highest-performing descriptor combinations obtained approximately 19-fold enrichment (39 of 44 binding complexes were correctly identified, while only 2 of 44 decoy complexes were accepted), and our best overall accuracy was 92%.
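The enrichment figure quoted in the abstract can be checked with a few lines of arithmetic. The sketch below assumes that enrichment means the ratio of the true-positive rate to the false-positive rate; the abstract does not state the definition, so this is an interpretation, and the function names are illustrative only. The accuracy calculation further assumes the same 39-of-44 and 2-of-44 counts, although the abstract reports the best overall accuracy separately and it may come from a different descriptor combination.

```python
def enrichment(true_pos, n_binders, false_pos, n_decoys):
    """Ratio of true-positive rate to false-positive rate (assumed definition)."""
    tpr = true_pos / n_binders
    fpr = false_pos / n_decoys
    return tpr / fpr


def accuracy(true_pos, n_binders, false_pos, n_decoys):
    """Fraction of all complexes (binders and decoys) classified correctly."""
    true_neg = n_decoys - false_pos
    return (true_pos + true_neg) / (n_binders + n_decoys)


# Counts taken from the abstract: 39 of 44 binders kept, 2 of 44 decoys kept.
fold = enrichment(39, 44, 2, 44)
acc = accuracy(39, 44, 2, 44)
print(f"enrichment: {fold:.1f}-fold")
print(f"accuracy: {acc:.1%}")
```

Under these assumptions the ratio is 39/2 = 19.5, in line with the roughly 19-fold enrichment reported, and the accuracy is 81/88, or about 92%.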
Pages: 6821-6831
Number of pages: 11