High-Quality Dataset of Protein-Bound Ligand Conformations and Its Application to Benchmarking Conformer Ensemble Generators

Cited by: 72
Authors
Friedrich, Nils-Ole [1]
Meyder, Agnes [1]
de Bruyn Kops, Christina [1]
Sommer, Kai [1]
Flachsenberg, Florian [1]
Rarey, Matthias [1]
Kirchmair, Johannes [1]
Affiliations
[1] Univ Hamburg, ZBH Ctr Bioinformat, Martinistr 52, Bundesstrasse 43, D-20146 Hamburg, Germany
Keywords
MOLECULAR-MECHANICS OPTIMIZATION; CAMBRIDGE STRUCTURAL DATABASE; ELECTRON-DENSITY; 3D CONFORMATION; FORCE-FIELD; DATA-BANK; TEST SET; DISTANCE-GEOMETRY; DRUG DISCOVERY; VALIDATION
DOI
10.1021/acs.jcim.6b00613
Chinese Library Classification (CLC) number
R914 [Medicinal Chemistry]
Subject classification code
100701
Abstract
We developed a cheminformatics pipeline for the fully automated selection and extraction of high-quality protein-bound ligand conformations from X-ray structural data. The pipeline evaluates the validity and accuracy of the 3D structures of small molecules according to multiple criteria, including their fit to the electron density and their physicochemical and structural properties. Using this approach, we compiled two high-quality datasets from the Protein Data Bank (PDB): a comprehensive dataset and a diversified subset of 4626 and 2912 structures, respectively. The datasets were applied to benchmarking seven freely available conformer ensemble generators: Balloon (two different algorithms), the RDKit standard conformer ensemble generator, the Experimental-Torsion basic Knowledge Distance Geometry (ETKDG) algorithm, Confab, Frog2 and Multiconf-DOCK. Substantial differences in the performance of the individual algorithms were observed, with RDKit and ETKDG generally achieving a favorable balance of accuracy, ensemble size and runtime. The Platinum datasets are available for download from http://www.zbh.uni-hamburg.de/platinum_dataset.
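The abstract compares generators on accuracy, ensemble size and runtime. The following is a minimal sketch of how the accuracy and ensemble-size parts of such a comparison could be summarized, not the authors' implementation. It assumes that the RMSD of every generated conformer to the protein-bound reference conformation has already been computed after superposition. The function name `summarize`, the 1.0 Å success threshold and the toy RMSD values are hypothetical and chosen only for illustration.

```python
from statistics import median


def summarize(ensembles, threshold=1.0):
    """Summarize one generator's results over a set of ligands.

    ensembles: dict mapping a ligand id to the list of RMSD values (in Å)
               of its generated conformers against the bound conformation.
    threshold: hypothetical RMSD cutoff (Å) for counting a ligand as reproduced.
    """
    # Ignore ligands for which the generator produced no conformers.
    non_empty = [rmsds for rmsds in ensembles.values() if rmsds]
    if not non_empty:
        raise ValueError("no non-empty ensembles to summarize")

    # The best conformer of each ensemble defines that ligand's accuracy.
    best = [min(rmsds) for rmsds in non_empty]
    hits = sum(1 for value in best if value <= threshold)

    return {
        "median_min_rmsd": median(best),
        "success_rate": hits / len(best),
        "mean_ensemble_size": sum(len(r) for r in non_empty) / len(non_empty),
    }


# Toy, made-up RMSD values (Å); these are not results from the paper.
toy = {
    "lig_a": [1.8, 0.6, 1.1],
    "lig_b": [2.4, 1.9],
    "lig_c": [0.4, 0.9, 1.5, 2.2],
}
result = summarize(toy, threshold=1.0)
print(result)
```

Taking the minimum RMSD per ensemble rewards a generator for producing at least one conformer close to the bound state, which is why ensemble size is reported alongside it: a larger ensemble makes a low minimum easier to reach.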
Pages: 529-539
Page count: 11