Task-Specific Scoring Functions for Predicting Ligand Binding Poses and Affinity and for Screening Enrichment

Cited by: 62

Authors:

Ashtawy, Hossam M. [1]

Mahapatra, Nihar R. [1]

Affiliations:

[1] Michigan State Univ, Dept Elect & Comp Engn, E Lansing, MI 48824 USA

Source:

JOURNAL OF CHEMICAL INFORMATION AND MODELING | 2018 / Vol. 58 / Issue 01

Funding:

National Science Foundation (USA);

Keywords:

PROTEIN; VALIDATION; DOCKING; TOOLS;

DOI:

10.1021/acs.jcim.7b00309

Chinese Library Classification (CLC) number:

R914 [Medicinal Chemistry];

Discipline classification code:

100701;

Abstract:

Molecular docking, scoring, and virtual screening play an increasingly important role in computer-aided drug discovery. Scoring functions (SFs) are typically employed to predict the binding conformation (docking task), binding affinity (scoring task), and binary activity level (screening task) of ligands against a critical protein target in a disease's pathway. In most molecular docking software packages available today, a generic binding affinity-based (BA-based) SF is invoked for all three tasks to solve three different, but related, prediction problems. The limited predictive accuracies of such SFs in these three tasks have been a major roadblock toward cost-effective drug discovery. Therefore, in this work, we develop BT-Score, an ensemble machine-learning (ML) SF of boosted decision trees and thousands of predictive descriptors to estimate BA. BT-Score reproduced BA of out-of-sample test complexes with a correlation of 0.825. Even with this high accuracy in the scoring task, we demonstrate that the docking and screening performance of BT-Score and other BA-based SFs is far from ideal. This has motivated us to build two task-specific ML SFs for the docking and screening problems. We propose BT-Dock, a boosted-tree ensemble model trained on a large number of native and computer-generated ligand conformations and optimized to predict binding poses explicitly. This model has shown an average improvement of 25% over its BA-based counterparts in different ligand pose prediction scenarios. Similar improvement has also been obtained by our screening-based SF, BT-Screen, which directly models the ligand activity labeling task as a classification problem. BT-Screen is trained on thousands of active and inactive protein-ligand complexes to optimize it for finding real actives from databases of ligands not seen in its training set. In addition to the three task-specific SFs, we propose a novel multi-task deep neural network (MT-Net) that is trained on data from the three tasks to simultaneously predict binding poses, affinities, and activity levels. We show that the performance of MT-Net is superior to conventional SFs and on a par with or better than models based on single-task neural networks.


Pages: 119-133

Number of pages: 15
