Open Source Bayesian Models. 1. Application to ADME/Tox and Drug Discovery Datasets

Cited by: 84
Authors
Clark, Alex M. [1]
Dole, Krishna [2]
Coulon-Spektor, Anna [2]
McNutt, Andrew [2]
Grass, George [3]
Freundlich, Joel S. [4, 5]
Reynolds, Robert C. [6]
Ekins, Sean [2, 7]
Affiliations
[1] Mol Mat Informat Inc, Montreal, PQ H3J 2S1, Canada
[2] Collaborat Drug Discovery, Burlingame, CA 94010 USA
[3] G2 Res Inc, Tahoe City, CA 96145 USA
[4] Rutgers State Univ, Ctr Emerging & Reemerging Pathogens, Div Infect Dis, Dept Med, New Jersey Med Sch, Newark, NJ 07103 USA
[5] Rutgers State Univ, Dept Pharmacol Physiol, New Jersey Med Sch, Newark, NJ 07103 USA
[6] Univ Alabama Birmingham, Dept Chem, Coll Arts & Sci, Birmingham, AL 35294 USA
[7] Collaborat Chem, Fuquay Varina, NC 27526 USA
Funding
US National Institutes of Health (NIH);
Keywords
BRAIN-BARRIER PERMEABILITY; THROUGHPUT SCREENING DATA; IN-SILICO PHARMACOLOGY; QSAR MODELS; LIPOPHILICITY DETERMINATION; APPLICABILITY DOMAIN; COMPUTATIONAL MODELS; METABOLIC STABILITY; DISTRIBUTION VALUES; PREDICTIVE MODELS;
DOI
10.1021/acs.jcim.5b00143
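Chinese Library Classification (CLC) number
R914 [Medicinal chemistry];
Discipline classification code
100705 [Microbial and biochemical pharmacy];
Abstract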
On the order of hundreds of absorption, distribution, metabolism, excretion, and toxicity (ADME/Tox) models have been described in the literature in the past decade, yet most are inaccessible to anyone but their authors. Public accessibility is also an issue for computational models of bioactivity, and the inability to share such models remains a major challenge limiting drug discovery. We describe the creation of a reference implementation of a Bayesian model-building software module, which we have released as an open source component that is now included in the Chemistry Development Kit (CDK) project and implemented in the CDD Vault and in several mobile apps. We use this implementation to build an array of Bayesian models for ADME/Tox, in vitro and in vivo bioactivity, and other physicochemical properties. We show that these models possess cross-validation receiver operating characteristic (ROC) values comparable to those reported in prior publications using alternative tools. We describe how implementing Bayesian models with FCFP6 descriptors generated in the CDD Vault enables the rapid production of robust machine learning models from public data or the user's own datasets. The current study sets the stage for generating models in proprietary software (such as CDD) and exporting them in a format that can be run in open source software using CDK components. This work also demonstrates that we can enable biocomputation across distributed private or public datasets to enhance drug discovery.
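As an illustration of the kind of model the abstract refers to, the following is a minimal sketch in Python, not the CDK or CDD Vault implementation. It assumes a Laplacian-corrected naive Bayesian scoring scheme (the abstract does not state the exact formula), represents each molecule as a set of integer feature hashes standing in for FCFP6 descriptors, and estimates a cross-validated ROC value by leave-one-out. All function names and the toy data are hypothetical.

```python
import math
from collections import Counter

def train_bayes(fingerprints, labels):
    """Return a per-feature contribution table.

    Assumed formula (Laplacian-corrected naive Bayes):
        contribution = log((A + 1) / (T * P + 1))
    A = actives containing the feature, T = all molecules containing it,
    P = overall fraction of actives in the training set.
    """
    total, active = Counter(), Counter()
    for bits, is_active in zip(fingerprints, labels):
        for b in bits:
            total[b] += 1
            if is_active:
                active[b] += 1
    base_rate = sum(labels) / len(labels)
    return {b: math.log((active[b] + 1.0) / (total[b] * base_rate + 1.0))
            for b in total}

def score_molecule(model, bits):
    """Sum the contributions of the features present; unseen features add 0."""
    return sum(model.get(b, 0.0) for b in bits)

def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def leave_one_out_scores(fingerprints, labels):
    """Score each molecule with a model trained on all the others."""
    out = []
    for i in range(len(fingerprints)):
        model = train_bayes(fingerprints[:i] + fingerprints[i + 1:],
                            labels[:i] + labels[i + 1:])
        out.append(score_molecule(model, fingerprints[i]))
    return out

# Hypothetical toy data: integers stand in for hashed FCFP6 features.
toy_fps = [{1, 2, 3}, {1, 2, 4}, {1, 5}, {6, 7}, {6, 8}, {3, 7, 8}]
toy_labels = [True, True, True, False, False, False]

cv_scores = leave_one_out_scores(toy_fps, toy_labels)
print("leave-one-out ROC AUC:", roc_auc(cv_scores, toy_labels))
```

In practice the feature sets would come from a fingerprint generator rather than hand-written integers, and a k-fold split would usually replace leave-one-out on larger datasets.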
Pages: 1231-1245
Number of pages: 15