Generating Focused Molecule Libraries for Drug Discovery with Recurrent Neural Networks

Cited by: 868
Authors
Segler, Marwin H. S. [1, 2]
Kogej, Thierry [3]
Tyrchan, Christian [4]
Waller, Mark P. [5, 6]
Affiliations
[1] Westfal Wilhelms Univ Munster, Inst Organ Chem, D-48149 Munster, Germany
[2] Westfal Wilhelms Univ Munster, Ctr Multiscale Theory & Computat, D-48149 Munster, Germany
[3] AstraZeneca R&D, Hit Discovery, Discovery Sci, Gothenburg, Sweden
[4] AstraZeneca R&D, IMED RIA, Dept Med Chem, Gothenburg, Sweden
[5] Shanghai Univ, Dept Phys, Shanghai, Peoples R China
[6] Shanghai Univ, Int Ctr Quantum & Mol Struct, Shanghai, Peoples R China
Keywords
DE-NOVO DESIGN; DEVELOPMENT KIT CDK; SOURCE JAVA LIBRARY; PREDICTION; CHEMOINFORMATICS; LANGUAGE; MODELS; SMILES
DOI
10.1021/acscentsci.7b00512
Chinese Library Classification (CLC) number
O6 [Chemistry]
Discipline classification code
0703
Abstract
In de novo drug design, computational strategies are used to generate novel molecules with good affinity to the desired biological target. In this work, we show that recurrent neural networks can be trained as generative models for molecular structures, similar to statistical language models in natural language processing. We demonstrate that the properties of the generated molecules correlate very well with the properties of the molecules used to train the model. In order to enrich libraries with molecules active toward a given biological target, we propose to fine-tune the model with small sets of molecules, which are known to be active against that target. Against Staphylococcus aureus, the model reproduced 14% of 6051 hold-out test molecules that medicinal chemists designed, whereas against Plasmodium falciparum (Malaria), it reproduced 28% of 1240 test molecules. When coupled with a scoring function, our model can perform the complete de novo drug design cycle to generate large sets of novel molecules for drug discovery.
Pages: 120-131
Page count: 12