Application of Generative Autoencoder in De Novo Molecular Design

Cited by: 262
Authors
Blaschke, Thomas [1, 2]
Olivecrona, Marcus [1]
Engkvist, Ola [1]
Bajorath, Jurgen [2]
Chen, Hongming [1]
Affiliations
[1] AstraZeneca R&D Gothenburg, Hit Discovery, Discovery Sci, Innovat Med & Early Dev Biotech Unit, S-43183 Molndal, Sweden
[2] Univ Bonn, Bonn Aachen Int Ctr Informat Technol BIT, Life Sci Informat, Dahlmannstr 2, D-53113 Bonn, Germany
Funding
European Union Horizon 2020
Keywords
Autoencoder; chemoinformatics; de novo molecular design; deep learning; inverse QSAR
DOI
10.1002/minf.201700123
Chinese Library Classification number
R914 [Medicinal chemistry]
Discipline classification code
100701
Abstract
A major challenge in computational chemistry is the generation of novel molecular structures with desirable pharmacological and physicochemical properties. In this work, we investigate the potential use of autoencoders, a deep learning methodology, for de novo molecular design. Various generative autoencoders were used to map molecular structures into a continuous latent space and vice versa, and their performance as structure generators was assessed. Our results show that the latent space preserves the chemical similarity principle and thus can be used for the generation of analogue structures. Furthermore, the latent space created by the autoencoders was searched systematically to generate novel compounds with predicted activity against dopamine receptor type 2, and compounds similar to known active compounds not included in the training set were identified.
Pages: 11