Planning chemical syntheses with deep neural networks and symbolic AI

Cited by: 1183
Authors
Segler, Marwin H. S. [1, 2, 3]
Preuss, Mike [4]
Waller, Mark P. [5, 6]
Affiliations
[1] Westfalische Wilhelms Univ, Inst Organ Chem, Munster, Germany
[2] Westfalische Wilhelms Univ, Ctr Multiscale Theory & Computat, Munster, Germany
[3] BenevolentAI, London, England
[4] Westfalische Wilhelms Univ Munster, European Res Ctr Informat Syst, Munster, Germany
[5] Shanghai Univ, Dept Phys, Shanghai, Peoples R China
[6] Shanghai Univ, Int Ctr Quantum & Mol Struct, Shanghai, Peoples R China
Keywords
ORGANIC-CHEMISTRY; KNOWLEDGE-BASE; SYSTEM; CLASSIFICATION; PREDICTION; REACTIVITY; DESIGN; ROUTE; RETROSYNTHESIS; DISCOVERY;
DOI
10.1038/nature25978
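Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09
Abstract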
To plan the syntheses of small organic molecules, chemists use retrosynthesis, a problem-solving technique in which target molecules are recursively transformed into increasingly simpler precursors. Computer-aided retrosynthesis would be a valuable tool but at present it is slow and provides results of unsatisfactory quality. Here we use Monte Carlo tree search and symbolic artificial intelligence (AI) to discover retrosynthetic routes. We combined Monte Carlo tree search with an expansion policy network that guides the search, and a filter network to pre-select the most promising retrosynthetic steps. These deep neural networks were trained on essentially all reactions ever published in organic chemistry. Our system solves for almost twice as many molecules, thirty times faster than the traditional computer-aided search method, which is based on extracted rules and hand-designed heuristics. In a double-blind AB test, chemists on average considered our computer-generated routes to be equivalent to reported literature routes.
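The search loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: molecules are plain strings, and a hand-written lookup table (the hypothetical `TOY_RULES`) stands in for both neural networks. Its prior values play the role of the expansion policy and its plausibility scores play the role of the filter. All names here (`TOY_RULES`, `BUILDING_BLOCKS`, `MIN_PLAUSIBILITY`, `candidate_steps`, `plan_route`) are assumptions made for illustration only.

```python
import math

# Hypothetical toy data: target -> [(policy prior, precursors, plausibility)].
TOY_RULES = {
    "ABCD": [(0.6, ("AB", "CD"), 0.9),
             (0.3, ("A", "BCD"), 0.8),
             (0.1, ("ABC", "D"), 0.1)],   # implausible, removed by the filter
    "AB": [(1.0, ("A", "B"), 0.9)],
    "CD": [(1.0, ("C", "D"), 0.9)],
    "BCD": [(1.0, ("B", "CD"), 0.9)],
}
BUILDING_BLOCKS = frozenset("ABCD")   # assumed purchasable starting materials
MIN_PLAUSIBILITY = 0.5                # assumed filter threshold


def candidate_steps(molecule):
    """Stand-in for the expansion policy followed by the filter."""
    return [(prior, precursors)
            for prior, precursors, score in TOY_RULES.get(molecule, [])
            if score >= MIN_PLAUSIBILITY]


def apply_step(state, precursors):
    """Replace the first open molecule by its non-purchasable precursors."""
    still_open = [m for m in precursors if m not in BUILDING_BLOCKS]
    return tuple(sorted(set(state[1:]) | set(still_open)))


class Node:
    def __init__(self, state, parent=None, prior=1.0, step=None):
        self.state, self.parent, self.prior, self.step = state, parent, prior, step
        self.children, self.visits, self.value = [], 0, 0.0

    def score(self, c=1.0):
        # Mean reward plus a prior-weighted exploration bonus.
        q = self.value / self.visits if self.visits else 0.0
        u = c * self.prior * math.sqrt(self.parent.visits) / (1 + self.visits)
        return q + u


def rollout(state, max_depth=10):
    """Greedy rollout: reward 1.0 if every molecule becomes purchasable."""
    for _ in range(max_depth):
        if not state:
            return 1.0
        steps = candidate_steps(state[0])
        if not steps:
            return 0.0
        state = apply_step(state, max(steps, key=lambda s: s[0])[1])
    return 1.0 if not state else 0.0


def plan_route(target, iterations=50):
    root = Node(() if target in BUILDING_BLOCKS else (target,))
    for _ in range(iterations):
        node = root
        while node.children:                      # selection
            node = max(node.children, key=Node.score)
        if node.state:                            # expansion
            for prior, precursors in candidate_steps(node.state[0]):
                node.children.append(Node(apply_step(node.state, precursors),
                                          node, prior, (node.state[0], precursors)))
        reward = rollout(node.state)              # rollout
        while node is not None:                   # backpropagation
            node.visits += 1
            node.value += reward
            node = node.parent
    route, node = [], root
    while node.children:                          # read off most-visited path
        node = max(node.children, key=lambda n: n.visits)
        route.append(node.step)
    return route if not node.state else None
```

Calling `plan_route("ABCD")` on this toy table returns a list of `(target, precursors)` steps that ends in building blocks only, or `None` when no such route is found. A real system would replace the lookup table with learned models and the strings with molecular graphs.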
Pages: 604-+
Page count: 16
Related papers
76 in total
  • [1] Andersen Jakob L., 2014, International Journal of Computational Biology and Drug Design, V7, P225, DOI 10.1504/IJCBDD.2014.061649
  • [2] [Anonymous], 2016, THEAN PYTH FRAM FAST
  • [3] [Anonymous], 2017, NIPS
  • [4] [Anonymous], RDKit: Open-source cheminformatics
  • [5] Structure and reaction based evaluation of synthetic accessibility
    Boda, Krisztina
    Seidel, Thomas
    Gasteiger, Johann
    [J]. JOURNAL OF COMPUTER-AIDED MOLECULAR DESIGN, 2007, 21 (06) : 311 - 325
  • [6] Route Design in the 21st Century: The ICSYNTH Software Tool as an Idea Generator for Synthesis Prediction
    Bogevig, Anders
    Federsel, Hans-Juergen
    Huerta, Fernando
    Hutchings, Michael G.
    Kraut, Hans
    Langer, Thomas
    Loew, Peter
    Oppawsky, Christoph
    Rein, Tobias
    Saller, Heinz
    [J]. ORGANIC PROCESS RESEARCH & DEVELOPMENT, 2015, 19 (02) : 357 - 368
  • [7] A Survey of Monte Carlo Tree Search Methods
    Browne, Cameron B.
    Powley, Edward
    Whitehouse, Daniel
    Lucas, Simon M.
    Cowling, Peter I.
    Rohlfshagen, Philipp
    Tavener, Stephen
    Perez, Diego
    Samothrakis, Spyridon
    Colton, Simon
    [J]. IEEE TRANSACTIONS ON COMPUTATIONAL INTELLIGENCE AND AI IN GAMES, 2012, 4 (01) : 1 - 43
  • [8] Bruckner R, 2014, REAKTIONSMECHANISMEN
  • [9] Unsupervised data base clustering based on Daylight's fingerprint and Tanimoto similarity: A fast and automated way to cluster small and large data sets
    Butina, D
    [J]. JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 1999, 39 (04) : 747 - 750
  • [10] Machine learning of chemical reactivity from databases of organic reactions
    Carrera, Goncalo V. S. M.
    Gupta, Sunil
    Aires-de-Sousa, Joao
    [J]. JOURNAL OF COMPUTER-AIDED MOLECULAR DESIGN, 2009, 23 (07) : 419 - 429