Learning Retrosynthetic Planning through Simulated Experience

Cited by: 107
Authors
Schreck, John S. [1]
Coley, Connor W. [2]
Bishop, Kyle J. M. [1]
Affiliations
[1] Columbia Univ, Dept Chem Engn, New York, NY 10027 USA
[2] MIT, Dept Chem Engn, Cambridge, MA 02139 USA
Keywords
COMPUTER; PREDICTION; COMPLEXITY; NETWORK; DESIGN; TOOL
DOI
10.1021/acscentsci.9b00055
Chinese Library Classification (CLC) number
O6 [Chemistry]
Discipline classification code
0703
Abstract
The problem of retrosynthetic planning can be framed as a one-player game, in which the chemist (or a computer program) works backward from a molecular target to simpler starting materials through a series of choices regarding which reactions to perform. This game is challenging as the combinatorial space of possible choices is astronomical, and the value of each choice remains uncertain until the synthesis plan is completed and its cost evaluated. Here, we address this search problem using deep reinforcement learning to identify policies that make (near) optimal reaction choices during each step of retrosynthetic planning according to a user-defined cost metric. Using simulated experience, we train a neural network to estimate the expected synthesis cost or value of any given molecule based on a representation of its molecular structure. We show that learned policies based on this value network can outperform a heuristic approach that favors symmetric disconnections when synthesizing unfamiliar molecules from available starting materials using the fewest reactions. We discuss how the learned policies described here can be incorporated into existing synthesis planning tools and how they can be adapted to changes in the synthesis cost objective or material availability.
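The idea of choosing each disconnection by its estimated downstream cost can be illustrated with a minimal Python sketch. This is not the authors' implementation: the reaction table, the molecule names, the per-step cost, and the dead-end penalty are all hypothetical, and the value function is computed exactly by recursion on a tiny acyclic toy graph, standing in for the trained neural network described in the abstract.

```python
from functools import lru_cache

# Hypothetical toy data: each molecule maps to its candidate disconnections,
# and each disconnection is a tuple of precursor molecules.
REACTIONS = {
    "target": [("A", "B"), ("C",)],
    "A": [("s1", "s2")],
    "B": [("s3",)],
    "C": [("D",)],
    "D": [],  # dead end: not buyable and no known reaction
}
BUYABLE = {"s1", "s2", "s3"}  # assumed set of available starting materials
REACTION_COST = 1.0           # assumed cost per reaction (minimize step count)
DEAD_END_COST = 100.0         # assumed penalty for a molecule that cannot be made


@lru_cache(maxsize=None)
def value(mol):
    """Synthesis cost of `mol` under the best choices (toy stand-in for a value network)."""
    if mol in BUYABLE:
        return 0.0
    options = REACTIONS.get(mol, [])
    if not options:
        return DEAD_END_COST
    return min(REACTION_COST + sum(value(p) for p in ps) for ps in options)


def greedy_choice(mol):
    """Pick the disconnection with the lowest reaction cost plus precursor values."""
    return min(REACTIONS[mol], key=lambda ps: REACTION_COST + sum(value(p) for p in ps))


def plan(mol):
    """Expand `mol` recursively into a list of (product, precursors) steps."""
    if mol in BUYABLE or not REACTIONS.get(mol):
        return []
    precursors = greedy_choice(mol)
    steps = [(mol, precursors)]
    for p in precursors:
        steps.extend(plan(p))
    return steps


if __name__ == "__main__":
    for product, precursors in plan("target"):
        print(product, "<=", " + ".join(precursors))
```

In this toy graph the policy prefers the two-precursor route over the single-precursor route through "C", because the latter leads to a dead end whose penalty dominates the estimated cost. Changing `BUYABLE` or `REACTION_COST` changes the values and therefore the chosen route, which mirrors the abstract's point about adapting to a different cost objective or material availability.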
Pages: 970-981
Number of pages: 12