Prediction of Organic Reaction Outcomes Using Machine Learning

Cited by: 482
Authors
Coley, Connor W. [1]
Barzilay, Regina [2]
Jaakkola, Tommi S. [2]
Green, William H. [1]
Jensen, Klavs F. [1]
Affiliations
[1] MIT, Dept Chem Engn, 77 Massachusetts Ave, Cambridge, MA 02139 USA
[2] MIT, Comp Sci & Artificial Intelligence Lab, 77 Massachusetts Ave, Cambridge, MA 02139 USA
Keywords
ALGORITHM; LEVEL
DOI
10.1021/acscentsci.7b00064
Chinese Library Classification (CLC) number
O6 [Chemistry]
Subject classification code
0703
Abstract
Computer assistance in synthesis design has existed for over 40 years, yet retrosynthesis planning software has struggled to achieve widespread adoption. One critical challenge in developing high-quality pathway suggestions is that proposed reaction steps often fail when attempted in the laboratory, despite initially seeming viable. The true measure of success for any synthesis program is whether the predicted outcome matches what is observed experimentally. We report a model framework for anticipating reaction outcomes that combines the traditional use of reaction templates with the flexibility in pattern recognition afforded by neural networks. Using 15 000 experimental reaction records from granted United States patents, a model is trained to select the major (recorded) product by ranking a self-generated list of candidates where one candidate is known to be the major product. Candidate reactions are represented using a unique edit-based representation that emphasizes the fundamental transformation from reactants to products, rather than the constituent molecules' overall structures. In a 5-fold cross-validation, the trained model assigns the major product rank 1 in 71.8% of cases, rank <= 3 in 86.7% of cases, and rank <= 5 in 90.8% of cases.
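The rank-based figures quoted in the abstract (rank 1, rank <= 3, rank <= 5) are top-k accuracies over ranked candidate lists. The following is a minimal sketch of how such numbers could be computed, assuming each reaction has a list of candidate scores and the index of the recorded product. The function names `rank_of_true` and `top_k_accuracy` and the example scores are hypothetical illustrations, not the authors' code or data.

```python
from typing import Sequence


def rank_of_true(scores: Sequence[float], true_index: int) -> int:
    """Return the 1-based rank of the recorded product (higher score is better)."""
    true_score = scores[true_index]
    # The rank is one plus the number of candidates scored strictly higher.
    return 1 + sum(1 for s in scores if s > true_score)


def top_k_accuracy(ranks: Sequence[int], k: int) -> float:
    """Fraction of reactions whose recorded product is ranked k or better."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Hypothetical candidate scores for three reactions (not data from the paper).
# Each entry is (scores for all candidates, index of the recorded product).
examples = [
    ([0.9, 0.1, 0.3], 0),            # no candidate scores higher
    ([0.2, 0.7, 0.5, 0.1], 2),       # one candidate scores higher
    ([0.4, 0.6, 0.8, 0.9, 0.1], 0),  # three candidates score higher
]

ranks = [rank_of_true(scores, idx) for scores, idx in examples]

for k in (1, 3, 5):
    print(f"rank <= {k}: {top_k_accuracy(ranks, k):.1%}")
```

In a cross-validation setting, the ranks would presumably be collected on each held-out fold and then pooled or averaged; the abstract does not say which, so that step is left out here.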
Pages: 434-443
Page count: 10