Predicting reaction performance in C-N cross-coupling using machine learning

Cited by: 513
Authors
Ahneman, Derek T. [1]
Estrada, Jesus G. [1]
Lin, Shishi [2]
Dreher, Spencer D. [2]
Doyle, Abigail G. [1]
Affiliations
[1] Princeton Univ, Dept Chem, Princeton, NJ 08544 USA
[2] Merck Sharp & Dohme Corp, Chem Capabil & Screening, Kenilworth, NJ 07033 USA
Keywords
DISCOVERY; TOOL;
DOI
10.1126/science.aar5169
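Chinese Library Classification number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Subject classification code
07 ; 0710 ; 09 ;
Abstract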
Machine learning methods are becoming integral to scientific inquiry in numerous disciplines. We demonstrated that machine learning can be used to predict the performance of a synthetic reaction in multidimensional chemical space using data obtained via high-throughput experimentation. We created scripts to compute and extract atomic, molecular, and vibrational descriptors for the components of a palladium-catalyzed Buchwald-Hartwig cross-coupling of aryl halides with 4-methylaniline in the presence of various potentially inhibitory additives. Using these descriptors as inputs and reaction yield as output, we showed that a random forest algorithm provides significantly improved predictive performance over linear regression analysis. The random forest model was also successfully applied to sparse training sets and out-of-sample prediction, suggesting its value in facilitating adoption of synthetic methodology.
Pages: 186-190
Page count: 5
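
The comparison described in the abstract (descriptors as inputs, reaction yield as output, random forest versus linear regression) can be illustrated with a minimal sketch. This is not the authors' code or data: the record does not state which descriptors, software, or settings were used, so the four synthetic "descriptors", the yield function, and every function name below are assumptions made for illustration. The forest is a small standard-library toy (bootstrap samples, random feature subsets per split, averaged regression trees), not a production implementation.

```python
import random

def mean(values):
    return sum(values) / len(values)

def make_data(n, rng):
    """Synthetic stand-in for descriptor/yield data (NOT the paper's data)."""
    X = [[rng.random() for _ in range(4)] for _ in range(n)]
    # Hypothetical yield: a threshold interaction, a mild linear trend, and noise.
    y = [80.0 * (r[0] > 0.5) * (r[1] < 0.6) + 10.0 * r[2] + rng.gauss(0, 2) for r in X]
    return X, y

def fit_linear(X, y):
    """Ordinary least squares via the normal equations (Gauss-Jordan elimination)."""
    A = [[1.0] + row for row in X]
    p = len(A[0])
    M = [[sum(a[i] * a[j] for a in A) for j in range(p)]
         + [sum(a[i] * t for a, t in zip(A, y))] for i in range(p)]
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(p):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [u - f * v for u, v in zip(M[r], M[c])]
    return [M[i][p] / M[i][i] for i in range(p)]

def build_tree(X, y, rng, depth=0, max_depth=6, min_leaf=5, n_try=2):
    """Regression tree: a leaf is a float, an internal node is (feature, threshold, left, right)."""
    if depth >= max_depth or len(y) < 2 * min_leaf:
        return mean(y)
    best = None
    for f in rng.sample(range(len(X[0])), n_try):  # random feature subset per split
        vals = sorted({row[f] for row in X})
        for t in vals[::max(1, len(vals) // 20)]:  # thinned list of candidate thresholds
            left = [v for row, v in zip(X, y) if row[f] <= t]
            right = [v for row, v in zip(X, y) if row[f] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            ml, mr = mean(left), mean(right)
            sse = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
            if best is None or sse < best[0]:
                best = (sse, f, t)
    if best is None:
        return mean(y)
    _, f, t = best
    lo = [(row, v) for row, v in zip(X, y) if row[f] <= t]
    hi = [(row, v) for row, v in zip(X, y) if row[f] > t]
    args = (rng, depth + 1, max_depth, min_leaf, n_try)
    return (f, t,
            build_tree([r for r, _ in lo], [v for _, v in lo], *args),
            build_tree([r for r, _ in hi], [v for _, v in hi], *args))

def predict_tree(node, row):
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def fit_forest(X, y, rng, n_trees=25):
    """Fit each tree on a bootstrap resample of the training set."""
    forest, n = [], len(X)
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(build_tree([X[i] for i in idx], [y[i] for i in idx], rng))
    return forest

def predict_forest(forest, row):
    return sum(predict_tree(tree, row) for tree in forest) / len(forest)

def r2(y_true, y_pred):
    m = mean(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - m) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot

def compare_models(seed=0, n_train=200, n_test=200, n_trees=25):
    """Return (linear R^2, forest R^2) on a held-out synthetic test set."""
    rng = random.Random(seed)
    X_train, y_train = make_data(n_train, rng)
    X_test, y_test = make_data(n_test, rng)
    w = fit_linear(X_train, y_train)
    lin_pred = [w[0] + sum(wi * xi for wi, xi in zip(w[1:], row)) for row in X_test]
    forest = fit_forest(X_train, y_train, rng, n_trees)
    rf_pred = [predict_forest(forest, row) for row in X_test]
    return r2(y_test, lin_pred), r2(y_test, rf_pred)

if __name__ == "__main__":
    lin_r2, rf_r2 = compare_models()
    print(f"linear R^2 = {lin_r2:.2f}, random forest R^2 = {rf_r2:.2f}")
```

Because the synthetic yield depends on a threshold interaction between two inputs, an additive linear model can capture only part of the variance, while tree ensembles can represent the interaction directly. The gap seen here reflects that design choice in the toy data and says nothing about the magnitudes reported in the paper.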