SCScore: Synthetic Complexity Learned from a Reaction Corpus

Cited by: 222
Authors
Coley, Connor W. [1]
Rogers, Luke [1]
Green, William H. [1]
Jensen, Klavs F. [1]
Affiliations
[1] MIT, Dept Chem Engn, 77 Massachusetts Ave, Cambridge, MA 02139 USA
Keywords
MOLECULAR COMPLEXITY; DRUG DISCOVERY; ACCESSIBILITY; RETROSYNTHESIS; INFORMATION; DEFINITION; PREDICTION; CHEMISTRY
DOI
10.1021/acs.jcim.7b00622
Chinese Library Classification (CLC) number
R914 [Medicinal Chemistry]
Discipline classification code
100701
Abstract
Several definitions of molecular complexity exist to facilitate prioritization of lead compounds, to identify diversity-inducing and complexifying reactions, and to guide retrosynthetic searches. In this work, we focus on synthetic complexity and reformalize its definition to correlate with the expected number of reaction steps required to produce a target molecule, with implicit knowledge about what compounds are reasonable starting materials. We train a neural network model on 12 million reactions from the Reaxys database to impose a pairwise inequality constraint enforcing the premise of this definition: that on average, the products of published chemical reactions should be more synthetically complex than their corresponding reactants. The learned metric (SCScore) exhibits highly desirable nonlinear behavior, particularly in recognizing increases in synthetic complexity throughout a number of linear synthetic routes.
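The pairwise inequality constraint described in the abstract can be illustrated with a small sketch. The abstract does not state the exact loss function, so the hinge form, the `margin` parameter, and the function name `pairwise_hinge_loss` below are assumptions made for illustration only. The idea shown is that each (reactant, product) pair is penalized only when the product's score fails to exceed the reactant's score by at least the margin.

```python
def pairwise_hinge_loss(reactant_scores, product_scores, margin=0.25):
    """Mean hinge penalty over (reactant, product) score pairs.

    Hypothetical illustration: a pair contributes zero when
    product_score >= reactant_score + margin, and a positive
    penalty otherwise. The margin value is an assumption.
    """
    if len(reactant_scores) != len(product_scores):
        raise ValueError("score lists must have the same length")
    if not reactant_scores:
        raise ValueError("need at least one pair")
    penalties = [
        max(0.0, margin + r - p)
        for r, p in zip(reactant_scores, product_scores)
    ]
    return sum(penalties) / len(penalties)


# Two toy pairs: the first satisfies the constraint, the second does not.
loss = pairwise_hinge_loss([1.0, 2.0], [2.0, 2.0], margin=0.25)
```

In a trained model the scores would come from a neural network applied to a molecular representation, and this penalty would be minimized over the reaction corpus. Here the scores are plain numbers so the constraint itself is easy to inspect.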
Pages: 252-261
Page count: 10