Classification of biomass reactions and predictions of reaction energies through machine learning

Cited by: 7
Authors
Chang, Chaoyi [1]
Medford, Andrew J. [1]
Affiliations
[1] Georgia Inst Technol, Sch Chem & Biomol Engn, Atlanta, GA 30332 USA
Keywords
GROUP ADDITIVITY; CATALYST DESIGN; GLYCEROL; OXIDATION; HYDRODEOXYGENATION; ADSORPTION; RULES;
DOI
10.1063/5.0014828
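Chinese Library Classification number
O64 [Physical chemistry (theoretical chemistry), chemical physics]
Subject classification codes
070304; 081704
Abstract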
Elementary steps and intermediate species of linearly structured biomass compounds are studied. Specifically, possible intermediates and elementary reactions of 15 key biomass compounds and 33 small molecules are obtained from a recursive bond-breaking algorithm. These are used as inputs to the unsupervised Mol2Vec algorithm to generate vector representations of all intermediates and elementary reactions. The vector descriptors are used to identify sub-classes of elementary steps, and linear discriminant analysis is used to accurately identify the reaction type and reduce the dimension of the vectors. The resulting descriptors are applied to predict gas-phase reaction energies using linear regression with accuracies that exceed the commonly employed group additivity approach. They are also applied to quantitatively assess model compound similarity, and the results are consistent with chemical intuition. This workflow for creating vector representations of complex molecular systems requires no input from electronic structure calculations, and it is expected to be applicable to other similar systems where vector representations are needed.
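The recursive bond-breaking step mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes a molecule is given as a plain graph of atom indices and bonds, treats every bond as a single edge (bond order is ignored), and identifies fragments by their atom indices rather than by a canonical chemical identifier. The names `connected_components`, `enumerate_bond_breaking`, and `formula` are hypothetical helpers introduced only for this example.

```python
from collections import Counter, deque


def connected_components(atoms, bonds):
    """Split a set of atom indices into connected fragments (breadth-first search)."""
    adjacency = {a: set() for a in atoms}
    for i, j in bonds:
        adjacency[i].add(j)
        adjacency[j].add(i)
    seen, components = set(), []
    for start in sorted(atoms):
        if start in seen:
            continue
        seen.add(start)
        queue, component = deque([start]), set()
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        components.append(frozenset(component))
    return components


def enumerate_bond_breaking(atoms, bonds):
    """Break one bond at a time, recursing into every fragment produced.

    A fragment is a pair (atom indices, bonds). Returns the set of all
    fragments reached and the set of (reactant, products) bond-breaking steps.
    """
    root = (frozenset(atoms), frozenset(frozenset(b) for b in bonds))
    intermediates, reactions = set(), set()
    stack = [root]
    while stack:
        fragment = stack.pop()
        if fragment in intermediates:
            continue  # already expanded
        intermediates.add(fragment)
        frag_atoms, frag_bonds = fragment
        for bond in frag_bonds:
            remaining = frag_bonds - {bond}
            products = []
            for comp in connected_components(frag_atoms, remaining):
                # Keep only the bonds that lie entirely inside this component.
                comp_bonds = frozenset(b for b in remaining if b <= comp)
                products.append((comp, comp_bonds))
            reactions.add((fragment, frozenset(products)))
            stack.extend(products)
    return intermediates, reactions


def formula(fragment, elements):
    """Crude label: element counts only. Isomers would share a label."""
    atoms, _ = fragment
    counts = Counter(elements[a] for a in atoms)
    return "".join(s + (str(n) if n > 1 else "") for s, n in sorted(counts.items()))


# Usage: water as a three-atom chain H(0)-O(1)-H(2).
elements = {0: "H", 1: "O", 2: "H"}
intermediates, reactions = enumerate_bond_breaking([0, 1, 2], [(0, 1), (1, 2)])
distinct_formulas = sorted({formula(f, elements) for f in intermediates})
print(len(intermediates), len(reactions), distinct_formulas)
```

Because fragments are keyed by atom index, the two O-H fragments of water are counted separately. A practical workflow would presumably merge chemically equivalent fragments with a canonical identifier such as a canonical SMILES string, which this sketch only approximates with element counts.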
Pages: 12