Machine learning of chemical reactivity from databases of organic reactions

Cited by: 25
Authors
Carrera, Goncalo V. S. M. [1]
Gupta, Sunil [1]
Aires-de-Sousa, Joao [1]
Affiliations
[1] Univ Nova Lisboa, Fac Ciencias & Tecnol, Dept Quim, REQUIMTE,CQFB, P-2829516 Caparica, Portugal
Keywords
MOLMAP; Chemical reactivity; Databases; Machine learning; Electrophilicity; GENOME-SCALE CLASSIFICATION; TROPOSPHERIC DEGRADATION; SKIN SENSITIZATION; METABOLIC REACTIONS; RATE CONSTANTS; RANDOM FOREST; IN-VITRO; PREDICTION; QSAR; ASSIGNMENT
DOI
10.1007/s10822-009-9275-2
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Databases of chemical reactions contain knowledge about the reactivity of specific reagents. Although information is generally available explicitly only for compounds reported to react, it is possible to derive information about substructures that do not react in the reported reactions. Both types of information (positive and negative) can be used to train machine learning models to predict whether a compound reacts with a specific reagent. The whole process was implemented with two databases of reactions, one involving BuNH2 as the reagent and the other NaCNBH3. Negative information was derived using MOLMAP molecular descriptors, and classification models were developed with Random Forests, also based on MOLMAP descriptors. MOLMAP descriptors were based exclusively on calculated physicochemical features of molecules. Correct predictions were achieved for approximately 90% of independent test sets. While NaCNBH3 is a selective reducing reagent widely used in organic synthesis, BuNH2 is a nucleophile that mimics the reactivity of the lysine side chain (involved in an initiating step of the mechanism leading to skin sensitization).
Pages: 419-429
Page count: 11
Related papers
35 in total
[1]  
[Anonymous], SELF ORG ASS MEMORY
[2]   Non-enzymatic glutathione reactivity and in vitro toxicity: A non-animal approach to skin sensitization [J].
Aptula, AO ;
Patlewicz, G ;
Roberts, DW ;
Schultz, TW .
TOXICOLOGY IN VITRO, 2006, 20 (02) :239-247
[3]   Skin sensitization: Reaction mechanistic applicability domains for structure-activity relationships [J].
Aptula, AO ;
Patlewicz, G ;
Roberts, DW .
CHEMICAL RESEARCH IN TOXICOLOGY, 2005, 18 (09) :1420-1426
[4]  
ATKINSON R, 1988, ENVIRON TOXICOL CHEM, V7, P435
[5]   Structure-activity relationship studies of chemical mutagens and carcinogens: Mechanistic investigations and prediction approaches [J].
Benigni, R .
CHEMICAL REVIEWS, 2005, 105 (05) :1767-1800
[6]   Random forests [J].
Breiman, L .
MACHINE LEARNING, 2001, 45 (01) :5-32
[7]  
BREIMAN L, 2004, FORTRAN
[8]   Ester hydrolysis rate constant prediction from quantum topological molecular similarity descriptors [J].
Chaudry, UA ;
Popelier, PLA .
JOURNAL OF PHYSICAL CHEMISTRY A, 2003, 107 (22) :4578-4582
[9]  
Clayden J., 2001, ORGANIC CHEM
[10]   Prediction of ozone tropospheric degradation rate constant of organic compounds by using artificial neural networks [J].
Fatemi, MH .
ANALYTICA CHIMICA ACTA, 2006, 556 (02) :355-363