Prediction of aqueous solubility for a diverse set of heteroatom-containing organic compounds using a quantitative structure-property relationship

Cited by: 74
Authors
Sutter, JM [1]
Jurs, PC [1]
Affiliations
[1] PENN STATE UNIV, DEPT CHEM, UNIVERSITY PK, PA 16802
Source
JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES | 1996 / Vol. 36 / Issue 01
DOI
10.1021/ci9501507
Chinese Library Classification number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
The primary goal of a quantitative structure-property relationship (QSPR) is to identify a set of structurally based numerical descriptors that can be mathematically linked to a property of interest. The descriptors fall into three categories: topological, electronic, and geometric. In this study, 140 organic compounds with diverse structures were split into a training set, a cross-validation set, and a prediction set. The training set was used to build multiple linear regression and computational neural network models, the cross-validation set was used to prevent overtraining of the neural network, and the prediction set was used to validate the mathematical models. A set of nine descriptors was found that effectively linked the aqueous solubility to each structure. However, the polychlorinated biphenyls (PCBs) had a large root-mean-square (rms) error associated with them. Therefore, models were also built using a training set that contained no PCBs. A set of nine descriptors was found that gave a significant improvement in the rms error of the training set as well as the prediction set.
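The workflow summarized in the abstract (a three-way split, a multiple linear regression model, a neural network whose training is monitored on the cross-validation set, and rms error as the figure of merit) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the descriptor values and property values are random synthetic stand-ins, the 112/14/14 split sizes are hypothetical, and the small tanh network with plain gradient descent is a generic substitute for whatever network and training algorithm the authors used. Only the counts of 140 compounds and nine descriptors come from the abstract.

```python
import numpy as np


def rms_error(y_true, y_pred):
    """Root-mean-square (rms) error between observed and predicted values."""
    return float(np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)))


def split_three_way(n, n_cv, n_pred, rng):
    """Shuffle indices 0..n-1 into training, cross-validation, and prediction sets."""
    idx = rng.permutation(n)
    return idx[n_cv + n_pred:], idx[:n_cv], idx[n_cv:n_cv + n_pred]


def fit_mlr(X, y):
    """Ordinary least squares with an intercept; returns [intercept, slopes...]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_mlr(coef, X):
    """Apply the fitted linear model to a descriptor matrix."""
    return coef[0] + X @ coef[1:]


def predict_net(params, X):
    """Forward pass of the one-hidden-layer tanh network."""
    W1, b1, W2, b2 = params
    return np.tanh(X @ W1 + b1) @ W2 + b2


def train_net(X_tr, y_tr, X_cv, y_cv, n_hidden=3, lr=0.05, epochs=500, seed=0):
    """Gradient-descent training of a one-hidden-layer network.

    The cross-validation set is never used for the gradient; it only selects
    the epoch whose weights are returned, which guards against overtraining.
    """
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.1, (X_tr.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, n_hidden)
    b2 = 0.0
    best_cv, best_params = np.inf, None
    n = len(y_tr)
    for _ in range(epochs):
        H = np.tanh(X_tr @ W1 + b1)
        err = H @ W2 + b2 - y_tr
        dH = np.outer(err, W2) * (1.0 - H ** 2)  # backpropagate through tanh
        W2 = W2 - lr * (H.T @ err) / n
        b2 = b2 - lr * err.mean()
        W1 = W1 - lr * (X_tr.T @ dH) / n
        b1 = b1 - lr * dH.mean(axis=0)
        cv = rms_error(y_cv, predict_net((W1, b1, W2, b2), X_cv))
        if cv < best_cv:  # remember the weights with the lowest CV error so far
            best_cv, best_params = cv, (W1.copy(), b1.copy(), W2.copy(), b2)
    return best_params, best_cv


# Synthetic stand-in data: 140 "compounds" x 9 "descriptors" (values are random).
rng = np.random.default_rng(42)
X = rng.normal(size=(140, 9))
y = X @ rng.normal(size=9) + 0.3 * rng.normal(size=140)

# Hypothetical split sizes; the abstract does not state them.
tr, cv, pr = split_three_way(140, n_cv=14, n_pred=14, rng=rng)

coef = fit_mlr(X[tr], y[tr])
params, best_cv = train_net(X[tr], y[tr], X[cv], y[cv])

for name, idx in (("training", tr), ("cross-validation", cv), ("prediction", pr)):
    print(name,
          "MLR rms:", rms_error(y[idx], predict_mlr(coef, X[idx])),
          "network rms:", rms_error(y[idx], predict_net(params, X[idx])))
```

In this sketch the prediction set plays no part in fitting or in choosing the network weights, so its rms error serves as the external check on both models, mirroring the role the abstract assigns to it.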
Pages: 100-107
Page count: 8