Estimation of aqueous solubility of organic compounds with QSPR approach

Cited by: 64
Authors
Gao, H [1]
Shanmugasundaram, V [1]
Lee, P [1]
Affiliation
[1] Pharmacia, Comp Aided Drug Discovery, Kalamazoo, MI 49007 USA
Keywords
solubility; genetic algorithm; diversity; principal component regression; AquaSol database
DOI
10.1023/A:1015103914543
Chinese Library Classification (CLC) number
O6 [Chemistry]
Discipline classification code
0703
Abstract
Purpose. To derive a QSPR model for estimating the aqueous solubility of organic compounds. Methods. Solubility data for 930 diverse compounds were analyzed by principal component regression. The set consists of pharmaceuticals, pollutants, nutrients, herbicides, and pesticides. Its diversity was analyzed using MACCS fingerprints and BCUT chemistry space. Results. The training set of the solubility data is as diverse as the Available Chemicals Directory and more diverse than the MDL Drug Data Report. Forty-six molecular descriptors were screened using a genetic algorithm. The derived QSPR model has a squared correlation coefficient (r²) of 0.92, a root mean square error of 0.53 log molar solubility units (log S_w), an average absolute estimation error of 0.36 log S_w, and a cross-validated q² of 0.91. The model was validated with a test set of 249 compounds not included in the training set; the absolute estimation error for this test set was 0.39 log S_w. Conclusions. A highly predictive QSPR model for estimating aqueous solubility was derived and validated. The model can be used to estimate aqueous solubility for virtual screening and combinatorial library design.
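The fit statistics quoted in the abstract (r², root mean square error, average absolute error, cross-validated q²) can be illustrated with a minimal sketch of principal component regression. This is not the authors' implementation. The data are synthetic, the function names (pcr_fit, fit_statistics, loo_q2) are hypothetical, leave-one-out is only one possible cross-validation scheme, and the genetic-algorithm descriptor screening is omitted. The paper may also define its error terms with different degrees of freedom.

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """Fit principal component regression; return a predict function."""
    mean, std = X.mean(axis=0), X.std(axis=0)
    std[std == 0] = 1.0  # guard against constant descriptors
    Z = (X - mean) / std
    # Rows of Vt are the principal axes of the standardized descriptors.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    V = Vt[:n_components].T
    scores = Z @ V
    # Ordinary least squares on the component scores, with an intercept.
    A = np.column_stack([np.ones(len(y)), scores])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)

    def predict(X_new):
        s = ((X_new - mean) / std) @ V
        return coef[0] + s @ coef[1:]

    return predict

def fit_statistics(y_obs, y_pred):
    """Return (r2, rmse, aae) for observed vs. estimated values."""
    resid = y_obs - y_pred
    r2 = np.corrcoef(y_obs, y_pred)[0, 1] ** 2
    rmse = np.sqrt(np.mean(resid ** 2))
    aae = np.mean(np.abs(resid))
    return r2, rmse, aae

def loo_q2(X, y, n_components):
    """Leave-one-out q2 = 1 - PRESS / total sum of squares."""
    preds = np.empty_like(y)
    for i in range(len(y)):
        mask = np.arange(len(y)) != i
        model = pcr_fit(X[mask], y[mask], n_components)
        preds[i] = model(X[i:i + 1])[0]
    press = np.sum((y - preds) ** 2)
    return 1.0 - press / np.sum((y - y.mean()) ** 2)

# Synthetic stand-in for a descriptor matrix and log solubility values
# (assumed data, not the 930-compound set from the paper).
rng = np.random.default_rng(0)
latent = rng.normal(size=(60, 2))
X = latent @ rng.normal(size=(2, 5)) + 0.05 * rng.normal(size=(60, 5))
y = latent @ np.array([1.5, -1.0]) + 0.1 * rng.normal(size=60)

model = pcr_fit(X, y, n_components=2)
r2, rmse, aae = fit_statistics(y, model(X))
q2 = loo_q2(X, y, n_components=2)
print(f"r2={r2:.3f} rmse={rmse:.3f} aae={aae:.3f} q2={q2:.3f}")
```

A q² close to r², as in the abstract (0.91 against 0.92), suggests that the fit is not driven by a few influential compounds. The external test set reported in the abstract is a separate and stronger check.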
Pages: 497-503
Page count: 7