Search for predictive generic model of aqueous solubility using Bayesian neural nets

Cited by: 100
Authors
Bruneau, P [1]
Affiliations
[1] AstraZeneca Ctr Rech, F-51689 Reims, France
Source
JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES | 2001 / Vol. 41 / Issue 06
DOI
10.1021/ci010363y
Chinese Library Classification number
O6 [Chemistry]
Discipline classification code
0703
Abstract
Several predictive models of aqueous solubility have been published. They perform well on the data sets used to train them, but these data sets usually contain few structures similar to those of interest in drug research, so their applicability in drug hunting is questionable. A very diverse data set has been gathered from compounds reported in the literature and from proprietary compounds. These compounds have been grouped into a literature data set I, a proprietary data set II, and a mixed data set III formed by combining I and II. About 100 descriptors emphasizing surface properties were calculated for every compound. Bayesian learning of neural nets, which retains the advantages of neural nets without their weaknesses, was used to select the most parsimonious models and to train them on I, II, and III. The models were established either by selecting the most efficient descriptors one by one using a modified Gram-Schmidt procedure (GS) or by simplifying the most complete model using automatic relevance determination (ARD). The predictive ability of the models was assessed using validation data sets as unrelated to the training sets as possible, with the help of two new parameters: NDD(x,ref), the normalized smallest descriptor distance of a compound x to a reference data set, and CD(x,mod), the combination of NDD(x,ref) with the dispersion of the Bayesian neural net calculations. The results show that a generic predictive model can be obtained from database I, that the diversity of database II is too restricted to give a model with good generalization ability, and that the ARD method applied to the mixed database III gives the best predictive model.
Pages: 1605-1616
Number of pages: 12
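Illustrative sketch (not from the paper): the abstract defines NDD(x,ref) only as the normalized smallest descriptor distance of a compound x to a reference data set, without giving the formula. The Python below shows one possible reading. The Euclidean metric, the standardization of each descriptor by the reference mean and standard deviation, and the normalization by the mean nearest-neighbour distance inside the reference set are all assumptions, and the function names `standardize`, `scale`, `smallest_distance`, and `ndd` are hypothetical.

```python
import math


def standardize(ref):
    """Per-descriptor mean and standard deviation over the reference set."""
    n, dims = len(ref), len(ref[0])
    means = [sum(row[j] for row in ref) / n for j in range(dims)]
    stds = []
    for j in range(dims):
        var = sum((row[j] - means[j]) ** 2 for row in ref) / n
        stds.append(math.sqrt(var) or 1.0)  # constant descriptor: avoid dividing by 0
    return means, stds


def scale(vec, means, stds):
    """Express a descriptor vector in reference-set standard units."""
    return [(v - m) / s for v, m, s in zip(vec, means, stds)]


def smallest_distance(x, others):
    """Euclidean distance from x to its nearest neighbour in `others`."""
    return min(math.dist(x, o) for o in others)


def ndd(x, ref):
    """Assumed form of NDD(x, ref): smallest distance from x to the reference
    set, divided by the mean nearest-neighbour distance within that set."""
    means, stds = standardize(ref)
    ref_scaled = [scale(r, means, stds) for r in ref]
    d_min = smallest_distance(scale(x, means, stds), ref_scaled)
    # Assumption: "normalized" means relative to the typical spacing of ref.
    nn = [
        smallest_distance(r, ref_scaled[:i] + ref_scaled[i + 1:])
        for i, r in enumerate(ref_scaled)
    ]
    typical = sum(nn) / len(nn)
    if typical == 0:
        return 0.0 if d_min == 0 else math.inf
    return d_min / typical


# Toy reference set with two made-up descriptors per compound.
reference = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
inside = ndd([0.0, 0.0], reference)   # compound already in the reference set
outside = ndd([3.0, 0.0], reference)  # compound far from every reference compound
```

Under these assumptions, a value near 0 means the compound closely resembles something in the reference set, while a value well above 1 means it lies farther away than reference compounds typically lie from one another, so a prediction for it would be an extrapolation.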