Optimal Sparse Descriptor Selection for QSAR Using Bayesian Methods

Cited by: 66
Authors
Burden, F. R. [1]
Winkler, D. A.
Affiliations
[1] CSIRO Mol & Hlth Technol, Clayton, Vic 3168, Australia
Source
QSAR & COMBINATORIAL SCIENCE | 2009 / Vol. 28 / Issue 6-7
Keywords
Medicinal chemistry; Structure-activity relationships; Feature selection; Bayesian methods; Descriptors; ARTIFICIAL NEURAL-NETWORKS; SUPPORT VECTOR MACHINE; GENETIC ALGORITHMS; DRUG DISCOVERY; MOLECULAR DESCRIPTORS; VARIABLE SELECTION; MODELS; VALIDATION; PREDICTION; DOMAIN
DOI
10.1002/qsar.200810173
Chinese Library Classification (CLC) number
R914 [Medicinal chemistry]
Discipline classification code
100701
Abstract
Choosing a set of molecular descriptors (features) that is most relevant to a given biological response variable is a very important problem in QSAR that has not been solved in an optimal, robust way. It is an interesting and important class of mathematical problems, where the number of variables greatly outweighs the number of observations (grossly underdetermined systems). We have used two Bayesian approaches to carry out this task using a suite of QSAR data sets. We employed a specialized sparse Bayesian feature reduction method based on an EM algorithm with a Laplacian prior to select a small set of the most relevant descriptors for modeling the response variables from a much larger pool of possibilities. Having chosen the optimum descriptors in a supervised manner, we used a Bayesian regularized neural network to carry out nonlinear regression and derive robust parsimonious QSAR models for five drug data sets. Models were validated using independent test sets, and results compared with other contemporary descriptor selection methods. Issues around validating small QSAR data sets were also discussed in detail. The sparse feature selection algorithm proved to be an excellent, robust method for selecting descriptors for QSAR models, as it is supervised (descriptors chosen in a context-dependent manner), parsimonious (models not overly complex), and inherently interpretable. Coupled to a robust parsimonious nonlinear modeling method such as the Bayesian regularized neural net, the combination provides a means of optimally modeling the data, and allowing interpretation of the model in terms of the most relevant descriptors.
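Illustrative sketch (not taken from the paper): the abstract does not give the details of the sparse selection step, so the following Python example shows only one common way an EM-style algorithm with a Laplacian prior can select descriptors in an underdetermined linear model. It assumes the objective 0.5*||y - X b||^2 + lam*sum|b_i|, with the noise variance absorbed into the hypothetical parameter `lam`; the function name `sparse_laplace_em`, the synthetic data, and the 1e-3 selection threshold are all assumptions made for illustration, and the Bayesian regularized neural network stage is not covered.

```python
import numpy as np

def sparse_laplace_em(X, y, lam, n_iter=200):
    """MAP estimate under a Laplacian prior via EM-style reweighted ridge updates.

    Illustrative only: minimizes 0.5*||y - X b||^2 + lam*sum|b_i|.
    """
    n, p = X.shape
    XtX = X.T @ X
    Xty = X.T @ y
    # Start from a ridge solution so every coefficient is initially nonzero.
    beta = np.linalg.solve(XtX + lam * np.eye(p), Xty)
    for _ in range(n_iter):
        # E-step: the expected prior precision of coefficient i scales as
        # 1/|beta_i|. Working with u_i = sqrt(|beta_i|) avoids dividing by zero.
        u = np.sqrt(np.abs(beta))
        # M-step: solve (U XtX U + lam*I) z = U Xty, then set beta = U z.
        A = (u[:, None] * XtX) * u[None, :] + lam * np.eye(p)
        beta = u * np.linalg.solve(A, u * Xty)
    return beta

# Synthetic, underdetermined example: 40 "compounds", 100 "descriptors",
# only three of which actually drive the response.
rng = np.random.default_rng(0)
n, p = 40, 100
X = rng.standard_normal((n, p))
true_beta = np.zeros(p)
true_beta[[3, 17, 42]] = [3.0, -2.0, 1.5]
y = X @ true_beta + 0.1 * rng.standard_normal(n)

beta = sparse_laplace_em(X, y, lam=4.0)
selected = np.flatnonzero(np.abs(beta) > 1e-3)  # descriptors kept by the prior
print("selected descriptor indices:", selected)
```

In a sketch like this, the coefficients of irrelevant descriptors are driven toward zero over the iterations, so the surviving indices play the role of the selected descriptor subset that would then be passed to a separate nonlinear model.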
Pages: 645-653
Page count: 9