Toward an optimal procedure for variable selection and QSAR model building

Cited by: 164
Authors
Yasri, A [1 ]
Hartsough, D [1 ]
Affiliations
[1] ArQule Inc, Computat Design Grp, Woburn, MA 01801 USA
Source
JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES | 2001 / Vol. 41 / No. 05
DOI
10.1021/ci010291a
Chinese Library Classification
O6 [Chemistry];
Discipline classification code
0703;
Abstract
In this work, we report the development of a novel QSAR technique that combines genetic algorithms and neural networks to select a subset of relevant descriptors and build the optimal neural network architecture for QSAR studies. The technique uses a neural network to relate the dependent property of interest to the descriptors preselected by the genetic algorithm. It differs from other variable selection techniques that combine genetic algorithms with neural networks in two main features: (1) The variable selection search performed by the genetic algorithm is not constrained to a defined number of descriptors. (2) The optimal neural network architecture is explored in parallel with the variable selection by dynamically modifying the size of the hidden layer. Using both artificial data and real biological data, we show that this technique can be used to build both classification and regression models and that it outperforms simpler variable selection techniques, mainly for nonlinear data sets. The results obtained on real data are compared to previous work using other modeling techniques. We also discuss some important issues in building QSAR models and good practices for QSAR studies.
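The two features listed in the abstract (a descriptor subset of unconstrained size, and a hidden-layer size that evolves alongside it) can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Candidate`, `evolve`, and `toy_fitness` are hypothetical, the genetic operators are generic choices, and `toy_fitness` is a stand-in for the validation performance of a neural network trained on the selected descriptors, which a real study would have to supply.

```python
import random
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class Candidate:
    """One GA individual (hypothetical encoding, assumed for illustration)."""
    mask: Tuple[bool, ...]  # True where a descriptor is selected; any count allowed
    hidden: int             # size of the hidden layer


def random_candidate(n_desc: int, max_hidden: int, rng: random.Random) -> Candidate:
    mask = tuple(rng.random() < 0.5 for _ in range(n_desc))
    return Candidate(mask, rng.randint(1, max_hidden))


def crossover(a: Candidate, b: Candidate, rng: random.Random) -> Candidate:
    # Uniform crossover on the mask; hidden size inherited from either parent.
    mask = tuple(x if rng.random() < 0.5 else y for x, y in zip(a.mask, b.mask))
    return Candidate(mask, rng.choice((a.hidden, b.hidden)))


def mutate(c: Candidate, max_hidden: int, rng: random.Random,
           p_flip: float = 0.1) -> Candidate:
    # Bit flips change the number of selected descriptors freely;
    # the hidden layer grows or shrinks by at most one unit.
    mask = tuple((not bit) if rng.random() < p_flip else bit for bit in c.mask)
    hidden = min(max_hidden, max(1, c.hidden + rng.choice((-1, 0, 1))))
    return Candidate(mask, hidden)


def evolve(fitness: Callable[[Candidate], float], n_desc: int, max_hidden: int = 8,
           pop_size: int = 30, generations: int = 40, seed: int = 0) -> Candidate:
    rng = random.Random(seed)
    pop = [random_candidate(n_desc, max_hidden, rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        elite = pop[: pop_size // 2]  # the best half survives unchanged
        children = [
            mutate(crossover(rng.choice(elite), rng.choice(elite), rng), max_hidden, rng)
            for _ in range(pop_size - len(elite))
        ]
        pop = elite + children
    return max(pop, key=fitness)


def toy_fitness(c: Candidate) -> float:
    # Stand-in score (assumption): rewards picking descriptors 0, 3, 5 and a
    # hidden layer of size 3. A real run would train and validate a network here.
    relevant = {0, 3, 5}
    selected = {i for i, bit in enumerate(c.mask) if bit}
    hits = len(selected & relevant)
    extras = len(selected - relevant)
    return hits - 0.5 * extras - 0.1 * abs(c.hidden - 3)


best = evolve(toy_fitness, n_desc=10)
print(best, toy_fitness(best))
```

Because the best half of each generation is carried over unchanged, the best score in the population never decreases from one generation to the next. Replacing `toy_fitness` with a cross-validated network score is the expensive step and is deliberately left out of this sketch.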
Pages: 1218-1227
Page count: 10
Related papers
57 entries in total
[41]   Genetic neural networks for quantitative structure-activity relationships: Improvements and application of benzodiazepine affinity for benzodiazepine/GABA(A) receptors [J].
So, SS ;
Karplus, M .
JOURNAL OF MEDICINAL CHEMISTRY, 1996, 39 (26) :5246-5256
[42]   Neural network studies .2. Variable selection [J].
Tetko, IV ;
Villa, AEP ;
Livingstone, DJ .
JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 1996, 36 (04) :794-803
[43]   NEURAL-NETWORK STUDIES .1. COMPARISON OF OVERFITTING AND OVERTRAINING [J].
TETKO, IV ;
LIVINGSTONE, DJ ;
LUIK, AI .
JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 1995, 35 (05) :826-833
[44]   CHANCE FACTORS IN STUDIES OF QUANTITATIVE STRUCTURE-ACTIVITY-RELATIONSHIPS [J].
TOPLISS, JG ;
EDWARDS, RP .
JOURNAL OF MEDICINAL CHEMISTRY, 1979, 22 (10) :1238-1244
[45]  
*U TX AUST LAB MOL, 1997, DIV V 3 0 2
[46]   Estimation of blood-brain barrier crossing of drugs using molecular size and shape, and H-bonding descriptors [J].
van de Waterbeemd, H ;
Camenisch, G ;
Folkers, G ;
Chretien, JR ;
Raevsky, OA .
JOURNAL OF DRUG TARGETING, 1998, 6 (02) :151-165
[47]  
VANDEWATERBEEME.H, 1995, CHEMOMETRICS METHODS, V2
[48]   NEURAL NETWORKS IN PHARMACODYNAMIC MODELING - IS CURRENT MODELING PRACTICE OF COMPLEX KINETIC SYSTEMS AT A DEAD END [J].
VENGPEDERSEN, P ;
MODI, NB .
JOURNAL OF PHARMACOKINETICS AND BIOPHARMACEUTICS, 1992, 20 (04) :397-412
[49]  
Verloop A, 1976, DRUG DESIGN, V7, P165, DOI 10.1016/B978-0-12-060307-7.50010-9
[50]   Development and validation of a novel variable selection technique with application to multidimensional quantitative structure-activity relationship studies [J].
Waller, CL ;
Bradley, MP .
JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 1999, 39 (02) :345-355