Ensemble of linear models for predicting drug properties

Cited by: 25
Authors
Arodz, T [1]
Yuen, DA
Dudek, AZ
Affiliations
[1] AGH Univ Sci & Technol, Inst Comp Sci, PL-30059 Krakow, Poland
[2] Univ Minnesota, Minnesota Supercomp Inst, Minneapolis, MN 55455 USA
[3] Univ Minnesota, Sch Med, Dept Med, Minneapolis, MN 55455 USA
DOI
10.1021/ci050375+
Chinese Library Classification number
R914 [Medicinal chemistry];
Discipline classification code
100701;
Abstract
We propose a new classification method for the prediction of drug properties, called random feature subset boosting for linear discriminant analysis (LDA). The main novelty of this method is the ability to overcome the problems with constructing ensembles of linear discriminant models based on generalized eigenvectors of covariance matrices. Such linear models are popular in building classification-based structure-activity relationships. The introduction of ensembles of LDA models allows for an analysis of more complex problems than by using single LDA, for example, those involving multiple mechanisms of action. Using four data sets, we show experimentally that the method is competitive with other recently studied chemoinformatic methods, including support vector machines and models based on decision trees. We present an easy scheme for interpreting the model despite its apparent sophistication. We also outline theoretical evidence as to why, contrary to the conventional AdaBoost ensemble algorithm, this method is able to increase the accuracy of LDA models.
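The abstract names the ingredients (LDA base models, random feature subsets, boosting) but this record does not give the algorithm itself. The following is a minimal sketch of how those pieces could fit together, assuming a two-class problem with labels in {-1, +1}, an AdaBoost-style reweighting scheme, and a weighted Fisher discriminant as the base learner. The function names, the `ridge` regularizer, and the rule for skipping weak rounds are illustrative assumptions, not the authors' published procedure.

```python
import numpy as np

def fit_weighted_lda(X, y, w, ridge=1e-6):
    """Two-class Fisher discriminant with sample weights w (labels in {-1, +1}).
    Assumes both classes are present; ridge is a hypothetical stabilizer."""
    w = w / w.sum()
    pos, neg = y == 1, y == -1
    mu_p = np.average(X[pos], axis=0, weights=w[pos])
    mu_n = np.average(X[neg], axis=0, weights=w[neg])
    # Weighted pooled within-class covariance estimate
    Xc = np.where(pos[:, None], X - mu_p, X - mu_n)
    S = (Xc * w[:, None]).T @ Xc + ridge * np.eye(X.shape[1])
    direction = np.linalg.solve(S, mu_p - mu_n)
    # Midpoint threshold shifted by the weighted class-prior log ratio
    bias = -0.5 * direction @ (mu_p + mu_n) + np.log(w[pos].sum() / w[neg].sum())
    return direction, bias

def fit_ensemble(X, y, n_rounds=20, subset_size=2, seed=0):
    """Boosting loop in which each round fits LDA on a random feature subset."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.full(n, 1.0 / n)
    models = []
    for _ in range(n_rounds):
        feats = rng.choice(d, size=min(subset_size, d), replace=False)
        direction, bias = fit_weighted_lda(X[:, feats], y, w)
        pred = np.where(X[:, feats] @ direction + bias >= 0, 1, -1)
        err = np.clip(w[pred != y].sum(), 1e-10, 1 - 1e-10)
        if err >= 0.5:
            continue  # no better than chance on the weighted sample; try another subset
        alpha = 0.5 * np.log((1 - err) / err)
        models.append((feats, direction, bias, alpha))
        # AdaBoost-style update: raise the weight of misclassified samples
        w = w * np.exp(-alpha * y * pred)
        w /= w.sum()
    return models

def predict(models, X):
    """Weighted vote of the base models; ties go to the positive class."""
    score = np.zeros(X.shape[0])
    for feats, direction, bias, alpha in models:
        score += alpha * np.where(X[:, feats] @ direction + bias >= 0, 1, -1)
    return np.where(score >= 0, 1, -1)
```

Because each base model sees only a few descriptors, the stored `feats` and `alpha` values give a rough view of which descriptors carry the vote, which is one plausible reading of the interpretability the abstract mentions.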
Pages: 416-423
Number of pages: 8