Optimal QSAR analysis of the carcinogenic activity of drugs by correlation ranking and genetic algorithm-based

Cited by: 62
Authors
Hemmateenejad, B [1]
Affiliations
[1] Shiraz Univ Med Sci, Med & Nat Prod Chem Res Ctr, Shiraz, Iran
Keywords
principal component regression (PCR); partial least squares (PLS); factor selection; correlation ranking; genetic algorithm; carcinogenic activity; drug
DOI
10.1002/cem.891
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The major problem associated with principal component regression (PCR), especially in QSAR studies, is that this model extracts the eigenvectors solely from the matrix of predictor variables and therefore they might not have an essentially good relationship with the predicted variable. This paper describes the application of PCR to model the structure-carcinogenic activity of drugs. To obtain the optimal model, correlation ranking and a genetic algorithm were employed for selecting the best set of principal components (PCs). A large data set containing 735 carcinogenic activities and 1355 descriptors was used. Two cross-validation procedures (leave-many-out and nu-fold cross-validation) and the hold-out-a-test-sample (HOTS) method were used to validate the models. It was found that introduction of PCs by the conventional eigenvalue ranking procedure did not produce the perfect model. Instead, factor selection by correlation ranking and genetic algorithm produced good models of similar quality. The models could explain more than 80% of the variances in carcinogenic activity. Copyright (C) 2005 John Wiley & Sons, Ltd.
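The difference between eigenvalue ranking and correlation ranking described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the function names `pcr_fit` and `r_squared`, the mean-centering without scaling, and the synthetic data are all assumptions made for illustration, and the genetic algorithm search is not covered. The sketch extracts principal components (PCs) from the predictor matrix alone, then enters them into the regression either in order of decreasing eigenvalue or in order of decreasing absolute correlation with the response.

```python
import numpy as np


def pcr_fit(X, y, n_components, ranking="correlation"):
    """Principal component regression with a choice of PC ranking (illustrative)."""
    # Mean-center predictors and response (assumption: no variance scaling).
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean

    # PCA via SVD: the PC scores are U * s, ordered by decreasing eigenvalue.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    keep = s > 1e-10 * s[0]  # discard numerically null components
    U, s, Vt = U[:, keep], s[keep], Vt[keep]

    # Correlation of each PC with y. Columns of U are unit-norm and mean-zero,
    # so the correlation reduces to a normalized dot product.
    corr = (U.T @ yc) / np.linalg.norm(yc)

    if ranking == "correlation":
        order = np.argsort(-np.abs(corr))
    elif ranking == "eigenvalue":
        order = np.arange(len(s))
    else:
        raise ValueError("ranking must be 'correlation' or 'eigenvalue'")
    selected = order[:n_components]

    # PCs are mutually orthogonal, so each regression coefficient is
    # obtained independently of the others.
    gamma = (U[:, selected].T @ yc) / s[selected]
    beta = Vt[selected].T @ gamma          # coefficients in descriptor space
    intercept = y_mean - x_mean @ beta
    return selected, beta, intercept


def r_squared(y, y_hat):
    """Fraction of the variance in y explained by the predictions y_hat."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Synthetic data (hypothetical): the response depends on a low-variance
    # direction of X, which eigenvalue ranking would enter last.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(80, 6)) * np.array([10.0, 6.0, 3.0, 2.0, 1.0, 0.5])
    y = 4.0 * X[:, 5] + 0.1 * rng.normal(size=80)

    for ranking in ("eigenvalue", "correlation"):
        selected, beta, intercept = pcr_fit(X, y, 1, ranking)
        print(ranking, selected, r_squared(y, X @ beta + intercept))
```

Because the PCs are orthogonal, a one-component model chosen by correlation ranking can never fit the training data worse than the one chosen by eigenvalue ranking. Fit on the training set alone does not establish predictive ability, which is why the abstract reports cross-validation and a hold-out test sample.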
Pages: 475-485
Number of pages: 11