Predictive QSAR Modeling workflow, model applicability domains, and virtual screening

Times cited: 368
Authors
Tropsha, Alexander [1, 2]
Golbraikh, Alexander [1, 2]
Affiliations
[1] Univ N Carolina, Sch Pharm, Lab Mol Modeling, Chapel Hill, NC 27599 USA
[2] Univ N Carolina, Sch Pharm, Carolina Ctr Exploratory Cheminformat Res, Chapel Hill, NC 27599 USA
Keywords
QSAR - quantitative structure-activity relationships; combi-QSAR - combinatorial QSAR; kNN - k nearest neighbors; SA - simulated annealing; PLS - partial least squares
DOI
10.2174/138161207782794257
Chinese Library Classification (CLC) number
R9 [Pharmacy];
Discipline classification code
1007;
Abstract
Quantitative Structure-Activity Relationship (QSAR) modeling has traditionally been applied as an evaluative approach, i.e., with a focus on developing retrospective and explanatory models of existing data. Model extrapolation was considered, if at all, only in a hypothetical sense, in terms of potential modifications of known biologically active chemicals that could improve the compounds' activity. This critical review re-examines the strategy and the output of modern QSAR modeling approaches. We provide examples and arguments suggesting that current methodologies may afford robust and validated models capable of accurately predicting compound properties for molecules not included in the training sets. We discuss a data-analytical modeling workflow developed in our laboratory that incorporates modules for combinatorial QSAR model development (i.e., using all possible binary combinations of available descriptor sets and statistical data modeling techniques), rigorous model validation, and virtual screening of available chemical databases to identify novel biologically active compounds. Our approach places particular emphasis on model validation and on the need to define model applicability domains in chemistry space. We present examples of studies in which the application of rigorously validated QSAR models to virtual screening identified computational hits that were confirmed by subsequent experimental investigations. The emerging focus of QSAR modeling on target property forecasting brings it forward as a predictive, as opposed to evaluative, modeling approach.
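The abstract names two mechanical steps that a small example can make concrete: enumerating every pairing of a descriptor set with a modeling technique (combinatorial QSAR), and restricting virtual screening to compounds inside a model's applicability domain. The sketch below is illustrative only and is not the authors' implementation. The descriptor-set names, the function names, and the specific domain rule (a cutoff equal to the mean plus `z` standard deviations of nearest-neighbor distances within the training set) are assumptions introduced here, since the abstract does not specify them.

```python
from itertools import product
from math import dist
from statistics import mean, pstdev

# Hypothetical placeholders: the abstract does not list specific descriptor sets.
DESCRIPTOR_SETS = ["descriptors_A", "descriptors_B", "descriptors_C"]
METHODS = ["kNN", "PLS"]


def combi_qsar_pairs(descriptor_sets, methods):
    """Enumerate every (descriptor set, modeling method) pair."""
    return list(product(descriptor_sets, methods))


def nearest_distances(point, others, k):
    """Return the k smallest Euclidean distances from point to others."""
    return sorted(dist(point, other) for other in others)[:k]


def domain_threshold(training, k=1, z=0.5):
    """Assumed cutoff: mean + z * std of k-nearest-neighbor distances
    computed among the training compounds themselves."""
    distances = []
    for i, point in enumerate(training):
        others = training[:i] + training[i + 1:]
        distances.extend(nearest_distances(point, others, k))
    return mean(distances) + z * pstdev(distances)


def in_domain(candidate, training, threshold, k=1):
    """True if the candidate's mean distance to its k nearest training
    compounds does not exceed the cutoff."""
    return mean(nearest_distances(candidate, training, k)) <= threshold


def screen(library, training, k=1, z=0.5):
    """Return names of library compounds inside the applicability domain."""
    threshold = domain_threshold(training, k, z)
    return [name for name, vector in library.items()
            if in_domain(vector, training, threshold, k)]


# Toy two-dimensional descriptor vectors, invented for illustration.
training_set = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
library = {"near": (0.5, 0.5), "far": (5.0, 5.0)}

print(len(combi_qsar_pairs(DESCRIPTOR_SETS, METHODS)))  # 3 sets x 2 methods = 6
print(screen(library, training_set))                    # ['near']
```

In a real workflow each enumerated pair would be trained and validated separately, and only compounds passing the domain check for a validated model would receive a prediction. Those steps are omitted here because they depend on descriptor software and data that the abstract does not describe.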
Pages: 3494-3504
Page count: 11