Evaluation of machine-learning methods for ligand-based virtual screening

Cited by: 100
Authors
Chen, Beining
Harrison, Robert F.
Papadatos, George
Willett, Peter
Wood, David J.
Lewell, Xiao Qing
Greenidge, Paulette
Stiefl, Nikolaus
Affiliations
[1] Univ Sheffield, Krebs Inst Biomolec Res, Sheffield S1 4DP, S Yorkshire, England
[2] Univ Sheffield, Dept Informat Studies, Sheffield S1 4DP, S Yorkshire, England
[3] GlaxoSmithKline Res & Dev Ltd, Stevenage SG1 2NY, Herts, England
[4] Novartis Pharma AG, CH-4056 Basel, Switzerland
[5] Univ Sheffield, Krebs Inst Biomolec Res, Sheffield S10 2TN, S Yorkshire, England
[6] Univ Sheffield, Dept Informat Studies, Sheffield S10 2TN, S Yorkshire, England
[7] Univ Sheffield, Dept Automat Control & Syst Engn, Sheffield S1 3JD, S Yorkshire, England
[8] Univ Sheffield, Dept Chem, Sheffield S3 7HF, S Yorkshire, England
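Funding
UK Biotechnology and Biological Sciences Research Council (BBSRC); UK Engineering and Physical Sciences Research Council (EPSRC)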
Keywords
group fusion; kernel discrimination; ligand-based virtual screening; machine learning; naive Bayesian classifier; similarity searching; virtual screening
DOI
10.1007/s10822-006-9096-5
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification code
071010; 081704
Abstract
Machine-learning methods can be used for virtual screening by analysing the structural characteristics of molecules of known (in)activity, and we here discuss the use of kernel discrimination and naive Bayesian classifier (NBC) methods for this purpose. We report a kernel method that allows the processing of molecules represented by binary, integer and real-valued descriptors, and show that it is little different in screening performance from a previously described kernel that had been developed specifically for the analysis of binary fingerprint representations of molecular structure. We then evaluate the performance of an NBC when the training-set contains only a very few active molecules. In such cases, a simpler approach based on group fusion would appear to provide superior screening performance, especially when structurally heterogeneous datasets are to be processed.
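As an illustration of the group-fusion idea mentioned in the abstract, the following is a minimal sketch rather than the authors' implementation. The abstract does not specify the similarity measure or fusion rule, so the sketch assumes Tanimoto similarity on binary fingerprints and MAX fusion over the reference actives; the function names and the toy fingerprints are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

# Assumption: a binary fingerprint is stored as the set of its "on" bit positions.
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient between two binary fingerprints."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def group_fusion_rank(
    references: List[Fingerprint],
    database: Dict[str, Fingerprint],
) -> List[Tuple[str, float]]:
    """Rank database molecules by their MAX similarity to any reference active."""
    scores = {
        name: max(tanimoto(fp, ref) for ref in references)
        for name, fp in database.items()
    }
    # Highest fused score first; these are the molecules to screen first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy, made-up fingerprints purely for illustration.
actives = [frozenset({1, 2, 3, 4}), frozenset({10, 11, 12})]
library = {
    "mol_a": frozenset({1, 2, 3, 5}),   # resembles the first active
    "mol_b": frozenset({10, 11, 13}),   # resembles the second active
    "mol_c": frozenset({20, 21, 22}),   # resembles neither
}
ranking = group_fusion_rank(actives, library)
```

Because each database molecule is scored against its best-matching reference, a handful of structurally different actives can each retrieve their own neighbours, which is one plausible reading of why such an approach suits heterogeneous datasets with few known actives.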
Pages: 53-62
Page count: 10