Improving structure-based virtual screening by multivariate analysis of scoring data

Cited by: 112
Authors
Jacobsson, M [1]
Lidén, P
Stjernschantz, E
Boström, H
Norinder, U
Affiliations
[1] Biovitrum AB, Struct Chem, SE-11276 Stockholm, Sweden
[2] Uppsala Univ, Dept Med Chem, BMC, SE-75123 Uppsala, Sweden
[3] Compumine AB, SE-16440 Kista, Sweden
[4] Stockholm Univ, Dept Comp Sci & Syst, SE-16440 Kista, Sweden
[5] Royal Inst Technol, SE-16440 Kista, Sweden
[6] AstraZeneca R&D Sodertalje, SE-15185 Sodertalje, Sweden
DOI
10.1021/jm030896t
Chinese Library Classification number
R914 [Medicinal chemistry]
Discipline classification code
100701
Abstract
Three different multivariate statistical methods, PLS discriminant analysis, rule-based methods, and Bayesian classification, have been applied to multidimensional scoring data from four different target proteins: estrogen receptor alpha (ERalpha), matrix metalloprotease 3 (MMP3), factor Xa (fXa), and acetylcholine esterase (AChE). The purpose was to build classifiers able to discriminate between active and inactive compounds, given a structure-based virtual screen. Seven different scoring functions were used to generate the scoring matrices. The classifiers were compared to classical consensus scoring and single scoring functions. The classifiers show a superior performance, with rule-based methods being most effective. The precision of correctly predicting an active compound is about 90% for three of the targets and about 25% for acetylcholine esterase. On the basis of these results, a new two-stage approach is suggested for structure-based virtual screening where limited activity information is available.
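The classical consensus scoring baseline and the precision figure mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the vote-counting scheme (a compound gets one vote from each scoring function that ranks it in its top fraction), the assumption that lower scores are better, and the cutoff values are all hypothetical choices made for illustration.

```python
from typing import List, Sequence


def consensus_votes(scores: Sequence[Sequence[float]], top_fraction: float) -> List[int]:
    """Count, per compound, how many scoring functions rank it in their top fraction.

    scores[i][j] is the score of compound i under scoring function j.
    Lower scores are assumed to be better (an assumption of this sketch).
    """
    n_compounds = len(scores)
    n_functions = len(scores[0])
    n_top = max(1, int(n_compounds * top_fraction))
    votes = [0] * n_compounds
    for j in range(n_functions):
        # Rank compounds by this scoring function, best (lowest) first.
        ranked = sorted(range(n_compounds), key=lambda i: scores[i][j])
        for i in ranked[:n_top]:
            votes[i] += 1
    return votes


def precision(predicted: Sequence[bool], actual: Sequence[bool]) -> float:
    """Fraction of compounds predicted active that are truly active."""
    true_pos = sum(1 for p, a in zip(predicted, actual) if p and a)
    false_pos = sum(1 for p, a in zip(predicted, actual) if p and not a)
    total = true_pos + false_pos
    return true_pos / total if total else 0.0
```

A compound would then be predicted active when its vote count reaches a chosen threshold, and `precision` compares those predictions with known activity labels. The multivariate classifiers described in the abstract replace this fixed voting rule with a model trained on the same scoring matrix.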
Pages: 5781-5789
Number of pages: 9