A statistical framework to evaluate virtual screening

Cited by: 84
Authors
Zhao, Wei [1]
Hevener, Kirk E. [2]
White, Stephen W. [3, 4]
Lee, Richard E. [4]
Boyett, James M. [1]
Affiliations
[1] St Jude Childrens Res Hosp, Dept Biostat, Memphis, TN 38105 USA
[2] Univ Tennessee, Hlth Sci Ctr, Dept Pharmaceut Sci, Memphis, TN USA
[3] St Jude Childrens Res Hosp, Dept Biol Struct, Memphis, TN 38105 USA
[4] Univ Tennessee, Hlth Sci Ctr, Dept Mol Sci, Memphis, TN USA
Source
BMC BIOINFORMATICS | 2009 / Vol. 10
Keywords
OPERATING CHARACTERISTIC CURVES; MOLECULAR-DOCKING; LIGAND DOCKING; ENRICHMENT; STRATEGIES; DISCOVERY; PROTOCOLS; AREA; BIAS;
DOI
10.1186/1471-2105-10-225
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: The receiver operating characteristic (ROC) curve is widely used to evaluate virtual screening (VS) studies. However, it fails to address the "early recognition" problem specific to VS. Although many other metrics that emphasize "early recognition", such as RIE, BEDROC, and pROC, have been proposed, there are no rigorous statistical guidelines for determining thresholds and performing significance tests. Nor have these metrics been compared under a statistical framework to better understand their performance.

Results: We propose a statistical framework to evaluate VS studies in which the threshold for deciding whether a ranking method is better than random ranking is derived by bootstrap simulations, and two ranking methods are compared by a permutation test. We found that different metrics emphasize "early recognition" differently. BEDROC and RIE are two statistically equivalent metrics. Our newly proposed metric, SLR, is superior to pROC. Through extensive simulations, we observed a "seesaw effect": overemphasizing early recognition reduces the statistical power of a metric to detect true early recognition.

Conclusion: The statistical framework we developed and tested is applicable to any other metric as well, even if its exact distribution is unknown. Under this framework, a threshold can be easily selected according to a pre-specified type I error rate, and statistical comparisons between two ranking methods become possible. The theoretical null distribution of the SLR metric is available, so the SLR threshold can be determined exactly without resorting to bootstrap simulations, which makes it easy to use in practical virtual screening studies.
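
The two steps named in the abstract, a simulated threshold against random ranking and a permutation test between two ranking methods, can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the abstract does not give the procedures in detail, so random ranking is simulated here by shuffling the active/decoy labels, the two methods are compared by swapping each compound's pair of scores (which assumes both methods score on a comparable scale, with higher meaning better), and plain ROC AUC stands in for the metric. The function names `roc_auc`, `null_threshold`, and `permutation_test` are hypothetical.

```python
import random

def roc_auc(scores, labels):
    """ROC AUC as the fraction of (active, decoy) pairs in which the active
    scores higher; ties count one half. Higher score = better rank."""
    actives = [s for s, y in zip(scores, labels) if y]
    decoys = [s for s, y in zip(scores, labels) if not y]
    wins = sum((a > d) + 0.5 * (a == d) for a in actives for d in decoys)
    return wins / (len(actives) * len(decoys))

def null_threshold(metric, scores, labels, alpha=0.05, n_sim=1000, seed=0):
    """Upper (1 - alpha) quantile of the metric under random ranking.
    Assumption: random ranking is simulated by shuffling the labels."""
    rng = random.Random(seed)
    shuffled = list(labels)
    sims = []
    for _ in range(n_sim):
        rng.shuffle(shuffled)
        sims.append(metric(scores, shuffled))
    sims.sort()
    return sims[min(n_sim - 1, int((1 - alpha) * n_sim))]

def permutation_test(metric, scores_a, scores_b, labels, n_perm=1000, seed=0):
    """Two-sided paired permutation test for metric(A) - metric(B).
    Assumption: under the null the two methods are exchangeable, so each
    compound's (A, B) scores are swapped with probability 1/2."""
    rng = random.Random(seed)
    observed = metric(scores_a, labels) - metric(scores_b, labels)
    extreme = 0
    for _ in range(n_perm):
        perm_a, perm_b = [], []
        for a, b in zip(scores_a, scores_b):
            if rng.random() < 0.5:
                a, b = b, a
            perm_a.append(a)
            perm_b.append(b)
        diff = metric(perm_a, labels) - metric(perm_b, labels)
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (extreme + 1) / (n_perm + 1)

if __name__ == "__main__":
    # Synthetic data for illustration only: 5 actives among 50 compounds.
    rng = random.Random(1)
    labels = [1] * 5 + [0] * 45
    scores_a = [rng.gauss(1.5 * y, 1.0) for y in labels]
    scores_b = [rng.gauss(0.5 * y, 1.0) for y in labels]
    print("AUC A:", roc_auc(scores_a, labels))
    print("5% threshold:", null_threshold(roc_auc, scores_a, labels))
    print("A vs B:", permutation_test(roc_auc, scores_a, scores_b, labels))
```

Because both helpers take the metric as an argument, any other ranking metric with the same `(scores, labels)` signature could be passed in place of `roc_auc`, which mirrors the abstract's point that the framework does not depend on knowing a metric's exact distribution.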
Pages: 13