Managing bias in ROC curves

Cited by: 80
Authors
Clark, Robert D. [1]
Webster-Clark, Daniel J. [2]
Affiliations
[1] Tripos Informat Res Ctr, St Louis, MO 63144 USA
[2] Washington Univ, St Louis, MO 63130 USA
Keywords
early recognition; ROC AUC; virtual screening
DOI
10.1007/s10822-008-9181-z
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Two modifications to the standard use of receiver operating characteristic (ROC) curves for evaluating virtual screening methods are proposed. The first is to replace the linear plots usually used with semi-logarithmic ones (pROC plots), including when doing "area under the curve" (AUC) calculations. Doing so is a simple way to bias the statistic to favor identification of "hits" early in the recovery curve rather than late. A second suggested modification entails weighting each active based on the size of the lead series to which it belongs. Two weighting schemes are described: arithmetic, in which the weight for each active is inversely proportional to the size of the cluster from which it comes; and harmonic, in which weights are inversely proportional to the rank of each active within its class. Either scheme is able to distinguish biased from unbiased screening statistics, but the harmonically weighted AUC in particular emphasizes the ability to place representatives of each class of active early in the recovery curve.
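The two modifications described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy ranking, the floor of one decoy used to avoid log10(0), and the normalization by the sum of the weights are all assumptions made for illustration. The sketch treats the linear AUC as the weighted mean of (1 - f) over the actives, where f is the fraction of decoys ranked above an active, and the semi-logarithmic (pROC) AUC as the weighted mean of -log10(f).

```python
import math
from collections import Counter, defaultdict


def decoy_fractions(ranked_labels):
    """Fraction of decoys ranked above each active (list is ordered best-first)."""
    n_decoys = sum(1 for is_active in ranked_labels if not is_active)
    fractions, decoys_seen = [], 0
    for is_active in ranked_labels:
        if is_active:
            fractions.append(decoys_seen / n_decoys)
        else:
            decoys_seen += 1
    return fractions, n_decoys


def cluster_weights(active_clusters, scheme):
    """Weight per active; active_clusters lists cluster ids in ranked order.

    arithmetic: 1 / (size of the active's cluster)
    harmonic:   1 / (rank of the active within its own cluster)
    """
    sizes = Counter(active_clusters)
    seen = defaultdict(int)
    weights = []
    for cluster in active_clusters:
        seen[cluster] += 1
        if scheme == "arithmetic":
            weights.append(1.0 / sizes[cluster])
        elif scheme == "harmonic":
            weights.append(1.0 / seen[cluster])
        else:
            raise ValueError(f"unknown scheme: {scheme}")
    return weights


def weighted_auc(ranked_labels, weights=None, semilog=False):
    """Weighted linear AUC, or semi-log (pROC) AUC when semilog=True."""
    fractions, n_decoys = decoy_fractions(ranked_labels)
    if weights is None:
        weights = [1.0] * len(fractions)
    floor = 1.0 / n_decoys  # assumption: clamp so that log10(0) never occurs
    total = 0.0
    for f, w in zip(fractions, weights):
        if semilog:
            total += w * -math.log10(max(f, floor))
        else:
            total += w * (1.0 - f)
    # Assumption: normalize by the total weight so schemes are comparable.
    return total / sum(weights)


# Toy ranking (hypothetical): three actives from cluster "A" are found early,
# while the only active from cluster "B" is ranked last, after all 10 decoys.
labels = [True, False, True, True] + [False] * 9 + [True]
clusters = ["A", "A", "A", "B"]

linear_auc = weighted_auc(labels)
proc_auc = weighted_auc(labels, semilog=True)
arithmetic_proc = weighted_auc(
    labels, cluster_weights(clusters, "arithmetic"), semilog=True
)
harmonic_proc = weighted_auc(
    labels, cluster_weights(clusters, "harmonic"), semilog=True
)
```

In this toy case the unweighted scores are dominated by the well-populated cluster "A"; both weighting schemes lower the score because the single member of cluster "B" is recovered late, which is the kind of series-size bias the abstract says the weighting is meant to expose.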
Pages: 141-146
Page count: 6
Related papers
18 in total
[1]  
[Anonymous], 1983, Statistical methods
[2]   Comparing protein-ligand docking programs is difficult [J].
Cole, JC ;
Murray, CW ;
Nissink, JWM ;
Taylor, RD ;
Taylor, R .
PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS, 2005, 60 (03) :325-332
[3]   The maximum common substructure as a molecular depiction in a supervised classification context: Experiments in quantitative structure/biodegradability relationships [J].
Cuissart, B ;
Touffet, F ;
Crémilleux, B ;
Bureau, R ;
Rault, S .
JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 2002, 42 (05) :1043-1052
[4]  
Daniel W., 1978, APPL NONPARAMETRIC S
[5]  
EGAN JP, 1975, SIGNAL DETECTION THE
[6]   Aromatic interactions with phenylalanine 691 and cysteine 828: A concept for FMS-like tyrosine kinase-3 inhibition. Application to the discovery of a new class of potential antileukemia agents [J].
Furet, Pascal ;
Bold, Guido ;
Meyer, Thomas ;
Roesel, Johannes ;
Guagnano, Vito .
JOURNAL OF MEDICINAL CHEMISTRY, 2006, 49 (15) :4451-4454
[7]   Measuring CAMD technique performance: A virtual screening case study in the design of validation experiments [J].
Good, AC ;
Hermsmeier, MA ;
Hindle, SA .
JOURNAL OF COMPUTER-AIDED MOLECULAR DESIGN, 2004, 18 (7-9) :529-536
[8]  
GOOD AC, 2008, J COMPUT AIDED MOL D, V22
[9]   Glide: A new approach for rapid, accurate docking and scoring. 2. Enrichment factors in database screening [J].
Halgren, TA ;
Murphy, RB ;
Friesner, RA ;
Beard, HS ;
Frye, LL ;
Pollard, WT ;
Banks, JL .
JOURNAL OF MEDICINAL CHEMISTRY, 2004, 47 (07) :1750-1759
[10]  
Hamilton J.T., 1999, CALCULATING RISKS SP