Community benchmarks for virtual screening

Cited by: 127
Author
Irwin, John J. [1]
Affiliation
[1] Univ Calif San Francisco, Dept Pharmaceut Chem, San Francisco, CA 94158 USA
Keywords
virtual screening; benchmarking; enrichment; decoys
DOI
10.1007/s10822-008-9189-4
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Ligand enrichment among top-ranking hits is a key metric of virtual screening. To avoid bias, decoys should resemble ligands physically, so that enrichment is not attributable to simple differences of gross features. We therefore created a directory of useful decoys (DUD) by selecting decoys that resembled annotated ligands physically but not topologically to benchmark docking performance. DUD has 2950 annotated ligands and 95,316 property-matched decoys for 40 targets. It is by far the largest and most comprehensive public data set for benchmarking virtual screening programs that I am aware of. This paper outlines several ways that DUD can be improved to provide better telemetry to investigators seeking to understand both the strengths and the weaknesses of current docking methods. I also highlight several pitfalls for the unwary: a risk of over-optimization, questions about chemical space, and the proper scope for using DUD. Careful attention to both the composition of benchmarks and how they are used is essential to avoid being misled by overfitting and bias.
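The abstract names ligand enrichment among top-ranking hits as the key metric. As an illustration only, the sketch below computes one common form of it, the enrichment factor at a chosen top fraction of a ranked list. The function name, the "lower score ranks better" convention, and the toy data are assumptions made for this example; they are not taken from the paper or from DUD.

```python
def enrichment_factor(scores, is_ligand, fraction=0.01):
    """Enrichment factor at the top `fraction` of a ranked screening list.

    Assumption: a lower (more negative) score means a better rank,
    as is typical for docking energies. `is_ligand` holds 1/True for
    annotated ligands and 0/False for decoys.
    """
    if len(scores) != len(is_ligand) or not scores:
        raise ValueError("scores and is_ligand must be non-empty and equal in length")
    total = len(scores)
    total_ligands = sum(1 for flag in is_ligand if flag)
    if total_ligands == 0:
        raise ValueError("at least one ligand is required")

    # Number of molecules in the top slice (at least one).
    n_top = max(1, round(fraction * total))

    # Rank by score only; ties keep their input order.
    ranked = sorted(zip(scores, is_ligand), key=lambda pair: pair[0])
    ligands_in_top = sum(1 for _, flag in ranked[:n_top] if flag)

    # Ligand rate in the top slice divided by the ligand rate overall.
    return (ligands_in_top / n_top) / (total_ligands / total)


# Toy data (hypothetical): 2 ligands and 8 decoys, ligands scored best.
toy_scores = [-10, -9, -8, -7, -6, -5, -4, -3, -2, -1]
toy_labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
toy_ef = enrichment_factor(toy_scores, toy_labels, fraction=0.2)
```

A value of 1 corresponds to random ranking. Because the metric depends only on how ligands rank against decoys, decoys that differ from ligands in gross physical properties can inflate it, which is the bias that property-matched decoys are meant to reduce.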
Pages: 193-199
Page count: 7