Virtual screening of molecular databases using a Support Vector Machine

Cited by: 205
Authors
Jorissen, RN [1]
Gilson, MK [1]
Affiliation
[1] Univ Maryland, Biotechnol Inst, Ctr Adv Res Biotechnol, Rockville, MD 20850 USA
DOI
10.1021/ci049641u
CLC number
R914 [Medicinal Chemistry]
Discipline classification code
100701
Abstract
The Support Vector Machine (SVM) is an algorithm that derives a model used for the classification of data into two categories and which has good generalization properties. This study applies the SVM algorithm to the problem of virtual screening for molecules with a desired activity. In contrast to typical applications of the SVM, we emphasize not classification but enrichment of actives by using a modified version of the standard SVM function to rank molecules. The method employs a simple and novel criterion for picking molecular descriptors and uses cross-validation to select SVM parameters. The resulting method is more effective at enriching for active compounds with novel chemistries than binary fingerprint-based methods such as binary kernel discrimination.
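To make the emphasis on ranking and enrichment concrete, here is a minimal sketch of how enrichment might be measured once every molecule has a score. It assumes a commonly used enrichment-factor definition (the active rate in the top-ranked fraction divided by the active rate in the whole database), which is not necessarily the measure used in the paper. The function name `enrichment_factor`, the scores, and the labels are all hypothetical; in practice the scores would come from the SVM-based ranking function.

```python
from typing import Sequence


def enrichment_factor(
    scores: Sequence[float], is_active: Sequence[bool], top_fraction: float
) -> float:
    """Active rate in the top-ranked subset divided by the overall active rate."""
    if len(scores) != len(is_active) or not scores:
        raise ValueError("scores and is_active must be non-empty and equal in length")
    if not 0.0 < top_fraction <= 1.0:
        raise ValueError("top_fraction must be in (0, 1]")
    total_actives = sum(is_active)
    if total_actives == 0:
        raise ValueError("at least one active is required")
    # Rank molecule indices from highest to lowest score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    # Size of the top-ranked subset; always inspect at least one molecule.
    n_top = max(1, round(top_fraction * len(scores)))
    actives_in_top = sum(is_active[i] for i in order[:n_top])
    return (actives_in_top / n_top) / (total_actives / len(scores))


# Hypothetical scores for ten molecules; a higher score means "more likely active".
scores = [2.1, -0.3, 1.7, -1.2, 0.4, -0.8, 0.9, -2.0, -0.1, -1.5]
labels = [True, False, True, False, False, False, True, False, False, False]
ef_top20 = enrichment_factor(scores, labels, 0.2)
```

An enrichment factor above 1 means the top of the ranked list holds a higher proportion of actives than the database as a whole, which is the goal of a screen that ranks molecules rather than classifying them.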
Pages: 549-561
Page count: 13
Related papers
32 in total
  • [1] Integration of virtual and high-throughput screening
    Bajorath, J
    [J]. NATURE REVIEWS DRUG DISCOVERY, 2002, 1 (11) : 882 - 894
  • [2] Bajorath J., 2002, Curr. Drug Discov., V3, P24
  • [3] Drug design by machine learning: support vector machines for pharmaceutical data analysis
    Burbidge, R
    Trotter, M
    Buxton, B
    Holden, S
    [J]. COMPUTERS & CHEMISTRY, 2001, 26 (01) : 5 - 14
  • [4] A tutorial on Support Vector Machines for pattern recognition
    Burges, CJC
    [J]. DATA MINING AND KNOWLEDGE DISCOVERY, 1998, 2 (02) : 121 - 167
  • [5] CHANG CC, 2001, LIBSVM LIBR SUPPORT
  • [6] CORTES C, 1995, MACH LEARN, V20, P273, DOI 10.1023/A:1022627411411
  • [7] JChem: Java applets and modules supporting chemical database handling from web browsers
    Csizmadia, F
    [J]. JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 2000, 40 (02) : 323 - 324
  • [8] Description of several chemical-structure file formats used by computer-programs developed at Molecular Design Limited
    Dalby, A
    Nourse, JG
    Hounshell, WD
    Gushurst, AKI
    Grier, DL
    Leland, BA
    Laufer, J
    [J]. JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 1992, 32 (03) : 244 - 255
  • [9] Similarity searching in files of three-dimensional chemical structures: Evaluation of the EVA descriptor and combination of rankings using data fusion
    Ginn, CMR
    Turner, DB
    Willett, P
    Ferguson, AM
    Heritage, TW
    [J]. JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 1997, 37 (01) : 23 - 37
  • [10] Rational selection of training and test sets for the development of validated QSAR models
    Golbraikh, A
    Shen, M
    Xiao, ZY
    Xiao, YD
    Lee, KH
    Tropsha, A
    [J]. JOURNAL OF COMPUTER-AIDED MOLECULAR DESIGN, 2003, 17 (02) : 241 - 253