Bayesian Model Averaging for Ligand Discovery

Cited by: 9
Authors
Angelopoulos, Nicos [1]
Hadjiprocopis, Andreas [2]
Walkinshaw, Malcolm D. [1]
Affiliations
[1] Univ Edinburgh, Dept Biol Sci, Edinburgh EH8 9YL, Midlothian, Scotland
[2] Higher Tech Inst, Dept Comp, Nicosia, Cyprus
Funding
UK Biotechnology and Biological Sciences Research Council (BBSRC)
Keywords
SUPPORT VECTOR MACHINE; SOLUBILITY;
DOI
10.1021/ci900046u
Chinese Library Classification
R914 [Medicinal Chemistry]
Subject classification code
100701
Abstract
High-throughput screening (HTS) is now a standard approach used in the pharmaceutical industry to identify potential drug-like lead molecules. The analysis linking biological data with molecular properties is a major goal in both academic and pharmaceutical research. This paper presents a Bayesian analysis of high-dimensional descriptor data using Markov chain Monte Carlo (MCMC) simulations for learning classification trees as a novel method for pharmacophore and ligand discovery. We use experimentally determined binding affinity data with the protein pyruvate kinase to train and assess our model averaging algorithm and then apply it to a large database of over 3.7 million molecules. We compare the results of a number of variations on the central Bayesian theme with those of two neural network (NN) architectures and of support vector machines (SVM). The main Bayesian algorithm, in addition to achieving high specificity and sensitivity, also lends itself naturally to classifying test sets with missing data and to providing a ranking for the classified compounds. The approach has been used to select and rank potential biologically active compounds and could provide a powerful tool in compound testing.
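The abstract's central idea, averaging the predictions of many classification trees and ranking compounds by the averaged probability, can be illustrated with a minimal sketch. This is not the authors' implementation: the three hand-written trees stand in for trees that an MCMC sampler would draw from the posterior, the descriptor names `logp` and `mass` are hypothetical, and every sampled tree is given equal weight.

```python
from statistics import mean

# Hypothetical stand-ins for classification trees sampled by MCMC.
# Each maps a descriptor dict to an estimated P(active); a real sampler
# would produce many such trees, these three are fixed for illustration.
def tree_a(x):
    return 0.9 if x["logp"] > 2.0 else 0.2

def tree_b(x):
    return 0.8 if x["mass"] < 400 else 0.1

def tree_c(x):
    return 0.7 if x["logp"] > 1.0 and x["mass"] < 500 else 0.3

sampled_trees = [tree_a, tree_b, tree_c]

def averaged_probability(trees, x):
    """Average P(active) over the sampled trees (equal weight per sample)."""
    return mean(tree(x) for tree in trees)

def rank_compounds(trees, compounds):
    """Return (name, probability) pairs, most probable actives first."""
    scored = [(name, averaged_probability(trees, x))
              for name, x in compounds.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

def sensitivity_specificity(predicted, actual):
    """Compute (sensitivity, specificity) from parallel boolean lists.

    Assumes both classes occur in `actual`, so neither denominator is zero.
    """
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    tn = sum(1 for p, a in zip(predicted, actual) if not p and not a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical compounds with made-up descriptor values.
compounds = {
    "c1": {"logp": 3.0, "mass": 350},
    "c2": {"logp": 0.5, "mass": 600},
}
ranking = rank_compounds(sampled_trees, compounds)
```

Sorting by the averaged probability gives the kind of ranking the abstract mentions; thresholding that probability (for example at 0.5) would turn it into a classification whose sensitivity and specificity can be measured against known labels.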
Pages: 1547-1557
Page count: 11