Current Trends in Ligand-Based Virtual Screening: Molecular Representations, Data Mining Methods, New Application Areas, and Performance Evaluation

Cited by: 255
Authors
Geppert, Hanna [1]
Vogt, Martin [1]
Bajorath, Juergen [1]
Affiliation
[1] Univ Bonn, Dept Life Sci Informat, B IT, Unit Chem Biol & Med Chem,LIMES Program, D-53113 Bonn, Germany
Keywords
SUPPORT VECTOR MACHINES; AIDED CHEMICAL BIOLOGY; CAMD TECHNIQUE PERFORMANCE; FORMAL CONCEPT ANALYSIS; DATA FUSION; SEARCH PERFORMANCE; SCALING INCREASES; ACTIVE COMPOUNDS; 2D FINGERPRINTS; SIMILARITY;
DOI
10.1021/ci900419k
Chinese Library Classification number
R914 [Medicinal Chemistry];
Discipline classification code
100701;
Abstract
The increasing relevance and use of data mining methods in virtual compound screening are examined. Virtual screening has evolved from traditional similarity searching, which uses single reference compounds, into an advanced application domain for data mining and machine learning methods that require compound reference sets of increasing size and high information content for training. Three approaches are highlighted: SVM learning, Bayesian methods, and decision trees. Support vector machines represent a relatively new data mining methodology that became popular during the 1990s based on the work of Vapnik and Cortes. Bayesian methods generally rely on estimating the probability distributions of numerical representations of compounds based on property descriptors or fingerprints. Decision trees continue to be applied in chemoinformatics, especially in the form of ensemble-based random forest models.
Pages: 205-216
Page count: 12