A hierarchical clustering approach for large compound libraries

Cited by: 41
Authors
Böcker, A
Derksen, S
Schmidt, E
Teckentrup, A
Schneider, G
Affiliations
[1] Goethe Univ Frankfurt, Inst Organ Chem & Chem Biol, D-60439 Frankfurt, Germany
[2] Boehringer Ingelheim Pharma GmbH & Co KG, Dept Lead Discovery, D-88397 Biberach Ad Riss, Germany
DOI
10.1021/ci0500029
Chinese Library Classification (CLC) number
R914 [Medicinal chemistry];
Discipline classification code
100701;
Abstract
A modified version of the k-means clustering algorithm was developed that is able to analyze large compound libraries. A distance threshold determined by plotting the sum of radii of leaf clusters was used as a termination criterion for the clustering process. Hierarchical trees were constructed that can be used to obtain an overview of the data distribution and inherent cluster structure. The approach is also applicable to ligand-based virtual screening with the aim of generating preferred screening collections or focused compound libraries. Retrospective analysis of two activity classes was performed: inhibitors of caspase 1 [interleukin 1 (IL1) cleaving enzyme, ICE] and glucocorticoid receptor ligands. The MDL Drug Data Report (MDDR) and Collection of Bioactive Reference Analogues (COBRA) databases served as the compound pool, for which binary trees were produced. Molecules were encoded by all Molecular Operating Environment 2D descriptors and topological pharmacophore atom types. Individual clusters were assessed for their purity and enrichment of actives belonging to the two ligand classes. Significant enrichment was observed in individual branches of the cluster tree. After clustering a combined database of MDDR, COBRA, and the SPECS catalog, it was possible to retrieve MDDR ICE inhibitors with new scaffolds using COBRA ICE inhibitors as seeds. A Java implementation of the clustering method is available via the Internet (http://www.modlab.de).
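The abstract describes binary trees built with a k-means variant and terminated by a distance threshold on leaf cluster radii. The following is a minimal sketch of that general idea, not the authors' implementation: it uses plain recursive bisecting 2-means on numeric descriptor vectors, and the names `two_means`, `build_tree`, `leaves`, and `max_radius` are hypothetical. The radius definition (largest distance from the centroid) and the per-leaf stopping rule are assumptions made for illustration.

```python
import math
import random


def centroid(points):
    """Mean vector of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def radius(points):
    """Assumed radius: largest Euclidean distance from centroid to a member."""
    c = centroid(points)
    return max(math.dist(c, p) for p in points)


def two_means(points, rng, iterations=20):
    """Plain 2-means (Lloyd iterations) used to bisect one cluster."""
    centers = rng.sample(points, 2)
    halves = ([], [])
    for _ in range(iterations):
        halves = ([], [])
        for p in points:
            nearer = 0 if math.dist(p, centers[0]) <= math.dist(p, centers[1]) else 1
            halves[nearer].append(p)
        if not halves[0] or not halves[1]:
            break  # degenerate assignment; caller treats the node as a leaf
        centers = [centroid(h) for h in halves]
    return halves


def build_tree(points, max_radius, rng):
    """Recursively bisect until a node's radius is at most max_radius."""
    node = {"size": len(points), "radius": radius(points), "children": []}
    if len(points) < 2 or node["radius"] <= max_radius:
        node["members"] = points
        return node
    left, right = two_means(points, rng)
    if not left or not right:  # e.g. both seeds were duplicate points
        node["members"] = points
        return node
    node["children"] = [build_tree(left, max_radius, rng),
                        build_tree(right, max_radius, rng)]
    return node


def leaves(node):
    """Collect the leaf nodes of the binary cluster tree."""
    if not node["children"]:
        return [node]
    return [leaf for child in node["children"] for leaf in leaves(child)]


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic 2D "descriptors": three blobs, purely illustrative data.
    data = [(rng.gauss(cx, 0.1), rng.gauss(cy, 0.1))
            for cx, cy in [(0, 0), (5, 5), (0, 5)] for _ in range(30)]
    # Sum of leaf radii for several thresholds, the quantity one would plot
    # to pick a termination threshold.
    for threshold in (0.1, 0.5, 1.0, 5.0):
        tree = build_tree(data, threshold, rng)
        found = leaves(tree)
        print(threshold, len(found), round(sum(l["radius"] for l in found), 3))
```

Scanning several values of `max_radius` and recording the sum of leaf radii gives the kind of curve the abstract refers to; how the threshold is then read off that curve is not specified here and is left to the reader.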
Pages: 807-815
Number of pages: 9