Investigation of classification methods for the prediction of activity in diverse chemical libraries

Cited by: 34
Authors
Dixon, SL [1]
Villar, HO [1]
Affiliations
[1] Telik Inc, S San Francisco, CA 94080 USA
Keywords
chemoinformatics; classification; cluster analysis; discriminant analysis; recursive partitioning; topological descriptors; 2D structural keys
DOI
10.1023/A:1008061017938
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Classification methods based on linear discriminant analysis, recursive partitioning, and hierarchical agglomerative clustering are examined for their ability to separate active and inactive compounds in a diverse chemical database. Topology-based descriptions of chemical structure from the Molconn-X and ISIS programs are used in conjunction with these classification techniques to identify ACE inhibitors, beta-adrenergic antagonists, and H-2 receptor antagonists. Overall, discriminant analysis misclassifies the smallest number of active compounds, while recursive partitioning yields the lowest rate of misclassification among inactives. Binary structural keys from the ISIS package are found to generally outperform the whole-molecule Molconn-X descriptors, especially for identification of inactive compounds. For all targets and classification methods, sensitivity toward active compounds is increased by making repetitive classifications using training sets that contain equal numbers of actives and inactives. These balanced training sets provide an average numerical class membership score which may be used to select subsets of compounds that are enriched in actives.
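The balanced-training-set idea in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the function name `balanced_scores`, the number of rounds, the toy two-dimensional descriptors, and the nearest-centroid rule (used here as a simple stand-in for discriminant analysis, recursive partitioning, or clustering) are all hypothetical choices made for the example.

```python
import random


def centroid(rows):
    """Column-wise mean of a non-empty list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def balanced_scores(actives, inactives, candidates, n_rounds=25, seed=0):
    """Average 'active' vote per candidate over repeated balanced training sets.

    Each round draws equal numbers of actives and inactives, fits a
    nearest-centroid rule (an assumed stand-in classifier), and records a
    vote of 1 when a candidate lies closer to the active centroid.
    """
    rng = random.Random(seed)
    k = min(len(actives), len(inactives))  # size of each class per round
    totals = [0] * len(candidates)
    for _ in range(n_rounds):
        act_c = centroid(rng.sample(actives, k))
        inact_c = centroid(rng.sample(inactives, k))
        for i, x in enumerate(candidates):
            if sq_dist(x, act_c) < sq_dist(x, inact_c):
                totals[i] += 1
    # Mean vote in [0, 1]; higher values suggest membership in the active class.
    return [t / n_rounds for t in totals]


# Toy descriptors (hypothetical data, for illustration only).
actives = [[1.0, 1.0], [0.9, 1.1], [1.1, 0.9]]
inactives = [[0.0, 0.0], [0.1, -0.1], [-0.1, 0.1],
             [0.0, 0.2], [0.2, 0.0], [-0.2, -0.1]]
scores = balanced_scores(actives, inactives, [[1.0, 1.0], [0.0, 0.0]])
```

Ranking compounds by such an averaged score and keeping the top of the list is one way to select a subset enriched in actives, which is the use the abstract describes for the class membership score.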
Pages: 533-545
Number of pages: 13