SIMILARITY SEARCHING ON CAS REGISTRY SUBSTANCES. 2. 2D STRUCTURAL SIMILARITY

Cited by: 33
Authors
FISANICK, W
LIPKUS, AH
RUSINKO, A
Affiliation
[1] Research Unit, Chemical Abstracts Service, 2540 Olentangy River Road, Columbus, Ohio 43210
Source
JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES | 1994 / Vol. 34 / No. 1
DOI
10.1021/ci00017a016
Chinese Library Classification code
O6 [Chemistry]
Discipline classification code
0703
Abstract
Chemical Abstracts Service (CAS) is exploring approaches for similarity ("fuzzy-match") searching on CAS Registry substances. Experimental software is being developed to identify, analyze, and perform similarity searches on various characteristics of an integrated set of 2D, 3D, and molecular property data for samples of Registry substances. Earlier results have indicated that searching on global molecular property features such as ionization potentials and van der Waals' volumes appears to detect "chemical" (isosteric) similarity and that searching on generic atom triangle geometric features provides a significant amount of shape and size similarity. More recently, we have been exploring possibilities for 2D global and local structural similarity on Science and Technology Network (STN) structure files. One possible approach involves one or more fragment-based searches using the existing STN 2D substructure screen fragments, optionally followed by existing connectivity-based (atom-by-atom) search on generic structure representations of candidates obtained in the fragment-based screening step. Fragment-based searches using various screen classes such as augmented atoms and bond sequences provide for different views of 2D structural similarity. Connectivity-based searching of generic 2D structures allows for a considerable amount of flexibility in a user's definition of similarity. This paper will discuss recent results of a comparison of the effectiveness of the various STN screen classes in fragment-based similarity searching using the Tanimoto coefficient and will illustrate the STN capabilities for connectivity-based similarity searching on an answer set.
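The fragment-based step described in the abstract can be illustrated with a minimal sketch, assuming each substance is reduced to a set of screen fragments and compared with the Tanimoto coefficient, c / (a + b - c), where a and b are the fragment counts of the two substances and c is the number they share. The fragment labels, the function names `tanimoto` and `rank_by_similarity`, and the 0.5 threshold are hypothetical and chosen only for illustration; the record does not give the actual STN screen dictionary, any fragment weighting, or the cutoffs used in the paper.

```python
from typing import Dict, FrozenSet, List, Tuple


def tanimoto(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Tanimoto coefficient of two fragment sets: shared / (|a| + |b| - shared)."""
    if not a and not b:
        return 0.0  # convention assumed here for two empty fragment sets
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def rank_by_similarity(
    query: FrozenSet[str],
    candidates: Dict[str, FrozenSet[str]],
    threshold: float = 0.5,  # hypothetical cutoff, not taken from the paper
) -> List[Tuple[str, float]]:
    """Score every candidate against the query and keep those at or above the threshold."""
    scored = [(name, tanimoto(query, frags)) for name, frags in candidates.items()]
    hits = [(name, score) for name, score in scored if score >= threshold]
    # Highest similarity first, as in a ranked answer set.
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical fragment sets; real screens would be augmented atoms, bond sequences, etc.
query = frozenset({"C-C", "C=O", "C-N", "ring6"})
candidates = {
    "A": frozenset({"C-C", "C=O", "C-N", "ring6"}),
    "B": frozenset({"C-C", "C=O", "ring6", "C-Cl"}),
    "C": frozenset({"C-S", "ring5"}),
}
answer_set = rank_by_similarity(query, candidates)
```

In a two-stage workflow of the kind the abstract outlines, an answer set like this one would then be passed to the optional connectivity-based (atom-by-atom) search, which this sketch does not attempt to model.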
Pages: 130-140
Page count: 11