Virtual exploration of the chemical universe up to 11 atoms of C, N, O, F: Assembly of 26.4 million structures (110.9 million stereoisomers) and analysis for new ring systems, stereochemistry, physicochemical properties, compound classes, and drug discovery

Cited by: 320
Authors
Fink, Tobias [1]
Reymond, Jean-Louis [1]
Affiliations
[1] Univ Bern, Dept Chem & Biochem, CH-3012 Bern, Switzerland
DOI
10.1021/ci600423u
Chinese Library Classification (CLC) number
R914 [Medicinal Chemistry]
Discipline classification code
100701
Abstract
All molecules of up to 11 atoms of C, N, O, and F possible under consideration of simple valency, chemical stability, and synthetic feasibility rules were generated and collected in a database (GDB). GDB contains 26.4 million molecules (110.9 million stereoisomers), including three- and four-membered rings and triple bonds. By comparison, only 63 857 compounds of up to 11 atoms were found in public databases (a combination of PubChem, ChemACX, ChemSCX, NCI open database, and the Merck Index). A total of 538 of the 1208 ring systems in GDB are currently unknown in the CAS Registry and Beilstein databases in any carbon/heteroatom/multiple-bond combination or as a substructure. Over 70% of GDB molecules are chiral. Because of their small size, all compounds obey Lipinski's bioavailability rule. A total of 13.2 million compounds also follow Congreve's "Rule of 3" for lead-likeness. A Kohonen map trained with autocorrelation descriptors organizes GDB according to compound classes and shows that leadlike compounds are most abundant in chiral regions of fused carbocycles and fused heterocycles. The projection of known compounds into this map indicates large uncharted areas of chemical space. The potential of GDB for drug discovery is illustrated by virtual screening for kinase inhibitors, G-protein coupled receptor ligands, and ion-channel modulators. The database is available from the author's Web page.
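The two property filters named in the abstract can be illustrated with a minimal sketch. It assumes the commonly quoted thresholds for Lipinski's rule (molecular weight ≤ 500, logP ≤ 5, at most 5 hydrogen-bond donors, at most 10 acceptors) and for Congreve's "Rule of 3" (molecular weight ≤ 300, logP ≤ 3, at most 3 donors, at most 3 acceptors); the exact criteria and descriptor software used for GDB may differ. The `Descriptors` class, the function names, and the example values are hypothetical and are not taken from the paper or the database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed properties for one molecule (hypothetical container)."""
    mol_weight: float       # molecular weight in g/mol
    log_p: float            # calculated octanol/water partition coefficient
    h_bond_donors: int      # number of hydrogen-bond donors
    h_bond_acceptors: int   # number of hydrogen-bond acceptors


def passes_lipinski(d: Descriptors) -> bool:
    """Strict form of Lipinski's rule: no violation is tolerated."""
    return (
        d.mol_weight <= 500
        and d.log_p <= 5
        and d.h_bond_donors <= 5
        and d.h_bond_acceptors <= 10
    )


def passes_rule_of_3(d: Descriptors) -> bool:
    """Core criteria of the 'Rule of 3' (assumed thresholds)."""
    return (
        d.mol_weight <= 300
        and d.log_p <= 3
        and d.h_bond_donors <= 3
        and d.h_bond_acceptors <= 3
    )


# Illustrative values only; these are not entries from GDB.
examples = {
    "small_amine": Descriptors(101.2, 0.4, 1, 1),
    "polyol": Descriptors(152.1, -2.5, 5, 5),
}

for name, d in examples.items():
    print(name, passes_lipinski(d), passes_rule_of_3(d))
```

In this sketch the hypothetical polyol satisfies the Lipinski criteria but fails the stricter donor and acceptor limits of the Rule of 3, which is the kind of distinction behind the abstract's statement that all GDB compounds obey the former while about half also follow the latter.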
Pages: 342-353
Page count: 12
Related papers
60 items in total
  • [1] CONFORMATIONAL-ANALYSIS .130. MM2 - HYDROCARBON FORCE-FIELD UTILIZING V1 AND V2 TORSIONAL TERMS
    ALLINGER, NL
    [J]. JOURNAL OF THE AMERICAN CHEMICAL SOCIETY, 1977, 99 (25) : 8127 - 8134
  • [2] MOLECULAR MECHANICS - THE MM3 FORCE-FIELD FOR HYDROCARBONS .1.
    ALLINGER, NL
    YUH, YH
    LII, JH
    [J]. JOURNAL OF THE AMERICAN CHEMICAL SOCIETY, 1989, 111 (23) : 8551 - 8566
  • [3] [Anonymous], 2005, JCHEM VERS 3 1
  • [4] [Anonymous], 2005, MARV VERS 4 0
  • [5] Small-molecule natural products: new structures, new activities
    Baker, DD
    Alvi, KA
    [J]. CURRENT OPINION IN BIOTECHNOLOGY, 2004, 15 (06) : 576 - 583
  • [6] Balaban A.T., 1976, CHEM APPL GRAPH THEO
  • [7] Locating biologically active compounds in medium-sized heterogeneous datasets by topological autocorrelation vectors: Dopamine and benzodiazepine agonists
    Bauknecht, H
    Zell, A
    Bayer, H
    Levi, P
    Wagener, M
    Sadowski, J
    Gasteiger, J
    [J]. JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 1996, 36 (06) : 1205 - 1213
  • [8] MOLGEN(+), A GENERATOR OF CONNECTIVITY ISOMERS AND STEREOISOMERS FOR MOLECULAR-STRUCTURE ELUCIDATION
    BENECKE, C
    GRUND, R
    HOHBERGER, R
    KERBER, A
    LAUE, R
    WIELAND, T
    [J]. ANALYTICA CHIMICA ACTA, 1995, 314 (03) : 141 - 147
  • [9] Hit and lead generation: Beyond high-throughput screening
    Bleicher, KH
    Böhm, HJ
    Müller, K
    Alanine, AI
    [J]. NATURE REVIEWS DRUG DISCOVERY, 2003, 2 (05) : 369 - 378
  • [10] Bohacek RS, 1996, MED RES REV, V16, P3, DOI 10.1002/(SICI)1098-1128(199601)16:1<3::AID-MED1>3.3.CO