Virtual exploration of the chemical universe up to 11 atoms of C, N, O, F: Assembly of 26.4 million structures (110.9 million stereoisomers) and analysis for new ring systems, stereochemistry, physicochemical properties, compound classes, and drug discovery

Cited by: 320
Authors
Fink, Tobias [1]
Reymond, Jean-Louis [1]
Affiliations
[1] Univ Bern, Dept Chem & Biochem, CH-3012 Bern, Switzerland
DOI
10.1021/ci600423u
Chinese Library Classification number
R914 [Medicinal chemistry]
Discipline classification code
100701
Abstract
All molecules of up to 11 atoms of C, N, O, and F possible under consideration of simple valency, chemical stability, and synthetic feasibility rules were generated and collected in a database (GDB). GDB contains 26.4 million molecules (110.9 million stereoisomers), including three- and four-membered rings and triple bonds. By comparison, only 63 857 compounds of up to 11 atoms were found in public databases (a combination of PubChem, ChemACX, ChemSCX, NCI open database, and the Merck Index). A total of 538 of the 1208 ring systems in GDB are currently unknown in the CAS Registry and Beilstein databases in any carbon/heteroatom/multiple-bond combination or as a substructure. Over 70% of GDB molecules are chiral. Because of their small size, all compounds obey Lipinski's bioavailability rule. A total of 13.2 million compounds also follow Congreve's "Rule of 3" for lead-likeness. A Kohonen map trained with autocorrelation descriptors organizes GDB according to compound classes and shows that leadlike compounds are most abundant in chiral regions of fused carbocycles and fused heterocycles. The projection of known compounds into this map indicates large uncharted areas of chemical space. The potential of GDB for drug discovery is illustrated by virtual screening for kinase inhibitors, G-protein coupled receptor ligands, and ion-channel modulators. The database is available from the author's Web page.
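The two property filters named in the abstract can be illustrated with a minimal sketch. It assumes the commonly cited thresholds: Lipinski's rule (molecular weight at most 500, logP at most 5, at most 5 hydrogen-bond donors, at most 10 acceptors) and Congreve's "Rule of 3" (molecular weight below 300, logP at most 3, at most 3 donors, at most 3 acceptors). The `Descriptors` container, the function names, and the example values are hypothetical and are not taken from the paper; the descriptors are assumed to be precomputed by some other tool, and the strict zero-violation reading of Lipinski's rule is a simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed properties for one molecule (hypothetical container)."""
    mol_weight: float   # molecular weight in Da
    logp: float         # calculated octanol/water partition coefficient
    h_donors: int       # number of hydrogen-bond donors
    h_acceptors: int    # number of hydrogen-bond acceptors


def passes_lipinski(d: Descriptors) -> bool:
    """Strict reading of Lipinski's rule: no threshold may be exceeded."""
    return (
        d.mol_weight <= 500
        and d.logp <= 5
        and d.h_donors <= 5
        and d.h_acceptors <= 10
    )


def passes_rule_of_3(d: Descriptors) -> bool:
    """Core "Rule of 3" criteria as commonly cited for lead-likeness."""
    return (
        d.mol_weight < 300
        and d.logp <= 3
        and d.h_donors <= 3
        and d.h_acceptors <= 3
    )


# Illustrative values only; these are not molecules from GDB.
small = Descriptors(mol_weight=150.0, logp=1.2, h_donors=1, h_acceptors=2)
larger = Descriptors(mol_weight=350.0, logp=2.5, h_donors=2, h_acceptors=4)

for name, d in (("small", small), ("larger", larger)):
    print(name, passes_lipinski(d), passes_rule_of_3(d))
```

Because a molecule of at most 11 heavy atoms is far below the Lipinski thresholds for size and hydrogen-bond counts, the first filter is nearly automatic for such compounds, while the tighter Rule of 3 limits still discriminate among them.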
Pages: 342-353
Page count: 12