ChemmineR: a compound mining framework for R

Cited by: 262
Authors
Cao, Yiqun [2]
Charisi, Anna [1]
Cheng, Li-Chang [2]
Jiang, Tao [2]
Girke, Thomas [1]
Affiliations
[1] Univ Calif Riverside, Dept Bot & Plant Sci, Riverside, CA 92521 USA
[2] Univ Calif Riverside, Dept Comp Sci & Engn, Riverside, CA 92521 USA
Funding
National Science Foundation (USA)
DOI
10.1093/bioinformatics/btn307
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Motivation: Software applications for structural similarity searching and clustering of small molecules play an important role in drug discovery and chemical genomics. Here, we present the first open-source compound mining framework for the popular statistical programming environment R. The integration with a powerful statistical environment maximizes the flexibility, expandability and programmability of the provided analysis functions.
Results: We discuss the algorithms and compound mining utilities provided by the R package ChemmineR. It contains functions for structural similarity searching, clustering of compound libraries with a wide spectrum of classification algorithms and various utilities for managing complex compound data. It also offers a wide range of visualization functions for compound clusters and chemical structures. The package is well integrated with the online ChemMine environment and allows bidirectional communications between the two services.
Pages: 1733-1734
Page count: 2
Related papers
14 in total
[1] ATOM PAIRS AS MOLECULAR-FEATURES IN STRUCTURE ACTIVITY STUDIES - DEFINITION AND APPLICATIONS [J]. CARHART, RE; SMITH, DH; VENKATARAGHAVAN, R. JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 1985, 25 (02): 64-73
[2] ChemDB: a public database of small molecules and related chemoinformatics resources [J]. Chen, J; Swamidass, SJ; Bruand, J; Baldi, P. BIOINFORMATICS, 2005, 21 (22): 4133-4139
[3] Performance of similarity measures in 2D fragment-based similarity searching: Comparison of structural descriptors and similarity coefficients [J]. Chen, X; Reynolds, CH. JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 2002, 42 (06): 1407-1414
[4] QSAR - How good is it in practice? Comparison of descriptor sets on an unbiased cross section of corporate data sets [J]. Gedeck, Peter; Rohde, Bernhard; Bartels, Christian. JOURNAL OF CHEMICAL INFORMATION AND MODELING, 2006, 46 (05): 1924-1936
[5] Gentleman R., 2005, Bioinformatics and computational biology solutions using R and Bioconductor, V1
[6] ChemMine. A compound mining database for chemical genomics [J]. Girke, T; Cheng, LC; Raikhel, N. PLANT PHYSIOLOGY, 2005, 138 (02): 573-577
[7] Blue Obelisk - Interoperability in chemical informatics [J]. Guha, Rajarshi; Howard, Michael T.; Hutchison, Geoffrey R.; Murray-Rust, Peter; Rzepa, Henry; Steinbeck, Christoph; Wegner, Jorg; Willighagen, Egon L. JOURNAL OF CHEMICAL INFORMATION AND MODELING, 2006, 46 (03): 991-998
[8] Analysis and display of the size dependence of chemical similarity coefficients [J]. Holliday, JD; Salim, N; Whittle, M; Willett, P. JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 2003, 43 (03): 819-828
[9] ZINC - A free database of commercially available compounds for virtual screening [J]. Irwin, JJ; Shoichet, BK. JOURNAL OF CHEMICAL INFORMATION AND MODELING, 2005, 45 (01): 177-182
[10] LANG DT, 2007, RGGOBI INTERFACE R G