Molecular similarity analysis and virtual screening by mapping of consensus positions in binary-transformed chemical descriptor spaces with variable dimensionality

Cited by: 34
Authors
Godden, JW
Furr, JR
Xue, L
Stahura, FL
Bajorath, J
Affiliations
[1] Albany Mol Res Inst Inc, Dept Comp Aided Drug Discovery, BRC, Bothell, WA 98011 USA
[2] Univ Washington, Dept Biol Struct, Seattle, WA 98195 USA
Source
JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES | 2004 / Vol. 44 / Issue 01
DOI
10.1021/ci0302963
Chinese Library Classification number
O6 [Chemistry]
Discipline classification code
0703
Abstract
A novel compound classification algorithm is described that operates in binary molecular descriptor spaces and groups active compounds together in a computationally highly efficient manner. The method involves the transformation of continuous descriptor value ranges into a binary format, subsequent definition of simplified descriptor spaces, identification of consensus positions of specific compound sets in these spaces, and iterative adjustments of the dimensionality of the descriptor spaces in order to discriminate compounds sharing similar activity from others. We term this approach Dynamic Mapping of Consensus positions (DMC) because the definition of reference spaces is tuned toward specific compound classes and their dimensionality is increased as the analysis proceeds. When applied to virtual screening, sets of bait compounds are added to a large screening database to identify hidden active molecules. In these calculations, molecules that map to consensus positions after elimination of most of the database compounds are considered hit candidates. In a benchmark study on five biological activity classes, hits for randomly assembled sets of bait molecules were correctly identified in 95% of virtual screening calculations in a source database containing more than 1.3 million molecules, thus providing a measure of the sensitivity of the DMC technique.
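The steps named in the abstract (binary transformation, consensus positions, stepwise growth of the descriptor space) can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several choices here are assumptions that the abstract does not specify: descriptors are binarized at the database median, a consensus position is the set of bits on which at least a given fraction of the bait compounds agree, and the dimensionality is raised by relaxing that fraction. The names `binarize`, `consensus_bits`, `dmc_screen`, `agreement_levels`, and `max_hits` are hypothetical.

```python
import numpy as np


def binarize(X, thresholds):
    """Map continuous descriptor values to bits (assumed rule: 1 if value >= threshold)."""
    return (X >= thresholds).astype(np.uint8)


def consensus_bits(bait_bits, min_agreement):
    """Return (descriptor indices, bit values) on which the baits agree.

    A descriptor is kept when the fraction of baits sharing the majority
    bit value is at least `min_agreement` (hypothetical criterion).
    """
    frac_ones = bait_bits.mean(axis=0)
    majority = (frac_ones >= 0.5).astype(np.uint8)
    agreement = np.where(majority == 1, frac_ones, 1.0 - frac_ones)
    idx = np.flatnonzero(agreement >= min_agreement)
    return idx, majority[idx]


def dmc_screen(db, baits, agreement_levels=(1.0, 0.9, 0.8), max_hits=10):
    """Return indices of database compounds that map to the bait consensus position.

    Each lower agreement level admits more descriptors, so the reference
    space gains dimensions as the screen proceeds and fewer compounds survive.
    """
    thresholds = np.median(db, axis=0)  # assumption: database medians as cut-offs
    db_bits = binarize(db, thresholds)
    bait_bits = binarize(baits, thresholds)
    survivors = np.arange(db.shape[0])
    for level in agreement_levels:
        idx, values = consensus_bits(bait_bits, level)
        # Keep only compounds whose bits match the consensus on every selected descriptor.
        match = np.all(db_bits[np.ix_(survivors, idx)] == values, axis=1)
        survivors = survivors[match]
        if survivors.size <= max_hits:
            break
    return survivors
```

In this sketch, `dmc_screen(db, baits)` takes two arrays with one row per compound and one column per descriptor. The compounds that remain after most of the database has been eliminated play the role of the hit candidates described in the abstract.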
Pages: 21-29
Page count: 9
Related papers
30 in total
[1]   Nonlinear mapping networks [J].
Agrafiotis, DK ;
Lobanov, VS .
JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 2000, 40 (06) :1356-1362
[2]   Multidimensional scaling of combinatorial libraries without explicit enumeration [J].
Agrafiotis, DK ;
Lobanov, VS .
JOURNAL OF COMPUTATIONAL CHEMISTRY, 2001, 22 (14) :1712-1722
[3]  
[Anonymous], MOE MOL OP ENV VERS
[4]   Integration of virtual and high-throughput screening [J].
Bajorath, J .
NATURE REVIEWS DRUG DISCOVERY, 2002, 1 (11) :882-894
[5]   Selected concepts and investigations in compound classification, molecular descriptor analysis, and virtual screening [J].
Bajorath, J .
JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 2001, 41 (02) :233-245
[6]   Use of structure-activity data to compare structure-based clustering methods and descriptors for use in compound selection [J].
Brown, RD ;
Martin, YC .
JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 1996, 36 (03) :572-584
[7]   Estimation of molecular similarity based on 4D-QSAR analysis: Formalism and validation [J].
Duca, JS ;
Hopfinger, AJ .
JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 2001, 41 (05) :1367-1387
[8]   Identification of biological activity profiles using substructural analysis and genetic algorithms [J].
Gillet, VJ ;
Willett, P ;
Bradshaw, J .
JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 1998, 38 (02) :165-179
[9]   Recursive median partitioning for virtual screening of large databases [J].
Godden, JW ;
Furr, JR ;
Bajorath, J .
JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 2003, 43 (01) :182-188
[10]   Chemical descriptors with distinct levels of information content and varying sensitivity to differences between selected compound databases identified by SE-DSE analysis [J].
Godden, JW ;
Bajorath, J .
JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 2002, 42 (01) :87-93