Molecular hashkeys: A novel method for molecular characterization and its application for predicting important pharmaceutical properties of molecules

Cited by: 49
Authors
Ghuloum, AM [1]
Sage, CR [1]
Jain, AN [1]
Affiliation
[1] MetaXen, South San Francisco, CA 94080, USA
DOI
10.1021/jm980527a
Chinese Library Classification (CLC) number
R914 [Medicinal Chemistry]
Subject classification code
100701
Abstract
We define a novel numerical molecular representation, called the molecular hashkey, that captures sufficient information about a molecule to predict pharmaceutically interesting properties directly from three-dimensional molecular structure. The molecular hashkey represents molecular surface properties as a linear array of pairwise surface-based comparisons of the target molecule against a common 'basis-set' of molecules. Hashkey-measured molecular similarity correlates well with direct methods of measuring molecular surface similarity. Using a simple machine-learning technique with the molecular hashkeys, we show that it is possible to accurately predict the octanol-water partition coefficient, log P. Using more sophisticated learning techniques, we show that an accurate model of intestinal absorption for a set of drugs can be constructed using the same hashkeys used in the aforementioned experiments. Once a set of molecular hashkeys is calculated, its use in the training and testing of property-based models is very fast. Further, the required amount of data for model construction is very small. Neural network-based hashkey models trained on data sets as small as 30 molecules yield statistically significant prediction of molecular properties. The lack of a requirement for large data sets lends itself well to the prediction of pharmaceutically relevant molecular parameters for which data generation is expensive and slow. Molecular hashkeys coupled with machine-learning techniques can yield models that predict key pharmacological aspects of biologically important molecules and should therefore be important in the design of effective therapeutics.
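The hashkey construction summarized in the abstract (a linear array of pairwise comparisons between a target molecule and a common basis set) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: molecules are stood in for by short numeric descriptor tuples, `toy_similarity` is a hypothetical Gaussian similarity rather than the surface-based comparison used in the paper, and `predict_nearest` is a simple nearest-neighbor stand-in, not the learning technique the authors used.

```python
import math
from typing import Callable, List, Sequence

# Hypothetical stand-in for a molecule: a short numeric descriptor.
# The paper compares 3D molecular surfaces; this is NOT that method.
Descriptor = Sequence[float]


def toy_similarity(a: Descriptor, b: Descriptor) -> float:
    """Placeholder similarity in (0, 1]; equals 1.0 for identical descriptors."""
    return math.exp(-math.dist(a, b) ** 2)


def hashkey(
    target: Descriptor,
    basis: Sequence[Descriptor],
    similarity: Callable[[Descriptor, Descriptor], float] = toy_similarity,
) -> List[float]:
    """Linear array of pairwise similarities of the target to each basis molecule."""
    return [similarity(target, b) for b in basis]


def hashkey_distance(k1: Sequence[float], k2: Sequence[float]) -> float:
    """Euclidean distance between two hashkeys built on the same basis set."""
    return math.dist(k1, k2)


def predict_nearest(
    query_key: Sequence[float],
    train_keys: Sequence[Sequence[float]],
    train_values: Sequence[float],
) -> float:
    """Return the property value of the training molecule with the closest hashkey."""
    best = min(
        range(len(train_keys)),
        key=lambda i: hashkey_distance(query_key, train_keys[i]),
    )
    return train_values[best]


if __name__ == "__main__":
    # Invented toy descriptors and property values, for illustration only.
    basis = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
    train = [(0.1, 0.0), (0.9, 0.1)]
    train_keys = [hashkey(m, basis) for m in train]
    train_values = [1.0, 2.0]
    query_key = hashkey((0.8, 0.0), basis)
    print(predict_nearest(query_key, train_keys, train_values))
```

Because every molecule is compared against the same basis set, each hashkey has the same fixed length, so it can be passed directly to a standard learning method once computed. This matches the abstract's point that training and testing are fast after the hashkeys exist.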
Pages: 1739-1748
Page count: 10