Predicting physiological concentrations of metabolites from their molecular structure

Cited by: 7
Authors
Liebermeister, W [1]
Affiliations
[1] Max Planck Inst Mol Genet, Kinet Modelling Grp, D-14195 Berlin, Germany
Keywords
metabolite concentration; QSPR; molecule structure; lasso regression
DOI
10.1089/cmb.2005.12.1307
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Physiological concentrations of metabolites can partly be explained by their molecular structure. We hypothesize that substances containing certain chemical groups show increased or decreased concentration in cells. We consider here, as chemical groups, local atomic configurations, describing an atom, its bonds, and its direct neighbor atoms. To test our hypothesis, we fitted a linear statistical model that relates experimentally determined logarithmic concentrations to feature vectors containing counts of the chemical groups. To determine chemical groups that have a clear effect on the concentration, we used a regularized (lasso) regression. In a dataset on 41 substances of central metabolism in different organisms, we found that the physiological concentrations are increased by the occurrence of amino and hydroxyl groups, while aldehydes, ketones, and phosphates show decreased concentrations. The model explains about 22% of the variance of the logarithmic mean concentrations.
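The kind of model the abstract describes (log concentration as a linear function of chemical-group counts, fitted with a lasso penalty) can be sketched as follows. This is a minimal illustration, not the author's implementation. The group names, the count matrix, and the coefficients used to generate the data are all invented for the example; only the signs (amino and hydroxyl positive, phosphate negative) mirror the qualitative finding stated above. The solver is a plain coordinate-descent lasso minimizing (1/2n)·||y − b0 − Xw||² + lam·||w||₁.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t; this is what makes lasso coefficients sparse."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_fit(X, y, lam, n_iter=200):
    """Coordinate-descent lasso with an unpenalized intercept.

    X: list of rows of group counts, y: list of log concentrations.
    Returns (intercept, weights).
    """
    n, p = len(X), len(X[0])
    # Center features and target so the intercept drops out of the updates.
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]

    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0   # correlation of feature j with the partial residual
            norm = 0.0  # squared norm of feature j
            for i in range(n):
                partial = yc[i] - sum(Xc[i][k] * w[k] for k in range(p) if k != j)
                rho += Xc[i][j] * partial
                norm += Xc[i][j] ** 2
            w[j] = soft_threshold(rho / n, lam) / (norm / n) if norm > 0 else 0.0

    b0 = y_mean - sum(w[j] * x_mean[j] for j in range(p))
    return b0, w


# Hypothetical feature names and counts (NOT data from the paper).
groups = ["amino", "hydroxyl", "phosphate", "other_group"]
X = [
    [0, 0, 0, 0],
    [1, 0, 0, 1],
    [0, 1, 0, 1],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 1, 0],
    [1, 1, 1, 1],
]
# Invented generating coefficients; "other_group" has no effect by construction.
true_w = [0.5, 0.3, -0.6, 0.0]
y = [-3.0 + sum(c * t for c, t in zip(row, true_w)) for row in X]  # synthetic log10 conc.

b0, w = lasso_fit(X, y, lam=0.02)
for name, coef in zip(groups, w):
    print(f"{name:12s} {coef:+.3f}")

# A very large penalty removes every group effect, leaving only the mean.
b0_big, w_big = lasso_fit(X, y, lam=100.0)
```

With a small penalty the fitted coefficients keep the signs of the generating model but are shrunk toward zero, and the group with no effect is set exactly to zero. That is the property that lets a lasso fit single out the groups with a clear effect.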
Pages: 1307-1315
Number of pages: 9