Prediction of oxidoreductase-catalyzed reactions based on atomic properties of metabolites

Cited by: 16
Authors
Mu, Fangping
Unkefer, Pat J.
Unkefer, Clifford J.
Hlavacek, William S. [1]
Affiliations
[1] Los Alamos Natl Lab, Theoret Biol & Biophys Grp, Div Theoret, Los Alamos, NM 87545 USA
[2] Los Alamos Natl Lab, Biosci Div, Los Alamos, NM 87545 USA
[3] Los Alamos Natl Lab, Ctr Nonlinear Studies, Los Alamos, NM 87545 USA
DOI
10.1093/bioinformatics/btl535
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Our knowledge of metabolism is far from complete, and the gaps in our knowledge are being revealed by metabolomic detection of small molecules not previously known to exist in cells. An important challenge is to determine the reactions in which these compounds participate, which can lead to the identification of gene products responsible for novel metabolic pathways. To address this challenge, we investigate how machine learning can be used to predict potential substrates and products of oxidoreductase-catalyzed reactions.

Results: We examined 1956 oxidation/reduction reactions in the KEGG database. The vast majority of these reactions (1626) can be divided into 12 subclasses, each of which is marked by a particular type of functional group transformation. For a given transformation, the local structures of reaction centers in substrates and products can be characterized by patterns. These patterns are not unique to reactants but are widely distributed among KEGG metabolites. To distinguish reactants from non-reactants, we trained classifiers (linear-kernel Support Vector Machines) using negative and positive examples. The input to a classifier is a set of atomic features that can be determined from the 2D chemical structure of a compound. Depending on the subclass of reaction, the accuracy of prediction for positives (negatives) is 64 to 93% (44 to 92%) when asking if a compound is a substrate and 71 to 98% (50 to 92%) when asking if a compound is a product. Sensitivity analysis reveals that this performance is robust to variations of the training data. Our results suggest that metabolic connectivity can be predicted with reasonable accuracy from the presence or absence of local structural motifs in compounds and their readily calculated atomic features.

Availability: Classifiers reported here can be used freely for noncommercial purposes via a Java program available upon request.

Contact: wish@lanl.gov

Supplementary information: Supplementary data are available at Bioinformatics online.
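The abstract reports accuracy separately for positives and for negatives. As a minimal sketch of how such per-class figures are usually computed, the following assumes labels of +1 (reactant) and -1 (non-reactant). The function name `per_class_accuracy` and the example labels are hypothetical and are not taken from the paper or its Java program.

```python
from typing import Sequence, Tuple


def per_class_accuracy(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (accuracy on positives, accuracy on negatives).

    Assumption: labels are +1 for reactants and -1 for non-reactants.
    """
    # Collect the predictions made for each true class.
    pos = [p for t, p in zip(y_true, y_pred) if t == 1]
    neg = [p for t, p in zip(y_true, y_pred) if t == -1]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Fraction of true positives (negatives) that were predicted correctly.
    acc_pos = sum(1 for p in pos if p == 1) / len(pos)
    acc_neg = sum(1 for p in neg if p == -1) / len(neg)
    return acc_pos, acc_neg


# Hypothetical labels for six compounds that match one reaction-center pattern.
y_true = [1, 1, 1, 1, -1, -1]
y_pred = [1, 1, 1, -1, -1, 1]
acc_pos, acc_neg = per_class_accuracy(y_true, y_pred)
print(f"positives: {acc_pos:.2f}, negatives: {acc_neg:.2f}")
# prints "positives: 0.75, negatives: 0.50"
```

Reporting the two values separately matters when negatives greatly outnumber positives, since a single overall accuracy can then hide poor performance on the smaller class.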
Pages: 3082-3088
Number of pages: 7