Building blocks for automated elucidation of metabolites: Machine learning methods for NMR prediction

Cited by: 104
Authors
Kuhn, Stefan [1, 2]
Egert, Bjoern [2]
Neumann, Steffen [2]
Steinbeck, Christoph [1, 3]
Affiliations
[1] CUBIC, D-50674 Cologne, Germany
[2] Leibniz Inst Plant Biochem, Dept Stress & Dev Biol, D-06120 Halle, Germany
[3] EBI, Cambridge CB10 1SD, England
DOI
10.1186/1471-2105-9-400
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Background: Current efforts in metabolomics, such as the Human Metabolome Project, collect structures of biological metabolites as well as data for their characterisation, such as spectra for identifying substances and measurements of their concentration. Still, only a fraction of existing metabolites and their spectral fingerprints are known. Computer-Assisted Structure Elucidation (CASE) of biological metabolites will be an important tool to address this lack of knowledge. Modules that predict spectra for hypothetical structures are indispensable for CASE. This paper evaluates different statistical and machine learning methods for predicting proton NMR spectra, based on data from our open database NMRShiftDB. Results: A mean absolute error of 0.18 ppm was achieved for the prediction of proton NMR shifts ranging from 0 to 11 ppm. Random forest, J48 decision tree and support vector machines achieved similar overall errors. HOSE codes, a notably simple method, achieved a comparatively good result of 0.17 ppm mean absolute error. Conclusion: The NMR prediction methods applied in the course of this work delivered precise predictions, which can serve as a building block for Computer-Assisted Structure Elucidation of biological metabolites.
Pages: 19
Related papers
29 in total
[1]  
AHA DW, 1991, MACH LEARN, V6, P37, DOI 10.1007/BF00153759
[2]   Prediction of 1H NMR chemical shifts using neural networks [J].
Aires-de-Sousa, J ;
Hemmer, MC ;
Gasteiger, J .
ANALYTICAL CHEMISTRY, 2002, 74 (01) :80-90
[3]  
[Anonymous], 2006, R LANG ENV STAT COMP
[4]  
BEIERLE C, 2003, METHODEN WISSENSBASI
[5]   The impact of available experimental data on the prediction of 1H NMR chemical shifts by neural networks [J].
Binev, Y ;
Corvo, M ;
Aires-De-Sousa, J .
JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 2004, 44 (03) :946-949
[6]   Structure-based predictions of 1H NMR chemical shifts using feed-forward neural networks [J].
Binev, Y ;
Aires-De-Sousa, J .
JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 2004, 44 (03) :940-945
[7]  
BLINOV K, 2008, J CHEM INFORM MODELI
[8]  
BOERNER J, 2007, ANAL BIOCHEM, V367, P143
[9]   Random forests [J].
Breiman, L .
MACHINE LEARNING, 2001, 45 (01) :5-32
[10]   Random forests [J].
Breiman, L .
MACHINE LEARNING, 2001, 45 (01) :5-32