Peak Aggregation as an Innovative Strategy for Improving the Predictive Power of LC-MS Metabolomic Profiles

Cited by: 10
Authors
Fernandez-Albert, Francesc [1, 2, 3]
Llorach, Rafael [2]
Andres-Lacueva, Cristina [2]
Perera-Lluna, Alexandre [1, 3]
Affiliations
[1] Univ Politecn Cataluna, ESAII Dept, B2SLab, Barcelona, Spain
[2] Univ Barcelona, Sch Pharm, INGENIO CONSOLIDER Program, Dept Nutr & Food Sci,Biomarkers & Nutrimetabol La, Barcelona, Spain
[3] CIBER Bioengn Biomat & Nanomed CIBER BBN, Barcelona, Spain
Keywords
MASS-SPECTROMETRY; INFORMATION; SPECTRA
DOI
10.1021/ac403702p
Chinese Library Classification
O65 [Analytical Chemistry]
Discipline classification code
070302 [Analytical Chemistry]
Abstract
Liquid chromatography-mass spectrometry (LC-MS)-based metabolomic datasets consist of different features, including (de)protonated molecules, fragments, adducts, and isotopes, that may be highly correlated and therefore highly collinear. Several sources of these high correlation patterns have been described for metabolomic datasets; among them, the high correlation between features that come from the same metabolite deserves particular attention. It is well known that soft ionization methods (such as electrospray) produce several mass features from a single compound (i.e., a metabolite spectrum). Typically, the statistical methods used in metabolomics treat spectral peaks as variables. However, it has been reported that high collinearity between variables can be responsible for high uncertainty in the predictors of a regression. In this context, this technical note proposes a new strategy based on so-called peak aggregation methods (NMF Reduction, PCA Decomposition, Maximum Peak, and Spectrum Mean), which exploit the collinearity among variables and at the same time resolve the problems it causes. A set of real samples obtained after a human nutritional intervention with placebo or polyphenol-rich beverages was used to test this methodology. The results showed that applying any peak aggregation method (especially NMF and PCA) improves the statistical prediction of class membership regardless of the nature of the classifier (linear PLS-DA or nonlinear SVM). Overall, this new approach reduced the dimensionality of the data and, in addition, significantly increased its overall predictive power.
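The four aggregation methods named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `aggregate_peaks` and `rank1_nmf`, the `groups` argument (column indices of the peaks already assigned to each metabolite spectrum), and the exact definition of each method (for example, Maximum Peak taken here as the peak with the highest mean intensity) are assumptions made for illustration, and NumPy is assumed to be available.

```python
import numpy as np


def rank1_nmf(block, n_iter=200, eps=1e-12):
    """Rank-1 NMF (block ~ w @ h) by multiplicative updates; returns one score per sample.

    Assumes `block` is non-negative, as peak intensities are.
    """
    w = block.mean(axis=1, keepdims=True) + eps  # (samples, 1), deterministic start
    h = np.ones((1, block.shape[1]))             # (1, peaks)
    for _ in range(n_iter):
        h *= (w.T @ block) / (w.T @ w @ h + eps)
        w *= (block @ h.T) / (w @ (h @ h.T) + eps)
    # Move the scale of h into w so the scores are comparable across groups.
    return (w * np.linalg.norm(h)).ravel()


def aggregate_peaks(X, groups, method="mean"):
    """Collapse each group of peaks (columns of X) into a single variable.

    X      : (samples, peaks) intensity matrix.
    groups : list of column-index lists, one per putative metabolite spectrum
             (hypothetical input; the grouping step itself is not shown).
    method : "mean", "max", "pca" or "nmf".
    """
    columns = []
    for cols in groups:
        block = X[:, cols]
        if method == "mean":
            # Spectrum Mean: average intensity over the peaks of the spectrum.
            columns.append(block.mean(axis=1))
        elif method == "max":
            # Maximum Peak: keep the peak with the highest mean intensity.
            columns.append(block[:, np.argmax(block.mean(axis=0))])
        elif method == "pca":
            # PCA Decomposition: scores on the first principal component.
            centered = block - block.mean(axis=0)
            u, s, _ = np.linalg.svd(centered, full_matrices=False)
            columns.append(u[:, 0] * s[0])
        elif method == "nmf":
            # NMF Reduction: sample coefficients of a rank-1 factorization.
            columns.append(rank1_nmf(block))
        else:
            raise ValueError(f"unknown method: {method}")
    return np.column_stack(columns)


# Toy data: peaks 0 and 1 belong to one metabolite, peak 2 to another.
X = np.array([[1.0, 2.0, 10.0],
              [2.0, 4.0, 20.0],
              [3.0, 6.0, 30.0]])
groups = [[0, 1], [2]]
reduced = aggregate_peaks(X, groups, method="mean")  # shape (3, 2)
```

The reduced matrix has one column per metabolite spectrum instead of one per peak, so it could be passed to a classifier such as PLS-DA or an SVM in place of the original peak table.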
Pages: 2320-2325
Number of pages: 6