Comprehensive evaluation of untargeted metabolomics data processing software in feature detection, quantification and discriminating marker selection

Cited by: 103
Authors
Li, Zhucui [1, 2, 3]
Lu, Yan [1, 2, 4]
Guo, Yufeng [3]
Cao, Haijie [5]
Wang, Qinhong [3]
Shui, Wenqing [2, 4]
Affiliations
[1] Univ Chinese Acad Sci, Beijing 100049, Peoples R China
[2] ShanghaiTech Univ, iHuman Inst, Shanghai 201210, Peoples R China
[3] Chinese Acad Sci, Tianjin Inst Ind Biotechnol, Tianjin 300308, Peoples R China
[4] ShanghaiTech Univ, Sch Life Sci & Technol, Shanghai 201210, Peoples R China
[5] Nankai Univ, Coll Pharm, Tianjin 300071, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Untargeted metabolomics; Data processing software; Feature detection; Feature quantification; Discriminating marker selection; SPECTROMETRY-BASED METABOLOMICS; MASS-SPECTROMETRY; MISSING VALUES; DATA SET; DISCOVERY; PERFORMANCE; METABOLISM; WORKFLOW; PLATFORM; URINE;
DOI
10.1016/j.aca.2018.05.001
Chinese Library Classification
O65 [Analytical Chemistry]
Discipline classification code
070302 [Analytical Chemistry]
Abstract
Data analysis represents a key challenge for untargeted metabolomics studies, and it commonly requires extensive processing of thousands of metabolite peaks in raw high-resolution MS data. Although a number of software packages have been developed to facilitate untargeted data processing, their capabilities in feature detection, quantification and marker selection have not been comprehensively scrutinized using a well-defined benchmark sample set. In this study, we acquired a benchmark dataset from standard mixtures consisting of 1100 compounds with specified concentration ratios, including 130 compounds with significant variation in concentration. The five software packages evaluated here (MS-DIAL, MZmine 2, XCMS, MarkerView, and Compound Discoverer) showed similar performance in detecting true features derived from compounds in the mixtures. However, significant differences between the packages were observed in relative quantification of true features in the benchmark dataset. MZmine 2 outperformed the other packages in quantification accuracy, and it reported the most true discriminating markers together with the fewest false markers. Furthermore, we assessed the selection of discriminating markers by the different packages using both the benchmark dataset and a real-case metabolomics dataset, and on that basis propose the combined use of two software packages to increase confidence in biomarker identification. Our findings from this comprehensive evaluation of untargeted metabolomics software should help guide future improvements of these widely used bioinformatics tools and enable users to properly interpret their metabolomics results. © 2018 Elsevier B.V. All rights reserved.
Pages: 50-57
Page count: 8