Proposed minimum reporting standards for data analysis in metabolomics

Cited by: 314
Authors
Goodacre, Royston
Broadhurst, David
Smilde, Age K.
Kristal, Bruce S.
Baker, J. David
Beger, Richard
Bessant, Conrad
Connor, Susan
Capuani, Giorgio
Craig, Andrew
Ebbels, Tim
Kell, Douglas B.
Manetti, Cesare
Newton, Jack
Paternostro, Giovanni
Somorjai, Ray
Sjostrom, Michael
Trygg, Johan
Wulfert, Florian
Affiliations
[1] Univ Manchester, Sch Chem, Manchester M1 7ND, Lancs, England
[2] Univ Manchester, Manchester Interdisciplinary Bioctr, Manchester M1 7ND, Lancs, England
[3] Univ Amsterdam, Swammerdam Inst Life Sci, NL-1018 WV Amsterdam, Netherlands
[4] TNO Qual Life, NL-3700 AJ Zeist, Netherlands
[5] Brigham & Womens Hosp, Dept Neurosurg, Boston, MA 02115 USA
[6] Pfizer Inc, Ann Arbor, MI USA
[7] Natl Ctr Toxicol Res, Div Syst Toxicol, Jefferson, AR 72079 USA
[8] Cranfield Univ, Bedford MK45 4DT, England
[9] GlaxoSmithKline, Safety Assessment, Ware SG12 0DP, Herts, England
[10] Univ Roma La Sapienza, Dipartimento Chim, I-00185 Rome, Italy
[11] BlueGnome Ltd, Cambridge CB2 5LD, England
[12] Univ London Imperial Coll Sci Technol & Med, Dept Biomol Med, London SW7 2AZ, England
[13] Chenomx Inc, Edmonton, AB T5K 2J1, Canada
[14] Burnham Inst Med Res, La Jolla, CA 92037 USA
[15] Natl Res Council Canada, Inst Biodiagnost, Winnipeg, MB R3B 1Y6, Canada
[16] Umea Univ, Dept Chem, Chemometr Res Grp, S-90187 Umea, Sweden
[17] Univ Nottingham, Div Food Sci, Loughborough LE12 5RD, Leics, England
Funding
UK Biotechnology and Biological Sciences Research Council (BBSRC); UK Medical Research Council (MRC)
Keywords
chemometrics; multivariate; megavariate; unsupervised learning; supervised learning; informatics; bioinformatics; statistics; biostatistics; machine learning; statistical learning
DOI
10.1007/s11306-007-0081-3
Chinese Library Classification (CLC) number
R5 [Internal Medicine]
Discipline classification code
1002; 100201
Abstract
The goal of this group is to define the reporting requirements associated with the statistical analysis (including univariate, multivariate, informatics, machine learning, etc.) of metabolite data with respect to other measured/collected experimental data (often called metadata). These definitions will embrace as many aspects of a complete metabolomics study as possible at this time. In chronological order this will include: Experimental Design, both in terms of sample collection/matching and data acquisition scheduling of samples through whichever spectroscopic technology is used; Deconvolution (if required); Pre-processing, for example, data cleaning, outlier detection, row/column scaling, or other transformations; Definition and parameterization of subsequent visualizations and Statistical/Machine Learning Methods applied to the dataset; if required, a clear definition of the Model Validation Scheme used (including how data are split into training/validation/test sets); a formal indication of whether the data analysis has been Independently Tested (either by experimental reproduction or by a blind hold-out test set); and, finally, data interpretation and the visual representations and hypotheses obtained from the data analyses.
Pages: 231-241
Page count: 11