An automated method for peak detection and matching in large gas chromatography-mass spectrometry data sets

Cited by: 46
Authors
Dixon, Sarah J.
Brereton, Richard G.
Soini, Helena A.
Novotny, Milos V.
Penn, Dustin J.
Affiliations
[1] Univ Bristol, Sch Chem, Ctr Chemometr, Bristol BS8 1TS, Avon, England
[2] Indiana Univ, Dept Chem, Bloomington, IN 47405 USA
[3] Indiana Univ, Inst Pheromone Res, Bloomington, IN 47405 USA
[4] Austrian Acad Sci, Konrad Lorenz Inst Ethol, A-1160 Vienna, Austria
Keywords
GC-MS; peak detection; peak matching; metabolomics
DOI
10.1002/cem.1005
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
A new approach for peak detection and matching has been developed and applied to two data sets. The first consisted of the gas chromatography-mass spectrometry (GC-MS) chromatograms of 965 human sweat samples obtained from a population of 197 individuals. The second data set contained 500 synthetic chromatograms and was generated to validate the peak detection and matching methods. The size of both data sets (around 500 000 detectable peaks over all chromatograms in data set 1, and around 100 000 in data set 2) would make it infeasible to check manually whether peaks are matched. In the method described, the first procedure involves pre-processing the data before carrying out the second procedure of peak detection. The final procedure of peak matching consists of three stages: (a) finding potential target peaks in the full data set over all chromatograms; (b) matching peaks in the chromatograms to these targets to form clusters of spectra associated with each target; (c) merging targets where appropriate. Peak detection and matching were applied to both data sets, and the importance of stage (c) of peak matching is described. In addition to the analysis of the synthetic chromatograms, the method was also validated by shuffling the original order of the sweat chromatograms and performing the methods independently on the newly shuffled data. Copyright © 2007 John Wiley & Sons, Ltd.
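The three matching stages named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not state how peaks are compared, so the retention-time tolerance, the cosine similarity between spectra, the greedy target selection, and all function and parameter names below are assumptions chosen only for illustration.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length spectra (assumed criterion)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def similar(p, q, rt_tol, min_sim):
    """Two peaks agree if retention times and spectra are both close."""
    return (abs(p["rt"] - q["rt"]) <= rt_tol
            and cosine(p["spectrum"], q["spectrum"]) >= min_sim)

def find_targets(chromatograms, rt_tol, min_sim):
    """Stage (a): a peak unlike every existing target becomes a new target."""
    targets = []
    for peaks in chromatograms:
        for p in peaks:
            if not any(similar(p, t, rt_tol, min_sim) for t in targets):
                targets.append({"rt": p["rt"], "spectrum": list(p["spectrum"])})
    return targets

def match_peaks(chromatograms, targets, rt_tol, min_sim):
    """Stage (b): assign each peak to its most similar target, forming clusters."""
    clusters = [[] for _ in targets]
    for c, peaks in enumerate(chromatograms):
        for p in peaks:
            hits = [i for i, t in enumerate(targets) if similar(p, t, rt_tol, min_sim)]
            if hits:
                best = max(hits, key=lambda i: cosine(p["spectrum"], targets[i]["spectrum"]))
                clusters[best].append((c, p))
    return clusters

def merge_targets(targets, clusters, rt_tol, min_sim):
    """Stage (c): merge targets that agree under a (here looser) tolerance."""
    merged_targets, merged_clusters = [], []
    for t, cl in zip(targets, clusters):
        for i, m in enumerate(merged_targets):
            if similar(t, m, rt_tol, min_sim):
                merged_clusters[i].extend(cl)
                break
        else:
            merged_targets.append(t)
            merged_clusters.append(list(cl))
    return merged_targets, merged_clusters

# Toy data: two chromatograms, two compounds; the second compound drifts in time.
chromatograms = [
    [{"rt": 5.00, "spectrum": [1.0, 0.0, 0.0]}, {"rt": 8.0, "spectrum": [0.0, 1.0, 0.20]}],
    [{"rt": 5.02, "spectrum": [0.9, 0.1, 0.0]}, {"rt": 8.3, "spectrum": [0.0, 1.0, 0.25]}],
]
targets = find_targets(chromatograms, rt_tol=0.1, min_sim=0.9)
clusters = match_peaks(chromatograms, targets, rt_tol=0.1, min_sim=0.9)
merged_targets, merged_clusters = merge_targets(targets, clusters, rt_tol=0.5, min_sim=0.9)
print(len(targets), len(merged_targets))
```

In this toy case the drifting compound is split into two targets in stage (a) and rejoined in stage (c), which is one way the merging stage can matter; the thresholds used here are arbitrary.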
Pages: 325-340
Number of pages: 16