TargetSearch - a Bioconductor package for the efficient preprocessing of GC-MS metabolite profiling data

Cited by: 172
Authors
Cuadros-Inostroza, Alvaro [1, 2]
Caldana, Camila [1]
Redestig, Henning [1, 3]
Kusano, Miyako [3]
Lisec, Jan [1]
Pena-Cortes, Hugo [2]
Willmitzer, Lothar [1]
Hannah, Matthew A. [1, 4]
Affiliations
[1] Max Planck Inst Mol Plant Physiol, D-14476 Potsdam, Germany
[2] Univ Tecn Federico Santa Maria, Ctr Biotecnol, Valparaiso, Chile
[3] RIKEN, Plant Sci Ctr, Tsurumi Ku, Kanagawa 2300045, Japan
[4] Bayer BioSci NV, B-9052 Ghent, Belgium
Source
BMC BIOINFORMATICS | 2009 / Vol. 10
Keywords
MASS-SPECTROMETRY DATA; GAS-CHROMATOGRAPHY; ARABIDOPSIS-THALIANA; SYSTEMS BIOLOGY; IDENTIFICATION; METABOLOMICS; METABONOMICS; EXTRACTION; ALGORITHM; SPECTRA;
DOI
10.1186/1471-2105-10-428
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010; 081704;
Abstract
Background: Metabolite profiling, the simultaneous quantification of multiple metabolites in an experiment, is becoming increasingly popular, particularly with the rise of systems-level biology. The workhorse in this field is gas-chromatography hyphenated with mass spectrometry (GC-MS). The high-throughput of this technology coupled with a demand for large experiments has led to data pre-processing, i.e. the quantification of metabolites across samples, becoming a major bottleneck. Existing software has several limitations, including restricted maximum sample size, systematic errors and low flexibility. However, the biggest limitation is that the resulting data usually require extensive hand-curation, which is subjective and can typically take several days to weeks.
Results: We introduce the TargetSearch package, an open source tool which is a flexible and accurate method for pre-processing even very large numbers of GC-MS samples within hours. We developed a novel strategy to iteratively correct and update retention time indices for searching and identifying metabolites. The package is written in the R programming language with computationally intensive functions written in C for speed and performance. The package includes a graphical user interface to allow easy use by those unfamiliar with R.
Conclusions: TargetSearch allows fast and accurate data pre-processing for GC-MS experiments and overcomes the sample number limitations and manual curation requirements of existing software. We validate our method by carrying out an analysis against both a set of known chemical standard mixtures and of a biological experiment. In addition we demonstrate its capabilities and speed by comparing it with other GC-MS pre-processing tools. We believe this package will greatly ease current bottlenecks and facilitate the analysis of metabolic profiling data.
Pages: 12