Data integration and network reconstruction with ∼omics data using Random Forest regression in potato

Times cited: 44
Authors
Acharjee, Animesh [1,2]
Kloosterman, Bjorn [1]
de Vos, Ric C. H. [3,4]
Werij, Jeroen S. [1,3]
Bachem, Christian W. B. [1]
Visser, Richard G. F. [1,3]
Maliepaard, Chris [1]
Affiliations
[1] Univ Wageningen & Res Ctr, Wageningen UR Plant Breeding, NL-6700 AJ Wageningen, Netherlands
[2] Grad Sch Expt Plant Sci, Wageningen, Netherlands
[3] Ctr BioSyst Genom, NL-6700 AA Wageningen, Netherlands
[4] Plant Res Int, NL-6700 AA Wageningen, Netherlands
Keywords
Data integration; Random Forest; Network reconstruction; Tuber flesh color; Potato; SYSTEMS BIOLOGY; METABOLOMICS; QTL; BIOMARKERS; SELECTION; SET
DOI
10.1016/j.aca.2011.03.050
Chinese Library Classification number
O65 [Analytical Chemistry]
Discipline classification codes
070302; 081704
Abstract
In the post-genomic era, high-throughput technologies have led to data collection in fields like transcriptomics, metabolomics and proteomics and, as a result, large amounts of data have become available. However, the integration of these ∼omics data sets in relation to phenotypic traits is still problematic in order to advance crop breeding. We have obtained population-wide gene expression and metabolite (LC-MS) data from tubers of a diploid potato population and present a novel approach to study the various ∼omics datasets to allow the construction of networks integrating gene expression, metabolites and phenotypic traits. We used Random Forest regression to select subsets of the metabolites and transcripts which show association with potato tuber flesh color and enzymatic discoloration. Network reconstruction has led to the integration of known and uncharacterized metabolites with genes associated with the carotenoid biosynthesis pathway. We show that this approach enables the construction of meaningful networks with regard to known and unknown components and metabolite pathways. Crown Copyright (C) 2011 Published by Elsevier B.V. All rights reserved.
Pages: 56-63
Number of pages: 8