A Web-based and Grid-enabled dChip version for the analysis of large sets of gene expression data

Cited by: 4
Authors
Corradi, Luca [1]
Fato, Marco [1]
Porro, Ivan [1]
Scaglione, Silvia [1]
Torterolo, Livia [1]
Affiliations
[1] Univ Genoa, Comp Sci Syst & Commun Dept, I-16100 Genoa, Italy
Keywords
DOI
10.1186/1471-2105-9-480
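Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010 [Biochemistry and Molecular Biology]; 081704 [Applied Chemistry];
Abstract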
Background: Microarray techniques are among the main methods used to investigate thousands of gene expression profiles, shedding light on the complex biological processes responsible for serious diseases; they have great scientific impact and a wide range of applications. Several standalone applications have been developed to analyze microarray data. Two of the best-known free analysis packages are the R-based Bioconductor and dChip. The part of the dChip software that computes and analyzes gene expression has been modified to run on both cluster environments (supercomputers) and Grid infrastructures (distributed computing). This work does not aim to replace existing tools; rather, it gives researchers a way to analyze large datasets without hardware or software constraints.
Results: An application that computes and analyzes gene expression on large datasets has been developed using the algorithms provided by dChip. Tests were carried out to validate the results and to compare performance across infrastructures. Validation tests used a small dataset comparing HUVEC (Human Umbilical Vein Endothelial Cells) and fibroblasts, derived from the same donors and treated with IFN-alpha. Performance tests, intended only to compare the different environments, used a large dataset of about 1000 samples from breast cancer patients.
Conclusion: A Grid-enabled software application for the analysis of large microarray datasets has been proposed. The dChip software has been ported to the Linux platform and modified, using appropriate parallelization strategies, to run on both cluster environments and Grid infrastructures. The added value of Grid technologies is the ability to exploit both computational and data Grid infrastructures to analyze large sets of distributed data. The software has been validated, and its performance on cluster and Grid environments has been compared, with reasonably good scalability results.
Pages: 15
Related papers
12 in total
[1]
GEMMA - A Grid environment for microarray management and analysis in bone marrow stem cells experiments [J].
Beltrame, Francesco ;
Papadimitropoulos, Adam ;
Porro, Ivan ;
Scaglione, Silvia ;
Schenone, Andrea ;
Torterolo, Livia ;
Viti, Federica .
FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF GRID COMPUTING THEORY METHODS AND APPLICATIONS, 2007, 23 (03) :382-390
[2]
Bioconductor: open software development for computational biology and bioinformatics [J].
Gentleman, RC ;
Carey, VJ ;
Bates, DM ;
Bolstad, B ;
Dettling, M ;
Dudoit, S ;
Ellis, B ;
Gautier, L ;
Ge, YC ;
Gentry, J ;
Hornik, K ;
Hothorn, T ;
Huber, W ;
Iacus, S ;
Irizarry, R ;
Leisch, F ;
Li, C ;
Maechler, M ;
Rossini, AJ ;
Sawitzki, G ;
Smith, C ;
Smyth, G ;
Tierney, L ;
Yang, JYH ;
Zhang, JH .
GENOME BIOLOGY, 2004, 5 (10)
[3]
Exploration, normalization, and summaries of high density oligonucleotide array probe level data [J].
Irizarry, RA ;
Hobbs, B ;
Collin, F ;
Beazer-Barclay, YD ;
Antonellis, KJ ;
Scherf, U ;
Speed, TP .
BIOSTATISTICS, 2003, 4 (02) :249-264
[4]
Model-based analysis of oligonucleotide arrays: Expression index computation and outlier detection [J].
Li, C ;
Wong, WH .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2001, 98 (01) :31-36
[5]
Li C, 2003, ANAL GENE EXPRESSION, P120
[6]
How to decide? Different methods of calculating gene expression from short oligonucleotide array data will give different results [J].
Millenaar, FF ;
Okyere, J ;
May, ST ;
van Zanten, M ;
Voesenek, LACJ ;
Peeters, AJM .
BMC BIOINFORMATICS, 2006, 7 (1)
[7]
Molecular mechanisms of action of angiopreventive anti-oxidants on endothelial cells: Microarray gene expression analyses [J].
Pfeffer, U ;
Ferrari, N ;
Dell'Eva, R ;
Indraccolo, S ;
Morini, M ;
Noonan, DM ;
Albini, A .
MUTATION RESEARCH-FUNDAMENTAL AND MOLECULAR MECHANISMS OF MUTAGENESIS, 2005, 591 (1-2) :198-211
[8]
A Grid-based solution for management and analysis of microarrays in distributed experiments [J].
Porro, Ivan ;
Torterolo, Livia ;
Corradi, Luca ;
Fato, Marco ;
Papadimitropoulos, Adam ;
Scaglione, Silvia ;
Schenone, Andrea ;
Viti, Federica .
BMC BIOINFORMATICS, 2007, 8 (Suppl 1)
[9]
Evaluation of methods for oligonucleotide array data via quantitative real-time PCR [J].
Qin, LX ;
Beyer, RP ;
Hudson, FN ;
Linford, NJ ;
Morris, DE ;
Kerr, KF .
BMC BIOINFORMATICS, 2006, 7 (1)
[10]
Higher plant glycosyltransferases [J].
Ross, Joe ;
Li, Yi ;
Lim, Eng-Kiat ;
Bowles, Dianna J. .
GENOME BIOLOGY, 2001, 2 (02)