ADaCGH: A Parallelized Web-Based Application and R Package for the Analysis of aCGH Data

Cited by: 20
Authors
Diaz-Uriarte, Ramon [1]
Rueda, Oscar M. [1]
Affiliations
[1] Spanish Natl Canc Ctr, Struct Biol & Biocomp Programme, Madrid, Spain
Source
PLOS ONE | 2007 / Volume 2 / Issue 08
DOI
10.1371/journal.pone.0000737
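Chinese Library Classification
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline classification codes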
07 ; 0710 ; 09 ;
Abstract
Background. Copy number alterations (CNAs) in genomic DNA have been associated with complex human diseases, including cancer. One of the most common techniques for detecting CNAs is array-based comparative genomic hybridization (aCGH). The availability of aCGH platforms and the need to identify CNAs have resulted in a wealth of methodological studies.
Methodology/Principal Findings. ADaCGH is an R package and a web-based application for the analysis of aCGH data. It implements eight methods for the detection of CNAs (gains and losses of genomic DNA), including all of the best-performing ones from two recent reviews (CBS, GLAD, CGHseg, HMM). For improved speed, we use parallel computing (via MPI). Additional information (GO terms, PubMed citations, KEGG and Reactome pathways) is available for individual genes and for sets of genes with altered copy numbers.
Conclusions/Significance. ADaCGH represents a qualitative increase in the standards of these types of applications: a) all of the best-performing algorithms are included, not just one or two; b) we do not limit ourselves to providing a thin layer of CGI on top of existing BioConductor packages, but instead carefully use parallelization, examining different schemes, and achieve significant decreases in user waiting time (by factors of up to 45x); c) we have added functionality not currently available in some methods, to adapt to recent recommendations (e.g., merging of segmentation results in the wavelet-based and CGHseg algorithms); d) we incorporate redundancy, fault tolerance, and checkpointing, which are unique among web-based, parallelized applications; e) all of the code is available under open source licenses, allowing others to build upon, copy, and adapt our code for other software projects.
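The abstract mentions two ideas that a small example may help make concrete: merging of segmentation results, and running the analysis of several arrays in parallel. The following is a minimal sketch under stated assumptions, not ADaCGH's code: the function names, the thresholds (`min_diff`, `threshold`), and the greedy merging rule are all hypothetical, and a thread pool from the Python standard library stands in for the MPI-based parallelization the abstract describes.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def merge_segments(segments, min_diff=0.1):
    """Greedily merge adjacent segments whose mean log2 ratios are close.

    `segments` is a list of lists of log2 ratios in genomic order.
    This greedy rule is an illustrative assumption, not the merging
    procedure used by ADaCGH.
    """
    merged = []
    for seg in segments:
        if merged and abs(mean(merged[-1]) - mean(seg)) < min_diff:
            # Means are close: treat both segments as one copy-number level.
            merged[-1] = merged[-1] + list(seg)
        else:
            merged.append(list(seg))
    return merged


def call_states(segments, threshold=0.2):
    """Label each segment as gain, loss, or no_change (hypothetical cutoff)."""
    labels = []
    for seg in segments:
        m = mean(seg)
        if m > threshold:
            labels.append("gain")
        elif m < -threshold:
            labels.append("loss")
        else:
            labels.append("no_change")
    return labels


def analyze_array(segments):
    """Merge the segments of one array, then call gains and losses."""
    return call_states(merge_segments(segments))


def analyze_all(arrays, workers=4):
    """Analyze arrays concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(analyze_array, arrays))


if __name__ == "__main__":
    # Two toy arrays, each already split into segments by some method.
    arrays = [
        [[0.0, 0.05], [0.02, 0.03], [0.6, 0.7]],
        [[-0.5, -0.6], [-0.55], [0.0, 0.01]],
    ]
    print(analyze_all(arrays))
```

Each array is independent of the others, which is why the work can be distributed with a simple map; the same structure would carry over to an MPI-based scheme in which each worker process handles a subset of arrays or chromosomes.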
Pages: 10