KC-SMARTR: An R package for detection of statistically significant aberrations in multi-experiment aCGH data

Cited by: 18
Authors
De Ronde J.J. [1]
Klijn C. [1]
Velds A. [3]
Holstege H. [2]
Reinders M.J. [1,4]
Jonkers J. [2]
Wessels L.F. [1,4]
Affiliations
[1] Department of Bioinformatics and Statistics, Netherlands Cancer Institute, 1066CX Amsterdam
[2] Department of Molecular Biology, Netherlands Cancer Institute, 1066CX Amsterdam
[3] Central Microarray Facility, Netherlands Cancer Institute, 1066CX Amsterdam
[4] Faculty of EEMCS, Delft University of Technology
Keywords
Kernel Convolution; Kernel Width; Aberrate Region; Copy Number Alteration; aCGH Analysis;
DOI
10.1186/1756-0500-3-298
Abstract
Background: Most approaches used to find recurrent or differential DNA Copy Number Alterations (CNA) in array Comparative Genomic Hybridization (aCGH) data from groups of tumour samples depend on discretizing the aCGH data into gain, loss or no-change states. This discards valuable biological information in tumour samples, which are frequently heterogeneous. We previously developed an algorithm, KC-SMART, that bases its estimate of the magnitude of the CNA at a given genomic location on kernel convolution (Klijn et al., 2008). This accounts for the intensity of the probe signal, its local genomic environment and the signal distribution across multiple samples. Results: Here we extend the approach to allow comparative analyses of two groups of samples and introduce the R implementation of both approaches. The comparative module supports a supervised analysis that identifies regions that are differentially aberrated between two user-defined classes. We analyzed data from a series of B- and T-cell lymphomas and retrieved all positive control regions (VDJ regions) in addition to a number of new regions. A t-test on segmented data, which we implemented, also located all the positive control regions and a number of new regions, but these regions were highly fragmented. Conclusions: KC-SMARTR offers recurrent CNA and class-specific CNA detection, at different genomic scales, in a single package without the need for additional segmentation. It is memory efficient and runs on a wide range of machines. Most importantly, it does not rely on data discretization and therefore maximally exploits the biological information in the aCGH data. The program is freely available from the Bioconductor website http://www.bioconductor.org/ under the terms of the GNU General Public License. © 2009 de Ronde et al; licensee BioMed Central Ltd.
Related papers
8 entries in total
[1] Hanahan D., Weinberg R.A., The hallmarks of cancer, Cell, 100, pp. 57-70, (2000)
[2] Fiegler H., Geigl J.B., Langer S., Rigler D., Porter K., Unger K., Carter N.P., Speicher M.R., High resolution array-CGH analysis of single cells, Nucleic Acids Res, 35, (2007)
[3] Klijn C., Holstege H., De Ridder J., Liu X., Reinders M., Jonkers J., Wessels L., Identification of cancer genes using a statistical framework for multiexperiment analysis of nondiscretized array CGH data, Nucleic Acids Res, 36, (2008)
[4] Tusher V.G., Tibshirani R., Chu G., Significance analysis of microarrays applied to the ionizing radiation response, Proc Natl Acad Sci USA, 98, 9, pp. 5116-21, (2001)
[5] Chin K., Devries S., Fridlyand J., Spellman P.T., Roydasgupta R., Kuo W.L., Lapuk A., Neve R.M., Genomic and transcriptional aberrations linked to breast cancer pathophysiologies, Cancer Cell, 10, 6, pp. 529-41, (2006)
[6] Holstege H., Van Beers E., Velds A., Liu X., Joosse S.A., Klarenbeek S., Schut E., Kerkhoven R.K., et al., Cross-species comparison of aCGH data from mouse and human BRCA1- and BRCA2-mutated breast cancers, BMC Cancer, 10, (2010)
[7] Klijn C., Bot J., Adams D.J., Reinders M., Wessels L., Jonkers J., Identification of networks of co-occurring, tumor-related DNA copy number changes using a genome-wide scoring approach, PLoS Comput Biol, 1, (2010)
[8] Venkatraman E.S., Olshen A.B., A faster circular binary segmentation algorithm for the analysis of array CGH data, Bioinformatics, 23, 6, pp. 657-63, (2007)