A fast and flexible method for the segmentation of aCGH data

Cited by: 56
Authors
Ben-Yaacov, Erez [1 ]
Eldar, Yonina C. [1 ]
Affiliations
[1] Technion Israel Inst Technol, Dept Elect Engn, IL-32000 Haifa, Israel
Keywords
DOI
10.1093/bioinformatics/btn272
CLC classification number
Q5 [Biochemistry];
Subject classification codes
071010 ; 081704 ;
Abstract
Motivation: Array Comparative Genomic Hybridization (aCGH) is used to scan the entire genome for variations in DNA copy number. A central task in the analysis of aCGH data is the segmentation into groups of probes sharing the same DNA copy number. Some well-known segmentation methods suffer from very long running times, preventing interactive data analysis.
Results: We suggest a new segmentation method based on wavelet decomposition and thresholding, which detects significant breakpoints in the data. Our algorithm is over 1000 times faster than leading approaches, with similar performance. Another key advantage of the proposed method is its simplicity and flexibility. Due to its intuitive structure, it can be easily generalized to incorporate several types of side information. Here, we consider two extensions which include side information indicating the reliability of each measurement, and compensating for a changing variability in the measurement noise. The resulting algorithm outperforms existing methods, both in terms of speed and performance, when applied to real high-density CGH data.
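The general idea of finding breakpoints by thresholding wavelet coefficients can be illustrated with a minimal sketch. This is not the authors' algorithm: the single fixed scale, the noise estimate from the median absolute difference, the universal threshold sigma * sqrt(2 log n), and all function names (`haar_detail`, `estimate_sigma`, `find_breakpoints`) are assumptions chosen for illustration only.

```python
import math
import random
import statistics


def haar_detail(signal, half_width):
    """Undecimated Haar-style detail coefficient at each interior position.

    Position i gets mean(signal[i:i+w]) - mean(signal[i-w:i]), scaled so that
    unit-variance white noise yields unit-variance coefficients.
    """
    w = half_width
    scale = math.sqrt(w / 2.0)
    coeffs = [0.0] * len(signal)
    for i in range(w, len(signal) - w + 1):
        left = sum(signal[i - w:i]) / w
        right = sum(signal[i:i + w]) / w
        coeffs[i] = scale * (right - left)
    return coeffs


def estimate_sigma(signal):
    """Robust noise estimate from the median absolute first difference."""
    diffs = [abs(b - a) for a, b in zip(signal, signal[1:])]
    return statistics.median(diffs) / (0.6745 * math.sqrt(2.0))


def find_breakpoints(signal, half_width=8):
    """Return the peak position of every run of above-threshold coefficients."""
    n = len(signal)
    coeffs = haar_detail(signal, half_width)
    # Universal threshold (an assumed choice for this sketch).
    threshold = estimate_sigma(signal) * math.sqrt(2.0 * math.log(n))
    breakpoints = []
    i = half_width
    while i <= n - half_width:
        if abs(coeffs[i]) > threshold:
            best = j = i
            # Walk the run of significant coefficients and keep its peak.
            while j <= n - half_width and abs(coeffs[j]) > threshold:
                if abs(coeffs[j]) > abs(coeffs[best]):
                    best = j
                j += 1
            breakpoints.append(best)
            i = j
        else:
            i += 1
    return breakpoints


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic log-ratio profile: one copy-number change at probe 100.
    profile = [(0.0 if k < 100 else 1.0) + rng.gauss(0.0, 0.1) for k in range(200)]
    print(find_breakpoints(profile))
```

Each reported index is the first probe of a new segment. The sketch makes one pass over the data at a single scale, whereas a full wavelet decomposition would combine several scales, and the exact output on noisy input depends on the noise draw.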
Pages: I139-I145
Number of pages: 7