CMDS: a population-based method for identifying recurrent DNA copy number aberrations in cancer from high-resolution data

Cited by: 46
Authors
Zhang, Qunyuan [1]
Ding, Li [2]
Larson, David E. [2]
Koboldt, Daniel C. [2]
McLellan, Michael D. [2]
Chen, Ken [2]
Shi, Xiaoqi [2]
Kraja, Aldi [1]
Mardis, Elaine R. [2]
Wilson, Richard K. [2]
Borecki, Ingrid B. [1]
Province, Michael A. [1]
Affiliations
[1] Washington Univ, Sch Med, Div Stat Genom, St Louis, MO 63130 USA
[2] Washington Univ, Sch Med, Genome Ctr, St Louis, MO USA
Keywords
ARRAY-CGH DATA; COMPARATIVE GENOMIC HYBRIDIZATION; REGIONS; WAVES
DOI
10.1093/bioinformatics/btp708
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: DNA copy number aberration (CNA) is a hallmark of genomic abnormality in tumor cells. Recurrent CNA (RCNA) occurs in multiple cancer samples across the same chromosomal region and has greater implication in tumorigenesis. Current commonly used methods for RCNA identification require CNA calling for individual samples before cross-sample analysis. This two-step strategy may result in a heavy computational burden, as well as a loss of overall statistical power due to segmentation and discretization of each individual sample's data. We propose a population-based approach for RCNA detection that needs no single-sample analysis and is statistically powerful, computationally efficient and particularly suitable for high-resolution and large-population studies.
Results: Our approach, correlation matrix diagonal segmentation (CMDS), identifies RCNAs based on a between-chromosomal-site correlation analysis. By directly using the raw intensity ratio data from all samples and adopting a diagonal transformation strategy, CMDS substantially reduces the computational burden and can obtain results very quickly from large datasets. Our simulation indicates that the statistical power of CMDS is higher than that of two-step approaches based on single-sample CNA calling. We applied CMDS to two real datasets of lung cancer and brain cancer from Affymetrix and Illumina array platforms, respectively, and successfully identified known regions of CNA associated with EGFR, KRAS and other important oncogenes. CMDS provides a fast, powerful and easily implemented tool for the RCNA analysis of large-scale data from cancer genomes.
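The idea of scoring sites by between-site correlation near the diagonal of the correlation matrix can be illustrated with a minimal Python sketch. This is not the authors' CMDS implementation: the function names `diagonal_band_score` and `simulate_ratios`, the `block` half-width, and the simulated data are all assumptions made for illustration, and the significance testing and segmentation steps of the published method are not reproduced here. The sketch only shows why a region aberrant in a subset of samples produces elevated correlations between neighbouring sites without any per-sample CNA calling.

```python
import numpy as np


def diagonal_band_score(ratios, block=10):
    """Mean Pearson correlation of each site with its neighbours.

    ratios: array of shape (n_samples, n_sites) holding intensity ratios.
    block:  hypothetical half-width of the band around the diagonal.
    Only correlations within `block` sites of the diagonal are computed,
    so the full n_sites x n_sites matrix is never built.
    """
    n_samples, n_sites = ratios.shape
    centered = ratios - ratios.mean(axis=0)
    std = centered.std(axis=0)
    std[std == 0] = 1.0          # avoid division by zero for constant sites
    z = centered / std           # per-site z-scores across samples
    scores = np.zeros(n_sites)
    for i in range(n_sites):
        lo = max(0, i - block)
        hi = min(n_sites, i + block + 1)
        neighbours = [j for j in range(lo, hi) if j != i]
        if not neighbours:
            continue
        # Pearson r equals the mean product of population z-scores.
        corr = z[:, neighbours].T @ z[:, i] / n_samples
        scores[i] = corr.mean()
    return scores


def simulate_ratios(n_samples=60, n_sites=200, start=80, stop=120, seed=0):
    """Toy data: unit Gaussian noise plus a gain in half of the samples."""
    rng = np.random.default_rng(seed)
    ratios = rng.normal(0.0, 1.0, size=(n_samples, n_sites))
    ratios[: n_samples // 2, start:stop] += 1.5   # recurrent gain (assumed)
    return ratios


if __name__ == "__main__":
    scores = diagonal_band_score(simulate_ratios(), block=10)
    print("mean score inside region :", scores[90:110].mean())
    print("mean score outside region:", scores[:60].mean())
```

In this toy setting the sites inside the simulated region co-vary across samples, so their band-averaged correlation is clearly higher than that of the noise-only sites; a real analysis would still need a null distribution to decide which scores are significant.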
Pages: 464-469
Page count: 6
Related papers
21 in total
  • [1] Assessing the significance of chromosomal aberrations in cancer: Methodology and application to glioma
    Beroukhim, Rameen
    Getz, Gad
    Nghiemphu, Leia
    Barretina, Jordi
    Hsueh, Teli
    Linhart, David
    Vivanco, Igor
    Lee, Jeffrey C.
    Huang, Julie H.
    Alexander, Sethu
    Du, Jinyan
    Kau, Tweeny
    Thomas, Roman K.
    Shah, Kinial
    Soto, Horacio
    Perner, Sven
    Prensner, John
    Debiasi, Ralph M.
    Demichelis, Francesca
    Hatton, Charlie
    Rubin, Mark A.
    Garraway, Levi A.
    Nelson, Stan F.
    Liau, Linda
    Mischel, Paul S.
    Cloughesy, Tim F.
    Meyerson, Matthew
    Golub, Todd A.
    Lander, Eric S.
    Mellinghoff, Ingo K.
    Sellers, William R.
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2007, 104 (50) : 20007 - 20012
  • [2] Comprehensive genomic characterization defines human glioblastoma genes and core pathways
    Chin, L.
    Meyerson, M.
    Aldape, K.
    Bigner, D.
    Mikkelsen, T.
    VandenBerg, S.
    Kahn, A.
    Penny, R.
    Ferguson, M. L.
    Gerhard, D. S.
    Getz, G.
    Brennan, C.
    Taylor, B. S.
    Winckler, W.
    Park, P.
    Ladanyi, M.
    Hoadley, K. A.
    Verhaak, R. G. W.
    Hayes, D. N.
    Spellman, Paul T.
    Absher, D.
    Weir, B. A.
    Ding, L.
    Wheeler, D.
    Lawrence, M. S.
    Cibulskis, K.
    Mardis, E.
    Zhang, Jinghui
    Wilson, R. K.
    Donehower, L.
    Wheeler, D. A.
    Purdom, E.
    Wallis, J.
    Laird, P. W.
    Herman, J. G.
    Schuebel, K. E.
    Weisenberger, D. J.
    Baylin, S. B.
    Schultz, N.
    Yao, Jun
    Wiedemeyer, R.
    Weinstein, J.
    Sander, C.
    Gibbs, R. A.
    Gray, J.
    Kucherlapati, R.
    Lander, E. S.
    Myers, R. M.
    Perou, C. M.
    McLendon, Roger
    [J]. NATURE, 2008, 455 (7216) : 1061 - 1068
  • [3] STAC: A method for testing the significance of DNA copy number aberrations across multiple array-CGH experiments
    Diskin, Sharon J.
    Eck, Thomas
    Greshock, Joel
    Mosse, Yael P.
    Naylor, Tara
    Stoeckert, Christian J., Jr.
    Weber, Barbara L.
    Maris, John M.
    Grant, Gregory R.
    [J]. GENOME RESEARCH, 2006, 16 (09) : 1149 - 1158
  • [4] Adjustment of genomic waves in signal intensities from whole-genome SNP genotyping platforms
    Diskin, Sharon J.
    Li, Mingyao
    Hou, Cuiping
    Yang, Shuzhang
    Glessner, Joseph
    Hakonarson, Hakon
    Bucan, Maja
    Maris, John M.
    Wang, Kai
    [J]. NUCLEIC ACIDS RESEARCH, 2008, 36 (19)
  • [5] Assessing the significance of conserved genomic aberrations using high resolution genomic microarrays
    Guttman, Mitchell
    Mies, Carolyn
    Dudycz-Sulicz, Katarzyna
    Diskin, Sharon J.
    Baldwin, Don A.
    Stoeckert, Christian J., Jr.
    Grant, Gregory R.
    [J]. PLOS GENETICS, 2007, 3 (08) : 1464 - 1486
  • [6] Denoising array-based comparative genomic hybridization data using wavelets
    Hsu, L
    Self, SG
    Grove, D
    Randolph, T
    Wang, K
    Delrow, JJ
    Loo, L
    Porter, P
    [J]. BIOSTATISTICS, 2005, 6 (02) : 211 - 226
  • [7] Analysis of array CGH data: from signal ratio to gain and loss of DNA regions
    Hupé, P
    Stransky, N
    Thiery, JP
    Radvanyi, F
    Barillot, E
    [J]. BIOINFORMATICS, 2004, 20 (18) : 3413 - 3422
  • [8] Breakpoint identification and smoothing of array comparative genomic hybridization data
    Jong, K
    Marchiori, E
    Meijer, G
    Van der Vaart, A
    Ylstra, B
    [J]. BIOINFORMATICS, 2004, 20 (18) : 3636 - 3637
  • [9] Comparative analysis of algorithms for identifying amplifications and deletions in array CGH data
    Lai, WR
    Johnson, MD
    Kucherlapati, R
    Park, PJ
    [J]. BIOINFORMATICS, 2005, 21 (19) : 3763 - 3770
  • [10] A statistical method to detect chromosomal regions with DNA copy number alterations using SNP-array-based CGH data
    Lai, YL
    Zhao, HY
    [J]. COMPUTATIONAL BIOLOGY AND CHEMISTRY, 2005, 29 (01) : 47 - 54