PICNIC: an algorithm to predict absolute allelic copy number variation with microarray cancer data

Cited by: 149
Authors
Greenman, Chris D. [1]
Bignell, Graham [1]
Butler, Adam [1]
Edkins, Sarah [1]
Hinton, Jon [1]
Beare, Dave [1]
Swamy, Sajani [1]
Santarius, Thomas [1]
Chen, Lina [1]
Widaa, Sara [1]
Futreal, P. Andy [1]
Stratton, Michael R. [1]
Affiliations
[1] Wellcome Trust Sanger Inst, Canc Genome Project, Cambridge CB10 1SA, England
Funding
Wellcome Trust (UK)
Keywords
Allelic; Cancer; Copy; Number; Somatic; Variation; AFFYMETRIX SNP ARRAYS; HIDDEN MARKOV-MODELS; GENOTYPING ALGORITHM; CGH DATA; GENOME; POLYMORPHISMS; POPULATION; MIXTURE;
DOI
10.1093/biostatistics/kxp045
Chinese Library Classification (CLC) number
Q [Biological Sciences]
Subject classification codes
07; 0710; 09
Abstract
High-throughput oligonucleotide microarrays are commonly employed to investigate genetic disease, including cancer. The algorithms used to extract genotypes and copy number variation function optimally for diploid genomes, which are usually associated with inherited disease. However, cancer genomes are aneuploid in nature, leading to systematic errors when these techniques are applied. We introduce a preprocessing transformation and a hidden Markov model algorithm bespoke to cancer. This produces genotype classification, specification of regions of loss of heterozygosity, and absolute allelic copy number segmentation. Accurate prediction is demonstrated with a combination of independent experimental techniques. These methods are exemplified with Affymetrix genome-wide SNP6.0 data from 755 cancer cell lines, enabling inference upon a number of features of biological interest. These data and the coded algorithm are freely available for download.
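As a rough illustration of what hidden Markov model segmentation of copy number involves, the sketch below decodes a toy sequence of probe log2 ratios into integer copy-number states with the Viterbi algorithm. It is not the PICNIC algorithm: the function name `viterbi_copy_number`, the Gaussian emission model, the diploid baseline, and the parameters `sigma` and `stay_prob` are all assumptions chosen for illustration, and the sketch handles total copy number only, not allelic copy number or the preprocessing transformation described in the abstract.

```python
import math


def viterbi_copy_number(log_ratios, states=(1, 2, 3), sigma=0.2, stay_prob=0.95):
    """Most likely total copy number per probe under a toy Gaussian HMM.

    Assumptions (illustrative only, not taken from the paper):
    - each state c emits log2(c / 2) plus Gaussian noise with std `sigma`;
    - the chain stays in its state with probability `stay_prob` and
      otherwise moves to any other state with equal probability;
    - the initial state distribution is uniform.
    """
    if not log_ratios:
        return []

    n = len(states)
    means = [math.log2(c / 2) for c in states]
    switch_prob = (1.0 - stay_prob) / (n - 1)
    log_trans = [
        [math.log(stay_prob if i == j else switch_prob) for j in range(n)]
        for i in range(n)
    ]

    def log_emit(x, mu):
        # Log density of a normal distribution with mean mu and std sigma.
        return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))

    # Forward pass: best log score of any path ending in each state.
    score = [math.log(1.0 / n) + log_emit(log_ratios[0], mu) for mu in means]
    backpointers = []
    for x in log_ratios[1:]:
        new_score, pointers = [], []
        for j in range(n):
            best_i = max(range(n), key=lambda i: score[i] + log_trans[i][j])
            pointers.append(best_i)
            new_score.append(score[best_i] + log_trans[best_i][j] + log_emit(x, means[j]))
        score = new_score
        backpointers.append(pointers)

    # Traceback from the best final state.
    state = max(range(n), key=lambda i: score[i])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return [states[s] for s in path]


if __name__ == "__main__":
    # Toy trace: three probes near diploid, three near a one-copy loss,
    # three near a one-copy gain.
    toy = [0.02, -0.05, 0.01, -0.98, -1.03, -0.95, 0.60, 0.55, 0.62]
    print(viterbi_copy_number(toy))
```

The high `stay_prob` is what turns per-probe classification into segmentation: an isolated noisy probe has to pay the switching penalty twice to leave and rejoin its neighbours' state, so short spurious segments are suppressed.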
Pages: 164-175
Number of pages: 12