PLASQ: a generalized linear model-based procedure to determine allelic dosage in cancer cells from SNP array data

Cited by: 39
Authors
Laframboise, Thomas [1]
Harrington, David
Weir, Barbara A.
Affiliations
[1] Dana Farber Canc Inst, Dept Med Oncol, Boston, MA 02115 USA
[2] Harvard Univ, Sch Publ Hlth, Dept Biostat, Boston, MA 02115 USA
Keywords
allelic imbalance; cancer genomics; DNA copy number; expectation-maximization algorithm; generalized linear model; single nucleotide polymorphism array; POLYMORPHISM; NORMALIZATION; MICROARRAYS; ALGORITHM; SIGNAL
DOI
10.1093/biostatistics/kxl012
Chinese Library Classification number
Q [Biological Sciences]
Subject classification codes
07; 0710; 09
Abstract
Human cancer is largely driven by the acquisition of mutations. One class of such mutations is copy number polymorphisms, comprised of deviations from the normal diploid two copies of each autosomal chromosome per cell. We describe a probe-level allele-specific quantitation (PLASQ) procedure to determine copy number contributions from each of the parental chromosomes in cancer cells from single-nucleotide polymorphism (SNP) microarray data. Our approach is based upon a generalized linear model that takes advantage of a novel classification of probes on the array. As a result of this classification, we are able to fit the model to the data using an expectation-maximization algorithm designed for the purpose. We demonstrate a strong model fit to data from a variety of cell types. In normal diploid samples, PLASQ is able to genotype with very high accuracy. Moreover, we are able to provide a generalized genotype in cancer samples (e.g. CCCCT at an amplified SNP). Our approach is illustrated on a variety of lung cancer cell lines and tumors, and a number of events are validated by independent computational and experimental means. An R software package containing the methods is freely available.
Pages: 323-336
Number of pages: 14
Related papers
19 in total
[1]  
[Anonymous], GENECHIP HUM MAPP 10
[2]   High-resolution analysis of DNA copy number using oligonucleotide microarrays [J].
Bignell, GR ;
Huang, J ;
Greshock, J ;
Watt, S ;
Butler, A ;
West, S ;
Grigorova, M ;
Jones, KW ;
Wei, W ;
Stratton, MR ;
Futreal, PA ;
Weber, B ;
Shapero, MH ;
Wooster, R .
GENOME RESEARCH, 2004, 14 (02) :287-295
[3]   A comparison of normalization methods for high density oligonucleotide array data based on variance and bias [J].
Bolstad, BM ;
Irizarry, RA ;
Åstrand, M ;
Speed, TP .
BIOINFORMATICS, 2003, 19 (02) :185-193
[4]   MAXIMUM LIKELIHOOD FROM INCOMPLETE DATA VIA EM ALGORITHM [J].
DEMPSTER, AP ;
LAIRD, NM ;
RUBIN, DB .
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-METHODOLOGICAL, 1977, 39 (01) :1-38
[5]  
Huang Jing, 2004, Human Genomics, V1, P287
[6]   Analysis of array CGH data: from signal ratio to gain and loss of DNA regions [J].
Hupé, P ;
Stransky, N ;
Thiery, JP ;
Radvanyi, F ;
Barillot, E .
BIOINFORMATICS, 2004, 20 (18) :3413-3422
[7]   Detection of large-scale variation in the human genome [J].
Iafrate, AJ ;
Feuk, L ;
Rivera, MN ;
Listewnik, ML ;
Donahoe, PK ;
Qi, Y ;
Scherer, SW ;
Lee, C .
NATURE GENETICS, 2004, 36 (09) :949-951
[8]   Exploration, normalization, and summaries of high density oligonucleotide array probe level data [J].
Irizarry, RA ;
Hobbs, B ;
Collin, F ;
Beazer-Barclay, YD ;
Antonellis, KJ ;
Scherf, U ;
Speed, TP .
BIOSTATISTICS, 2003, 4 (02) :249-264
[9]   Allelic dosage analysis with genotyping microarrays [J].
Ishikawa, S ;
Komura, D ;
Tsuji, S ;
Nishimura, K ;
Yamamoto, S ;
Panda, B ;
Huang, J ;
Fukayama, M ;
Jones, KW ;
Aburatani, H .
BIOCHEMICAL AND BIOPHYSICAL RESEARCH COMMUNICATIONS, 2005, 333 (04) :1309-1314
[10]   Allele-specific amplification in cancer revealed by SNP array analysis [J].
LaFramboise, T ;
Weir, BA ;
Zhao, XJ ;
Beroukhim, R ;
Li, C ;
Harrington, D ;
Sellers, WR ;
Meyerson, M .
PLOS COMPUTATIONAL BIOLOGY, 2005, 1 (06) :507-517