BGX: a fully Bayesian integrated approach to the analysis of Affymetrix GeneChip data

Cited by: 39
Authors
Hein, AMK
Richardson, S
Causton, HC
Ambler, GK
Green, PJ
Affiliations
[1] Univ London Imperial Coll Sci Technol & Med, Dept Epidemiol & Publ Hlth, London W2 1PG, England
[2] Univ London Imperial Coll Sci Technol & Med, Hammersmith Hosp, MRC, Clin Sci Ctr,Microarray Ctr, London W12 0NN, England
[3] Univ Bristol, Sch Math, Bristol BS8 1TW, Avon, England
DOI
10.1093/biostatistics/kxi016
Chinese Library Classification (CLC) number
Q [Biological sciences]
Subject classification code
07; 0710; 09
Abstract
We present Bayesian hierarchical models for the analysis of Affymetrix GeneChip data. The approach we take differs from other available approaches in two fundamental aspects. Firstly, we aim to integrate all processing steps of the raw data in a common statistically coherent framework, allowing all components and thus associated errors to be considered simultaneously. Secondly, inference is based on the full posterior distribution of gene expression indices and derived quantities, such as fold changes or ranks, rather than on single point estimates. Measures of uncertainty on these quantities are thus available. The models presented represent the first building block for integrated Bayesian analysis of Affymetrix GeneChip data: the models take into account additive as well as multiplicative error; gene expression levels are estimated using perfect match and a fraction of mismatch probes and are modeled on the log scale. Background correction is incorporated by modeling true signal and cross-hybridization explicitly, and the need for further normalization is considerably reduced by allowing for array-specific distributions of nonspecific hybridization. When replicate arrays are available for a condition, posterior distributions of condition-specific gene expression indices are estimated directly, by a simultaneous consideration of replicate probe sets, avoiding averaging over estimates obtained from individual replicate arrays. The performance of the Bayesian model is compared to that of standard available point estimate methods on subsets of the well-known GeneLogic and Affymetrix spike-in data. The Bayesian model is found to perform well and the integrated procedure presented appears to hold considerable promise for further development.
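The abstract's second point, inference on derived quantities from the full posterior rather than from point estimates, can be illustrated with a minimal sketch. This is not the BGX implementation: the posterior draws below are simulated stand-ins for MCMC output, and all names (`mu_c1`, `mu_c2`, `log_fc`, `prob_top`) and numeric values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for MCMC output: posterior draws of log-scale
# expression indices, shape (n_draws, n_genes), for two conditions.
n_draws, n_genes = 4000, 5
center_c1 = np.array([6.0, 7.0, 8.0, 9.0, 10.0])   # assumed values
center_c2 = np.array([6.0, 8.0, 8.0, 8.0, 10.5])   # assumed values
mu_c1 = center_c1 + rng.normal(0.0, 0.2, size=(n_draws, n_genes))
mu_c2 = center_c2 + rng.normal(0.0, 0.2, size=(n_draws, n_genes))

# Derived quantity 1: log fold change, computed draw by draw so that
# its whole posterior distribution is available.
log_fc = mu_c2 - mu_c1
fc_mean = log_fc.mean(axis=0)
fc_lo, fc_hi = np.percentile(log_fc, [2.5, 97.5], axis=0)

# Derived quantity 2: rank of each gene by |log fold change| within
# each draw (1 = smallest change, n_genes = largest change).
ranks = np.abs(log_fc).argsort(axis=1).argsort(axis=1) + 1
rank_lo, rank_hi = np.percentile(ranks, [2.5, 97.5], axis=0)

# Posterior probability that each gene shows the largest change.
prob_top = (ranks == n_genes).mean(axis=0)

for g in range(n_genes):
    print(f"gene {g}: log FC {fc_mean[g]:+.2f} "
          f"[{fc_lo[g]:+.2f}, {fc_hi[g]:+.2f}], "
          f"rank interval [{rank_lo[g]:.0f}, {rank_hi[g]:.0f}], "
          f"P(top) = {prob_top[g]:.2f}")
```

Because every summary is computed per draw, the credible intervals on fold changes and ranks come directly from the posterior samples, which is the kind of uncertainty measure a single point estimate cannot provide.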
Pages: 349-373
Number of pages: 25