Bayesian robust inference for differential gene expression in microarrays with multiple samples

Cited by: 71
Authors
Gottardo, R
Raftery, AE
Yeung, KY
Bumgarner, RE
Affiliations
[1] Univ Washington, Dept Stat, Seattle, WA 98195 USA
[2] Univ Washington, Dept Microbiol, Seattle, WA 98195 USA
Keywords
Affymetrix; Bayesian hierarchical model; Bonferroni adjustment; cDNA microarrays; empirical Bayes; heteroscedasticity; Markov chain Monte Carlo; mixture distribution; outlier; singular distribution; t-distribution
DOI
10.1111/j.1541-0420.2005.00397.x
Chinese Library Classification
Q [Biological Sciences]
Discipline classification codes
07 [Science]; 0710 [Biology]; 09 [Agronomy]
Abstract
We consider the problem of identifying differentially expressed genes under different conditions using gene expression microarrays. Because of the many steps involved in the experimental process, from hybridization to image analysis, cDNA microarray data often contain outliers. For example, an outlying data value could occur because of scratches or dust on the surface, imperfections in the glass, or imperfections in the array production. We develop a robust Bayesian hierarchical model for testing for differential expression. Errors are modeled explicitly using a t-distribution, which accounts for outliers. The model includes an exchangeable prior for the variances, which allows different variances for the genes but still shrinks extreme empirical variances. Our model can be used for testing for differentially expressed genes among multiple samples, and it can distinguish between the different possible patterns of differential expression when there are three or more samples. Parameter estimation is carried out using a novel version of Markov chain Monte Carlo that is appropriate when the model puts mass on subspaces of the full parameter space. The method is illustrated using two publicly available gene expression data sets. We compare our method to six other baseline and commonly used techniques, namely the t-test, the Bonferroni-adjusted t-test, significance analysis of microarrays (SAM), Efron's empirical Bayes, and EBarrays in both its lognormal-normal and gamma-gamma forms. In an experiment with HIV data, our method performed better than these alternatives, on the basis of between-replicate agreement and disagreement.
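The abstract states that t-distributed errors account for outliers. A minimal sketch of why this works, and not the authors' hierarchical model or their MCMC sampler: a t-distribution can be written as a normal distribution with a latent per-observation precision weight, and the expected weight shrinks as the residual grows. The function name `t_robust_mean`, the fixed degrees of freedom `nu`, the fixed `scale`, and the replicate values are all assumptions made for illustration; the estimate is computed with a simple EM-style iteration.

```python
from statistics import mean


def t_robust_mean(values, nu=4.0, scale=1.0, n_iter=50):
    """Estimate a location parameter under t-distributed errors.

    Assumes the degrees of freedom `nu` and the `scale` are known and fixed.
    Returns the location estimate and the final per-observation weights.
    """
    mu = mean(values)  # start from the ordinary (non-robust) mean
    weights = [1.0] * len(values)
    for _ in range(n_iter):
        # E-step: expected latent precision weight of each observation.
        # Large standardized residuals give small weights.
        weights = [(nu + 1.0) / (nu + ((y - mu) / scale) ** 2) for y in values]
        # M-step: weighted mean using those weights.
        mu = sum(w * y for w, y in zip(weights, values)) / sum(weights)
    return mu, weights


# Hypothetical replicate log-ratios for one gene; the last value mimics
# an artifact such as a scratch or dust on the array.
log_ratios = [0.9, 1.1, 1.0, 1.2, 6.0]
mu, weights = t_robust_mean(log_ratios)
print(f"plain mean = {mean(log_ratios):.2f}, robust estimate = {mu:.2f}")
```

The outlying replicate receives the smallest weight, so the estimate stays close to the bulk of the replicates instead of being pulled toward the artifact as the plain mean is.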
Pages: 10-18
Page count: 9