Comparing three methods for variance estimation with duplicated high density oligonucleotide arrays

Cited by: 23
Authors
Huang X. [1]
Pan W. [1]
Affiliation
[1] Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis
Funding
National Institutes of Health (USA)
Keywords
Differential gene expression; Microarray; Nonparametric smoothing
DOI
10.1007/s10142-002-0066-2
Abstract
Microarray experiments are being increasingly used in molecular biology. A common task is to detect genes with differential expression across two experimental conditions, such as two different tissues or the same tissue at two time points of biological development. To take proper account of statistical variability, some statistical approaches based on the t-statistic have been proposed. In constructing the t-statistic, one needs to estimate the variance of gene expression levels. With a small number of replicated array experiments, the variance estimation can be challenging. For instance, although the sample variance is unbiased, it may have large variability, leading to a large mean squared error. For duplicated array experiments, a new approach based on simple averaging has recently been proposed in the literature. Here we consider two more general approaches based on nonparametric smoothing. Our goal is to assess the performance of each method empirically. The three methods are applied to a colon cancer data set containing 2,000 genes. Using two arrays, we compare the variance estimates obtained from the three methods. We also consider their impact on the t-statistics. Our results indicate that the three methods give variance estimates close to each other. Due to its simplicity and generality, we recommend the use of the smoothed sample variance for data with a small number of replicates. © Springer-Verlag 2002.
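The trade-off described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `duplicate_sample_variance` and `smoothed_variance`, the `window` parameter, and the simulated data are all assumptions made for illustration, and a plain running mean over genes ordered by mean expression stands in for whichever nonparametric smoother the paper uses. With two replicates, the unbiased sample variance for a gene reduces to (x1 - x2)^2 / 2, which is noisy; smoothing it across genes with similar expression levels trades a little bias for much lower variability.

```python
import numpy as np


def duplicate_sample_variance(x1, x2):
    """Unbiased per-gene sample variance from two replicates: (x1 - x2)^2 / 2."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    return (x1 - x2) ** 2 / 2.0


def smoothed_variance(x1, x2, window=101):
    """Running-mean smooth of the sample variances against mean expression.

    Assumption: a running mean replaces the paper's smoother for simplicity.
    """
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    s2 = duplicate_sample_variance(x1, x2)
    order = np.argsort((x1 + x2) / 2.0)  # genes sorted by mean expression
    half = window // 2
    n = s2.size
    # Cumulative sums let each window mean be computed in constant time.
    csum = np.concatenate(([0.0], np.cumsum(s2[order])))
    idx = np.arange(n)
    lo = np.maximum(idx - half, 0)
    hi = np.minimum(idx + half + 1, n)  # windows are truncated at the edges
    smooth_sorted = (csum[hi] - csum[lo]) / (hi - lo)
    out = np.empty(n)
    out[order] = smooth_sorted  # restore the original gene order
    return out


# Simulated data (hypothetical, not the colon cancer data set):
# 2,000 genes whose true standard deviation depends on mean expression.
rng = np.random.default_rng(0)
n_genes = 2000
mu = rng.uniform(4, 12, n_genes)
sd = 0.1 + 0.05 * (12 - mu)
x1 = mu + rng.normal(0, sd)
x2 = mu + rng.normal(0, sd)

raw = duplicate_sample_variance(x1, x2)
smooth = smoothed_variance(x1, x2)

# Mean squared error of each estimator against the known true variance.
mse_raw = np.mean((raw - sd ** 2) ** 2)
mse_smooth = np.mean((smooth - sd ** 2) ** 2)
print(f"MSE raw: {mse_raw:.3e}, MSE smoothed: {mse_smooth:.3e}")
```

In this simulation the true variance changes slowly with mean expression, which is the situation in which pooling neighbouring genes helps; the window width controls how much bias is accepted in exchange for reduced variability.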
Pages: 126-133
Page count: 7