Leveraging two-way probe-level block design for identifying differential gene expression with high-density oligonucleotide arrays

Cited by: 22
Authors
Barrera, L
Benner, C
Tao, YC
Winzeler, E
Zhou, YY
Affiliations
[1] Novartis Res Fdn, Genom Inst, San Diego, CA 92121 USA
[2] Univ Calif San Diego, Bioinformat Grad Program, La Jolla, CA 92093 USA
[3] Novartis Inst Biomed Res, Cambridge, MA 02139 USA
[4] Scripps Res Inst, La Jolla, CA 92037 USA
Keywords
DOI
10.1186/1471-2105-5-42
Chinese Library Classification
Q5 [Biochemistry];
Subject classification codes
071010 ; 081704 ;
Abstract
Background: To identify differentially expressed genes across experimental conditions in oligonucleotide microarray experiments, existing statistical methods commonly use a summary of probe-level expression data for each probe set and compare replicates of these values across conditions using a form of the t-test or rank-sum test. Here we propose the use of a statistical method that takes advantage of the built-in redundancy architecture of high-density oligonucleotide arrays. Results: We employ parametric and nonparametric variants of two-way analysis of variance (ANOVA) on probe-level data to account for probe-level variation, and use the false-discovery rate (FDR) to account for simultaneous testing on thousands of genes (the multiple testing problem). Using publicly available data sets, we systematically compared the performance of parametric two-way ANOVA and the nonparametric Mack-Skillings test to the t-test and Wilcoxon rank-sum test for detecting differentially expressed genes at varying levels of fold change, concentration, and sample size. Using receiver operating characteristic (ROC) curve comparisons, we observed that two-way methods with FDR control on sample sizes with 2-3 replicates exhibit the same high sensitivity and specificity as a t-test with FDR control on sample sizes with 6-9 replicates in detecting at least two-fold change. Conclusions: Our results suggest that the two-way ANOVA methods using probe-level data are substantially more powerful tests for detecting differential gene expression than corresponding methods for probe-set level data.
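The parametric approach described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a balanced, additive two-way layout (condition and probe as factors, no interaction term, equal replicates per cell), and the function names `condition_f_statistic` and `benjamini_hochberg` are hypothetical names chosen for this illustration. The first function computes the F statistic for the condition effect from probe-level values of a single probe set; the second applies the Benjamini-Hochberg step-up adjustment to a list of p-values, one per gene.

```python
def condition_f_statistic(data):
    """F statistic for the condition effect in an additive two-way layout.

    data[i][j] is the list of replicate intensities for condition i and
    probe j. Assumes a balanced design (same number of replicates in every
    cell) and no condition-by-probe interaction term.
    """
    a = len(data)            # number of conditions
    b = len(data[0])         # number of probes in the probe set
    n = len(data[0][0])      # replicates per (condition, probe) cell
    total_n = a * b * n

    values = [y for cond in data for probe in cond for y in probe]
    grand = sum(values) / total_n
    cond_means = [sum(y for probe in cond for y in probe) / (b * n)
                  for cond in data]
    probe_means = [sum(y for cond in data for y in cond[j]) / (a * n)
                   for j in range(b)]

    ss_total = sum((y - grand) ** 2 for y in values)
    ss_cond = b * n * sum((m - grand) ** 2 for m in cond_means)
    ss_probe = a * n * sum((m - grand) ** 2 for m in probe_means)
    ss_resid = ss_total - ss_cond - ss_probe

    df_cond = a - 1
    df_resid = total_n - a - b + 1
    f_stat = (ss_cond / df_cond) / (ss_resid / df_resid)
    return f_stat, df_cond, df_resid


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy probe set: 2 conditions x 2 probes x 2 replicates (made-up numbers).
toy = [[[1, 2], [3, 4]],
       [[3, 4], [5, 6]]]
f_stat, df_cond, df_resid = condition_f_statistic(toy)
```

In a full analysis, the F statistic would be converted to a p-value using the upper tail of the F distribution with `(df_cond, df_resid)` degrees of freedom, for example via `scipy.stats.f.sf`, and genes whose adjusted p-value falls below the chosen FDR level would be reported as differentially expressed.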
Pages: 14
Related papers
23 in total
[1]  
*AFF INC, AFF LAT SQUAR DAT EX
[2]  
[Anonymous], GENOME BIOL
[3]  
[Anonymous], 1991, APPL MULTIVARIATE DA
[4]   CONTROLLING THE FALSE DISCOVERY RATE - A PRACTICAL AND POWERFUL APPROACH TO MULTIPLE TESTING [J].
BENJAMINI, Y ;
HOCHBERG, Y .
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 1995, 57 (01) :289-300
[5]   A systematic statistical linear modeling approach to oligonucleotide array experiments [J].
Chu, TM ;
Weir, B ;
Wolfinger, R .
MATHEMATICAL BIOSCIENCES, 2002, 176 (01) :35-51
[6]  
Dudoit S, 2002, STAT SINICA, V12, P111
[7]  
Hoffmann R, 2002, GENOME BIOL, V3
[8]  
Hollander M., Nonparametric Statistical Methods, 2nd Edition, V2nd
[9]   A high performance test of differential gene expression for oligonucleotide arrays [J].
Lemon, WJ ;
Liyanarachchi, S ;
You, M .
GENOME BIOLOGY, 2003, 4 (10)
[10]   Theoretical and experimental comparisons of gene expression indexes for oligonucleotide arrays [J].
Lemon, WJ ;
Palatini, JJT ;
Krahe, R ;
Wright, FA .
BIOINFORMATICS, 2002, 18 (11) :1470-1476