Filtering for increased power for microarray data analysis

Cited by: 202
Authors
Hackstadt, Amber J. [1]
Hess, Ann M.
Affiliations
[1] Colorado State Univ, Ctr Bioinformat, Ft Collins, CO 80523 USA
Source
BMC BIOINFORMATICS | 2009 / Vol. 10
Keywords
FALSE DISCOVERY RATE;
DOI
10.1186/1471-2105-10-11
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Due to the large number of hypothesis tests performed during the process of routine analysis of microarray data, a multiple testing adjustment is certainly warranted. However, when the number of tests is very large and the proportion of differentially expressed genes is relatively low, the use of a multiple testing adjustment can result in very low power to detect those genes which are truly differentially expressed. Filtering allows for a reduction in the number of tests and a corresponding increase in power. Common filtering methods include filtering by variance, average signal or MAS detection call (for Affymetrix arrays). We study the effects of filtering in combination with the Benjamini-Hochberg method for false discovery rate control and q-value for false discovery rate estimation.
Results: Three case studies are used to compare three different filtering methods in combination with the two false discovery rate methods and three different preprocessing methods. For the case studies considered, filtering by detection call and variance (on the original scale) consistently led to an increase in the number of differentially expressed genes identified. On the other hand, filtering by variance on the log2 scale had a detrimental effect when paired with MAS5 or PLIER preprocessing methods, even when the testing was done on the log2 scale. A simulation study was done to further examine the effect of filtering by variance. We find that filtering by variance leads to higher power, often with a decrease in false discovery rate, when paired with either of the false discovery rate methods considered. This holds regardless of the proportion of genes which are differentially expressed or whether we assume dependence or independence among genes.
Conclusion: The case studies show that both detection call and variance filtering are viable methods of filtering which can increase the number of differentially expressed genes identified. The simulation study demonstrates that when paired with a false discovery rate method, filtering by variance can increase power while still controlling the false discovery rate. Filtering out 50% of probe sets seems reasonable as long as the majority of genes are not expected to be differentially expressed.
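The interaction between variance filtering and the Benjamini-Hochberg adjustment described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the helper names `bh_adjust` and `keep_top_variance`, the toy variances and p-values, and the 0.05 threshold are all assumptions made up for illustration. The sketch only shows the mechanism, namely that removing low-variance probe sets shrinks the number of tests and therefore the adjusted p-values of the remaining ones.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def keep_top_variance(variances, keep_fraction=0.5):
    """Indices of the probe sets with the highest variance (hypothetical helper)."""
    n_keep = max(1, int(round(len(variances) * keep_fraction)))
    ranked = sorted(range(len(variances)), key=lambda i: variances[i], reverse=True)
    return sorted(ranked[:n_keep])


# Invented toy data: one variance and one raw p-value per probe set.
variances = [2.0, 0.1, 1.5, 0.2, 0.05, 1.8, 0.15, 0.3]
pvalues = [0.004, 0.60, 0.012, 0.45, 0.90, 0.020, 0.70, 0.35]
alpha = 0.05  # assumed FDR threshold

# Adjust all eight tests, then only the four with the highest variance.
unfiltered = bh_adjust(pvalues)
kept = keep_top_variance(variances, keep_fraction=0.5)
filtered = bh_adjust([pvalues[i] for i in kept])

n_unfiltered = sum(p <= alpha for p in unfiltered)
n_filtered = sum(p <= alpha for p in filtered)
print(n_unfiltered, n_filtered)
```

In this toy case the same raw p-values yield more probe sets under the threshold after filtering, because each is adjusted against four tests rather than eight. Whether that gain holds on real data depends on the preprocessing method and the scale used for the variance, as the case studies in the abstract indicate.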
Pages: 12