A general framework for multiple testing dependence

Cited by: 244
Authors
Leek, Jeffrey T. [3]
Storey, John D. [1, 2]
Affiliations
[1] Princeton Univ, Lewis Sigler Inst, Princeton, NJ 08544 USA
[2] Princeton Univ, Dept Mol Biol, Princeton, NJ 08544 USA
[3] Johns Hopkins Univ, Sch Med, Dept Oncol, Baltimore, MD 21287 USA
Funding
National Institutes of Health (United States);
Keywords
empirical null; false discovery rate; latent structure; simultaneous inference; surrogate variable analysis;
DOI
10.1073/pnas.0808709105
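Chinese Library Classification
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline classification code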
07; 0710; 09;
Abstract
We develop a general framework for performing large-scale significance testing in the presence of arbitrarily strong dependence. We derive a low-dimensional set of random vectors, called a dependence kernel, that fully captures the dependence structure in an observed high-dimensional dataset. This result shows a surprising reversal of the "curse of dimensionality" in the high-dimensional hypothesis testing setting. We show theoretically that conditioning on a dependence kernel is sufficient to render statistical tests independent regardless of the level of dependence in the observed data. This framework for multiple testing dependence has implications in a variety of common multiple testing problems, such as in gene expression studies, brain imaging, and spatial epidemiology.
Pages: 18718-18723
Page count: 6