Higher criticism for detecting sparse heterogeneous mixtures

Cited by: 462
Authors
Donoho, D [1 ]
Jin, JS [1 ]
Affiliations
[1] Stanford Univ, Dept Stat, Stanford, CA 94305 USA
Keywords
multiple comparisons; combining many p-values; sparse normal means; thresholding; normalized empirical process
DOI
10.1214/009053604000000265
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract
Higher criticism, or second-level significance testing, is a multiple-comparisons concept mentioned in passing by Tukey. It concerns a situation where there are many independent tests of significance and one is interested in rejecting the joint null hypothesis. Tukey suggested comparing the fraction of observed significances at a given alpha-level to the expected fraction under the joint null. In fact, he suggested standardizing the difference of the two quantities and forming a z-score; the resulting z-score tests the significance of the body of significance tests. We consider a generalization, where we maximize this z-score over a range of significance levels 0 < alpha <= alpha_0. We are able to show that the resulting higher criticism statistic is effective at resolving a very subtle testing problem: testing whether n normal means are all zero versus the alternative that a small fraction is nonzero. The subtlety of this "sparse normal means" testing problem can be seen from the work of Ingster and Jin, who studied such problems in great detail. In their studies, they identified an interesting range of cases where the small fraction of nonzero means is so small that the alternative hypothesis exhibits little noticeable effect on the distribution of the p-values either for the bulk of the tests or for the few most highly significant tests. In this range, when the amplitude of nonzero means is calibrated with the fraction of nonzero means, the likelihood ratio test for a precisely specified alternative would still succeed in separating the two hypotheses. We show that the higher criticism is successful throughout the same region of amplitude sparsity where the likelihood ratio test would succeed. Since it does not require a specification of the alternative, this shows that higher criticism is in a sense optimally adaptive to unknown sparsity and size of the nonnull effects.
While our theoretical work is largely asymptotic, we provide simulations in finite samples and suggest some possible applications. We also show that higher criticism works well over a range of non-Gaussian cases.
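The maximized z-score described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `higher_criticism`, the default `alpha0=0.5`, and the choice to evaluate the z-score only at the sorted p-values (where the observed fraction of significances at level p_(i) is i/n) are assumptions made for the example.

```python
import math


def higher_criticism(p_values, alpha0=0.5):
    """Return the largest standardized excess of small p-values.

    Hypothetical helper: for each sorted p-value p_(i) with i <= alpha0 * n,
    compare the observed fraction i/n of tests significant at level p_(i)
    with the fraction p_(i) expected under the joint null, standardize the
    difference as a z-score, and keep the maximum.
    """
    n = len(p_values)
    if n == 0:
        raise ValueError("p_values must not be empty")
    sorted_p = sorted(p_values)
    k_max = max(1, int(alpha0 * n))  # number of order statistics scanned
    best = -math.inf
    for i in range(1, k_max + 1):
        p = sorted_p[i - 1]
        if p <= 0.0 or p >= 1.0:
            continue  # z-score is undefined at the endpoints
        z = math.sqrt(n) * (i / n - p) / math.sqrt(p * (1.0 - p))
        best = max(best, z)
    return best


if __name__ == "__main__":
    # Evenly spread p-values mimic the joint null; five tiny p-values
    # mimic a sparse set of nonnull effects.
    null_like = [(i + 0.5) / 100 for i in range(100)]
    spiked = [1e-6] * 5 + null_like[5:]
    print(higher_criticism(null_like), higher_criticism(spiked))
```

A large value suggests that more p-values are small than the joint null would predict. The rejection threshold is not specified here and would have to come from the paper's theory or from simulation.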
Pages: 962-994
Page count: 33