Testing against a high dimensional alternative

Cited by: 191
Authors
Goeman, JJ [1]
van de Geer, SA [1]
van Houwelingen, HC [1]
Affiliations
[1] Leiden Univ, Ctr Med, Dept Med Stat, NL-2300 RC Leiden, Netherlands
Keywords
empirical Bayes modelling; F-test; high dimensional data; hypothesis testing; locally most powerful test; power; score test
DOI
10.1111/j.1467-9868.2006.00551.x
Chinese Library Classification
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract
As the dimensionality of the alternative hypothesis increases, the power of classical tests tends to diminish quite rapidly. This is especially true for high dimensional data in which there are more parameters than observations. We discuss a score test on a hyperparameter in an empirical Bayesian model as an alternative to classical tests. It gives a general test statistic which can be used to test a point null hypothesis against a high dimensional alternative, even when the number of parameters exceeds the number of samples. This test will be shown to have optimal power on average in a neighbourhood of the null hypothesis, which makes it a proper generalization of the locally most powerful test to multiple dimensions. To illustrate this new locally most powerful test we investigate the case of testing the global null hypothesis in a linear regression model in more detail. The score test is shown to have significantly more power than the F-test whenever under the alternative the large variance principal components of the design matrix explain substantially more of the variance of the outcome than do the small variance principal components. The score test is also useful for detecting sparse alternatives in truly high dimensional data, where its power is comparable with the test based on the maximum absolute t-statistic.
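As a rough illustration of the linear regression case described in the abstract, the sketch below computes a quadratic-form statistic of the shape Q = y'XX'y / y'y for the global null hypothesis that no covariate is associated with the outcome. This form of the statistic is an assumption made for illustration and is not quoted from the abstract. The permutation calibration is a substitution chosen for simplicity, not the authors' method, and the function name `global_score_test` and its parameters are hypothetical. Because only the n x n matrix XX' is needed, the computation still works when the number of parameters exceeds the number of samples.

```python
import numpy as np


def global_score_test(X, y, n_perm=2000, seed=0):
    """Illustrative global test of 'no association' in a linear model.

    Assumed statistic: Q = y' X X' y / y' y on centered data.
    The null distribution is approximated by permuting y, which is a
    simplification chosen for this sketch.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    # Centering stands in for an unpenalized intercept (assumption).
    X = X - X.mean(axis=0)
    y = y - y.mean()

    # n x n matrix: its size does not depend on the number of covariates p.
    K = X @ X.T

    def stat(v):
        return float(v @ K @ v) / float(v @ v)

    q_obs = stat(y)
    q_perm = np.array([stat(rng.permutation(y)) for _ in range(n_perm)])

    # Add-one correction keeps the permutation p-value strictly positive.
    p_value = (1 + np.sum(q_perm >= q_obs)) / (1 + n_perm)
    return q_obs, p_value


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n, p = 20, 200                      # more parameters than observations
    X = rng.normal(size=(n, p))
    y = rng.normal(size=n)              # outcome unrelated to X
    q, p_value = global_score_test(X, y)
    print(f"Q = {q:.2f}, permutation p-value = {p_value:.3f}")
```

Since XX' has the same eigenvectors as the principal components of the design matrix, a statistic of this shape weights the large variance components most heavily, which is consistent with the abstract's description of when the score test outperforms the F-test.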
Pages: 477-493
Page count: 17