Estimation and confidence sets for sparse normal mixtures

Times cited: 63
Authors
Cai, T. Tony [1]
Jin, Jiashun [2]
Low, Mark G. [1]
Affiliations
[1] Univ Penn, Wharton Sch, Dept Stat, Philadelphia, PA 19104 USA
[2] Purdue Univ, Dept Stat, W Lafayette, IN 47907 USA
Keywords
confidence lower bound; estimating fraction; higher criticism; minimax estimation; optimally adaptive; sparse normal mixture
DOI
10.1214/009053607000000334
Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714
Abstract
For high dimensional statistical models, researchers have begun to focus on situations which can be described as having relatively few moderately large coefficients. Such situations lead to some very subtle statistical problems. In particular, Ingster and Donoho and Jin have considered a sparse normal means testing problem, in which they described the precise demarcation or detection boundary. Meinshausen and Rice have shown that it is even possible to estimate consistently the fraction of nonzero coordinates on a subset of the detectable region, but leave unanswered the question of exactly in which parts of the detectable region consistent estimation is possible. In the present paper we develop a new approach for estimating the fraction of nonzero means for problems where the nonzero means are moderately large. We show that the detection region described by Ingster and Donoho and Jin turns out to be the region where it is possible to consistently estimate the expected fraction of nonzero coordinates. This theory is developed further and minimax rates of convergence are derived. A procedure is constructed which attains the optimal rate of convergence in this setting. Furthermore, the procedure also provides an honest lower bound for confidence intervals while minimizing the expected length of such an interval. Simulations are used to enable comparison with the work of Meinshausen and Rice, where a procedure is given but where rates of convergence have not been discussed. Extensions to more general Gaussian mixture models are also given.
Pages: 2421-2449
Number of pages: 29
Related papers
15 in total
[1]   Controlling the false discovery rate - a practical and powerful approach to multiple testing [J].
Benjamini, Y ;
Hochberg, Y .
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 1995, 57 (01) :289-300
[2]  
CAI T, 2006, ESTIMATION CONFIDENC
[3]   An adaptation theory for nonparametric confidence intervals [J].
Cai, TT ;
Low, MG .
ANNALS OF STATISTICS, 2004, 32 (05) :1805-1840
[4]   Higher criticism for detecting sparse heterogeneous mixtures [J].
Donoho, D ;
Jin, JS .
ANNALS OF STATISTICS, 2004, 32 (03) :962-994
[5]   One-sided inference about functionals of a density [J].
Donoho, DL .
ANNALS OF STATISTICS, 1988, 16 (04) :1390-1420
[6]   Large-scale simultaneous hypothesis testing: The choice of a null hypothesis [J].
Efron, B .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2004, 99 (465) :96-104
[7]   A stochastic process approach to false discovery control [J].
Genovese, C ;
Wasserman, L .
ANNALS OF STATISTICS, 2004, 32 (03) :1035-1061
[8]  
Ingster Yu.I, 1998, MATH METHODS STAT, V7, P401
[9]  
JIN J, 2004, I MATH STAT MONOGRAP, V45, P255
[10]  
JIN J, 2006, PROPORTION NONZERO N