Model selection via testing: an alternative to (penalized) maximum likelihood estimators

Cited by: 58
Authors
Birgé, L [1]
Affiliations
[1] Univ Paris 06, Probabil Lab, UMR 7599, F-75252 Paris 05, France
Source
ANNALES DE L INSTITUT HENRI POINCARE-PROBABILITES ET STATISTIQUES | 2006 / Vol. 42 / No. 3
Keywords
maximum likelihood; robustness; robust tests; metric dimension; minimax risk; model selection; aggregation of estimators;
DOI
10.1016/j.anihpb.2005.04.004
Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Subject classification code
020208 ; 070103 ; 0714 ;
Abstract
This paper is devoted to the definition and study of a family of model selection oriented estimators that we shall call T-estimators ("T" for tests). Their construction is based on former ideas about deriving estimators from some families of tests due to Le Cam [L.M. Le Cam, Convergence of estimates under dimensionality restrictions, Ann. Statist. 1 (1973) 38-53 and L.M. Le Cam, On local and global properties in the theory of asymptotic normality of experiments, in: M. Puri (Ed.), Stochastic Processes and Related Topics, vol. 1, Academic Press, New York, 1975, pp. 13-54] and Birgé [L. Birgé, Approximation dans les espaces métriques et théorie de l'estimation, Z. Wahrscheinlichkeitstheorie Verw. Gebiete 65 (1983) 181-237, L. Birgé, Sur un théorème de minimax et son application aux tests, Probab. Math. Statist. 3 (1984) 259-282 and L. Birgé, Stabilité et instabilité du risque minimax pour des variables indépendantes équidistribuées, Ann. Inst. H. Poincaré Sect. B 20 (1984) 201-223] and about complexity based model selection from Barron and Cover [A.R. Barron, T.M. Cover, Minimum complexity density estimation, IEEE Trans. Inform. Theory 37 (1991) 1034-1054]. It is well-known that maximum likelihood estimators and, more generally, minimum contrast estimators suffer from various weaknesses, as do their penalized versions. In particular, they are not robust and they require restrictive assumptions on both the models and the underlying parameter set to work correctly. We propose an alternative construction, which derives an estimator from many simultaneous tests between some probability balls in a suitable metric space. In many cases, although not in all, it results in a penalized M-estimator restricted to a suitable countable set of parameters. On the one hand, this construction should be considered as a theoretical rather than a practical tool because of its high computational complexity. On the other hand, it solves many of the previously mentioned difficulties provided that the tests involved in our construction exist, which is the case for various statistical frameworks including density estimation from i.i.d. variables or estimating the mean of a Gaussian sequence with a known variance. For all such frameworks, the robustness properties of our estimators allow us to deal with minimax estimation and model selection in a unified way, since bounding the minimax risk amounts to performing our method with a single, well-chosen model. This results, for those frameworks, in simple bounds for the minimax risk solely based on some metric properties of the parameter space. Moreover, the method applies to various statistical frameworks and can handle essentially all types of models, linear or not, parametric and non-parametric, simultaneously. It also provides a simple way of aggregating preliminary estimators. From these viewpoints, it is much more flexible than traditional methods and allows one to derive some results that do not presently seem to be accessible to them. (c) 2005 Elsevier SAS. All rights reserved.
Pages: 273-325
Page count: 53
Related papers
74 in total
[1]  
[Anonymous], 2001, Journal of the European Mathematical Society, DOI 10.1007/S100970100031
[2]  
[Anonymous], 1966, APPROXIMATION FUNCTI
[3]  
ASSOUAD P, 1983, CR ACAD SCI I-MATH, V296, P1021
[4]  
AUDIBERT JY, 2004, THESIS U PARIS 6 PAR
[5]  
Baraud Y., 2002, ESAIM Probability and Statistics, V6, P127
[6]   Risk bounds for model selection via penalization [J].
Barron, A ;
Birgé, L ;
Massart, P .
PROBABILITY THEORY AND RELATED FIELDS, 1999, 113 (03) :301-413
[7]   MINIMUM COMPLEXITY DENSITY-ESTIMATION [J].
BARRON, AR ;
COVER, TM .
IEEE TRANSACTIONS ON INFORMATION THEORY, 1991, 37 (04) :1034-1054
[8]  
BARRON AR, 1991, NATO ADV SCI I C-MAT, V335, P561
[9]   On the asymptotic normality of the L2-error in partitioning regression estimation [J].
Beirlant, J ;
Gyorfi, L .
JOURNAL OF STATISTICAL PLANNING AND INFERENCE, 1998, 71 (1-2) :93-107
[10]   APPROXIMATION IN METRIC-SPACES AND ESTIMATION THEORY [J].
BIRGE, L .
ZEITSCHRIFT FUR WAHRSCHEINLICHKEITSTHEORIE UND VERWANDTE GEBIETE, 1983, 65 (02) :181-237