Model selection by bootstrap penalization for classification

被引:3
作者
Magalie Fromont
机构
[1] Université Rennes II,Laboratoire de Statistique, U.F.R. de Sciences Sociales–Département MASS
来源
Machine Learning | 2007年 / 66卷
关键词
Model selection; Classification; Bootstrap penalty; Exponential inequality; Oracle inequality; Minimax risk;
D O I
暂无
中图分类号
学科分类号
摘要
We consider the binary classification problem. Given an i.i.d. sample drawn from the distribution of an χ×{0,1}−valued random pair, we propose to estimate the so-called Bayes classifier by minimizing the sum of the empirical classification error and a penalty term based on Efron’s or i.i.d. weighted bootstrap samples of the data. We obtain exponential inequalities for such bootstrap type penalties, which allow us to derive non-asymptotic properties for the corresponding estimators. In particular, we prove that these estimators achieve the global minimax risk over sets of functions built from Vapnik-Chervonenkis classes. The obtained results generalize Koltchinskii (2001) and Bartlett et al.’s (2002) ones for Rademacher penalties that can thus be seen as special examples of bootstrap type penalties. To illustrate this, we carry out an experimental study in which we compare the different methods for an intervals model selection problem.
引用
收藏
页码:165 / 207
页数:42
相关论文
共 43 条
[1]  
Azuma K.(1967)Weighted sums of certain dependent random variables Tohoku Math. J. 19 357-367
[2]  
Barron A.(1991)Complexity regularization with application to artificial neural networks Nonparametric functional estimation and related topics (NATO ASI Ser.), C 335 561-576
[3]  
Barron A.(1991)Minimum complexity density estimation IEEE Trans. Inf. Theory 37 1034-1054
[4]  
Cover T.(2002)Model selection and error estimation Mach. Learn. 48 85-113
[5]  
Bartlett P.(2005)Localized Rademacher complexities Ann. Stat. 33 1497-1537
[6]  
Boucheron S.(1998)Minimum contrast estimators on sieves: Exponential bounds and rates of convergence Bernoulli 4 329-375
[7]  
Lugosi G.(2000)A sharp concentration inequality with applications Random Struct. Algorithms 16 277-292
[8]  
Bartlett P.(1996)Learning by canonical smooth estimation. I: Simultaneous estimation, II: Learning and choice of model complexity IEEE Trans. Autom. Control 41 545-556
[9]  
Bousquet O.(1982)Bounds for the uniform deviation of empirical measures J. Multivariate Anal. 12 72-79
[10]  
Mendelson S.(1995)Lower bounds in pattern recognition and learning Pattern Recognition 28 1011-1018