A survey of cross-validation procedures for model selection

Citations: 2669
Authors
Arlot, Sylvain [1]
Celisse, Alain [2]
Affiliations
[1] CNRS, Willow Project Team, Lab Informat, CNRS ENS INRIA UMR 8548,Ecole Normale Super, 23 Ave Italie, F-75214 Paris 13, France
[2] Univ Lille 1, CNRS, UMR 8524, Lab Math Paul Painleve, F-59655 Villeneuve, France
Keywords
Model selection; cross-validation; leave-one-out
DOI
10.1214/09-SS054
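Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714
Abstract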
Used to estimate the risk of an estimator or to perform model selection, cross-validation is a widespread strategy because of its simplicity and its (apparent) universality. Many results exist on model selection performances of cross-validation procedures. This survey intends to relate these results to the most recent advances of model selection theory, with a particular emphasis on distinguishing empirical statements from rigorous theoretical results. As a conclusion, guidelines are provided for choosing the best cross-validation procedure according to the particular features of the problem in hand.
Pages: 40-79
Number of pages: 40