A BIAS CORRECTION FOR THE MINIMUM ERROR RATE IN CROSS-VALIDATION

Cited by: 69
Authors
Tibshirani, Ryan J. [1]
Tibshirani, Robert [1]
Affiliations
[1] Stanford Univ, Dept Stat, Stanford, CA 94305 USA
Funding
National Institutes of Health (USA); National Science Foundation (USA);
Keywords
Cross-validation; prediction error estimation; optimism estimation;
DOI
10.1214/08-AOAS224
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification codes
020208 ; 070103 ; 0714 ;
Abstract
Tuning parameters in supervised learning problems are often estimated by cross-validation. The minimum value of the cross-validation error can be biased downward as an estimate of the test error at that same value of the tuning parameter. We propose a simple method for the estimation of this bias that uses information from the cross-validation process. As a result, it requires essentially no additional computation. We apply our bias estimate to a number of popular classifiers in various settings, and examine its performance.
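The abstract does not spell out the estimator, so the following is only an illustrative sketch of one estimate of this kind, not a statement of the published method. It assumes equal-sized folds and a finite grid of tuning parameter values, and it reuses the per-fold error curves that cross-validation already produces: for each fold, it compares the error at the overall minimizer with that fold's own minimum error, then averages the gaps. The function name `min_cv_error_with_bias` and the `(K, M)` array layout are hypothetical.

```python
import numpy as np

def min_cv_error_with_bias(fold_errors):
    """Return (best_index, min_cv_error, bias_estimate).

    fold_errors has shape (K, M): entry [k, j] is the error on held-out
    fold k at the j-th tuning parameter value (assumed layout).
    """
    fold_errors = np.asarray(fold_errors, dtype=float)
    # CV error curve: average over folds (assumes equal-sized folds).
    cv_curve = fold_errors.mean(axis=0)
    # Index of the tuning parameter value minimizing the overall CV curve.
    best = int(np.argmin(cv_curve))
    # Each fold's own minimum error over the grid.
    per_fold_min = fold_errors.min(axis=1)
    # Average gap between the error at the overall minimizer and the
    # fold-wise minimum; nonnegative by construction.
    bias = float(np.mean(fold_errors[:, best] - per_fold_min))
    return best, float(cv_curve[best]), bias

# Toy example: 2 folds, 3 candidate tuning parameter values.
errors = [[0.2, 0.1, 0.3],
          [0.1, 0.2, 0.3]]
best, cv_min, bias = min_cv_error_with_bias(errors)
corrected = cv_min + bias  # bias-adjusted estimate of the test error
```

Because the sketch only indexes and averages errors that were already computed, it adds no model fitting beyond the cross-validation run itself, which is in line with the abstract's remark about computation.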
Pages: 822-829
Number of pages: 8