A THEORY OF CROSS-VALIDATION ERROR

Cited by: 15
Author
TURNEY, P
Affiliation
[1] Knowledge Systems Laboratory, Institute for Information Technology, National Research Council Canada, Ottawa, ON
Keywords
CROSS-VALIDATION; SIMPLICITY; BIAS; VARIANCE; AIC; LINEAR REGRESSION; INSTANCE-BASED LEARNING
DOI
10.1080/09528139408953794
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents a theory of error in cross-validation testing of algorithms for predicting real-valued attributes. The theory justifies the claim that predicting real-valued attributes requires balancing the conflicting demands of simplicity and accuracy. Furthermore, the theory indicates precisely how these conflicting demands must be balanced in order to minimize cross-validation error. A general theory is presented, then it is developed in detail for linear regression and instance-based learning.
Pages: 361-391
Number of pages: 31