Model selection for small sample regression

Cited by: 89

Authors:

Chapelle, O

Vapnik, V

Bengio, Y

Affiliations:

[1] LIP6, F-75015 Paris, France

[2] AT&T Labs Res, Middletown, NJ 07748 USA

[3] Univ Montreal, Dept IRO, Montreal, PQ H3C 3J7, Canada

Source:

MACHINE LEARNING | 2002 / Vol. 48 / Issue 1-3

Keywords:

model selection; parametric regression; uniform convergence bounds;

DOI:

10.1023/A:1013943418833

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Model selection is an important ingredient of many machine learning algorithms, in particular when the sample size is small, in order to strike the right trade-off between overfitting and underfitting. Previous classical results for linear regression are based on an asymptotic analysis. We present a new penalization method for performing model selection for regression that is appropriate even for small samples. Our penalization is based on an accurate estimator of the ratio of the expected training error and the expected generalization error, in terms of the expected eigenvalues of the input covariance matrix.
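For orientation, penalized model selection of the kind the abstract contrasts itself with can be sketched as follows. This is not the authors' estimator, which the abstract does not state in closed form. It uses Akaike's classical Final Prediction Error factor, (1 + d/n) / (1 - d/n), one of the asymptotic criteria, as a stand-in. The function names `fpe_penalty` and `select_degree`, the polynomial model family, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def fpe_penalty(d, n):
    """Classical FPE factor (1 + d/n) / (1 - d/n) for d parameters, n samples."""
    if d >= n:
        # The factor is undefined once the model has as many parameters as samples.
        return float("inf")
    return (1.0 + d / n) / (1.0 - d / n)


def select_degree(x, y, max_degree):
    """Pick the polynomial degree minimizing training MSE times the FPE factor."""
    n = len(x)
    scores = {}
    for degree in range(max_degree + 1):
        d = degree + 1  # number of free coefficients
        X = np.vander(x, d, increasing=True)  # columns 1, x, ..., x**degree
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        train_mse = float(np.mean((X @ coef - y) ** 2))
        scores[degree] = train_mse * fpe_penalty(d, n)
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical small-sample problem: 20 noisy points from a quadratic.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 20)
y = 1.0 - 2.0 * x + 0.5 * x**2 + rng.normal(scale=0.1, size=x.size)
best_degree, scores = select_degree(x, y, max_degree=8)
print("selected degree:", best_degree)
```

The training error is multiplied by a factor that grows with the ratio d/n, so richer models must reduce the training error enough to pay for their extra parameters. According to the abstract, the paper replaces such an asymptotic factor with an estimate of the ratio of expected training error to expected generalization error intended to remain accurate for small n.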

References

Pages: 9-23

Number of pages: 15

15 entries in total

[1] AKAIKE H, 1970, ANN I STAT MATH, V22, P202

[2] AKAIKE H, 1973, INT S INFORMATION TH, V2, P267

[3] [Anonymous], 1982, ESTIMATION DEPENDENC

[4] The minimum description length principle in coding and modeling [J].