On the problem in model selection of neural network regression in overrealizable scenario

Cited by: 31
Authors: Hagiwara, K. [1]
Affiliation: [1] Mie Univ, Fac Phys Engn, Tsu, Mie 5148507, Japan
DOI: 10.1162/089976602760128090
Chinese Library Classification: TP18 [Artificial Intelligence Theory]
Discipline classification codes: 081104; 0812; 0835; 1405
Abstract
In statistical model selection for neural networks and radial basis functions in the overrealizable case, the problem of unidentifiability emerges. Because the model selection criterion is an unbiased estimator of the generalization error based on the training error, this article analyzes the expected training error and the expected generalization error of neural networks and radial basis functions in overrealizable cases and clarifies the difference from regular models, for which identifiability holds. As a special case of an overrealizable scenario, we assumed a Gaussian noise sequence as training data. For least-squares estimation under this assumption, we first formulated the problem so that the calculation of the expected errors of unidentifiable networks reduces to calculating the expectation of the supremum of a χ² process. Under this formulation, we gave an upper bound on the expected training error and a lower bound on the expected generalization error, where generalization is measured at the set of training inputs. Furthermore, we gave stochastic bounds on the training error and the generalization error. The obtained upper bound on the expected training error is smaller than in regular models, and the lower bound on the expected generalization error is larger than in regular models. These results show that the degree of overfitting in neural networks and radial basis functions is higher than in regular models and, correspondingly, that their generalization capability is worse. The article suffices to show a difference between neural networks and regular models in the context of least-squares estimation in a simple situation. This is a first step toward constructing a model selection criterion for the overrealizable case; further important problems in this direction are also discussed.
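As a rough numerical illustration of the reduction described in the abstract (not the paper's own experiment), the following Python sketch fits a single Gaussian RBF unit to pure Gaussian noise by least squares. For fixed centre and width, the residual sum of squares is the squared norm of the data minus its squared projection onto the normalized basis vector, so minimizing over the unidentifiable centre and width amounts to subtracting the supremum of a χ² process. The sample size, noise level, grid of centres and widths, and trial count are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, sigma, trials = 50, 1.0, 200
x = np.linspace(0.0, 1.0, n)

# Grid over the unidentifiable parameters (centre b, width c) of one RBF unit.
centres = np.linspace(0.0, 1.0, 40)
widths = np.linspace(0.02, 0.5, 20)

# Precompute normalized basis vectors phi(x; b, c) / ||phi||.
basis = []
for b in centres:
    for c in widths:
        phi = np.exp(-(x - b) ** 2 / (2.0 * c ** 2))
        basis.append(phi / np.linalg.norm(phi))
basis = np.asarray(basis)                  # shape (num_candidates, n)

train_sse = []
for _ in range(trials):
    y = rng.normal(0.0, sigma, n)          # pure Gaussian noise: overrealizable case
    proj = basis @ y                       # <y, phi>/||phi|| for every (b, c)
    explained = np.max(proj ** 2)          # supremum of the chi-square process
    train_sse.append(y @ y - explained)    # residual sum of squares of the best-fit unit

k = 3  # nominal parameter count of one RBF unit (amplitude, centre, width)
print("overrealizable RBF, mean training error:", np.mean(train_sse) / n)
print("regular-model value sigma^2 (n - k) / n :", sigma ** 2 * (n - k) / n)
```

In runs of this sketch, the average training error of the overrealizable fit falls noticeably below the regular-model value σ²(n − k)/n, which is consistent with the abstract's claim that overfitting is stronger than in regular models with the same nominal number of parameters.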
Pages: 1979-2002 (24 pages)