UNSUPERVISED MUTUAL INFORMATION CRITERION FOR ELIMINATION OF OVERTRAINING IN SUPERVISED MULTILAYER NETWORKS

Cited: 54
Authors
DECO, G
FINNOFF, W
ZIMMERMANN, HG
Institution
Keywords
DOI
10.1162/neco.1995.7.1.86
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Controlling the network complexity in order to prevent overfitting is one of the major problems encountered when using neural network models to extract structure from small data sets. In this paper we present a network architecture designed for use with a cost function that includes a novel complexity penalty term. In this architecture the outputs of the hidden units are strictly positive and sum to one, and they are interpreted as the probabilities that the actual input belongs to certain classes formed during learning. The penalty term expresses the mutual information between the inputs and the extracted classes. This measure effectively describes the network complexity with respect to the given data in an unsupervised fashion. The efficiency of this architecture/penalty term, when combined with backpropagation training, is demonstrated on a real-world economic time series forecasting problem. The model was also applied to the benchmark sunspot data and to a synthetic data set from the statistics community.
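The penalty described above can be illustrated with a short sketch. For hidden units that are strictly positive and sum to one (a softmax layer), the mutual information between the inputs and the extracted classes can be estimated empirically as I(X; C) = H(C) - H(C|X), where the class marginal is the average of the per-input class probabilities. This is a minimal sketch under that standard decomposition, not the authors' exact implementation; the function and variable names are illustrative.

```python
import numpy as np

def softmax(z):
    # Row-wise softmax: strictly positive outputs that sum to one,
    # matching the hidden-unit constraint described in the abstract.
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def mutual_information_penalty(probs, eps=1e-12):
    # probs: (n_samples, n_classes); row i is p(class | input_i).
    # I(X; C) = H(C) - H(C|X), estimated over the sample.
    p_class = probs.mean(axis=0)                                      # marginal p(class)
    h_class = -np.sum(p_class * np.log(p_class + eps))                # H(C)
    h_cond = -np.mean(np.sum(probs * np.log(probs + eps), axis=1))    # H(C|X)
    return h_class - h_cond                                           # I(X; C) >= 0

# Example: penalty for random hidden activations on 100 inputs, 5 classes.
rng = np.random.default_rng(0)
acts = softmax(rng.normal(size=(100, 5)))
mi = mutual_information_penalty(acts)
```

During training, `mi` would be added (with a weighting coefficient) to the supervised cost, so that backpropagation trades prediction error against the information the class layer extracts from the inputs. Note that when every input maps to the same class distribution, the penalty is zero, and it is maximal (H(C)) when each input is assigned to a single class with certainty.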
Pages: 86-107
Page count: 22