STATISTICAL-ENSEMBLE THEORY OF REDUNDANCY REDUCTION AND THE DUALITY BETWEEN UNSUPERVISED AND SUPERVISED NEURAL LEARNING

Cited by: 8
Authors
DECO, G
SCHURMANN, B
Institution
[1] Siemens AG, Corporate Research and Development, ZFE T SN 4, 81739 Munich
Source
PHYSICAL REVIEW E | 1995 / Vol. 52 / No. 06
DOI
10.1103/PhysRevE.52.6580
CLC Classification
O35 [Fluid Mechanics]; O53 [Plasma Physics]
Subject Classification Codes
070204; 080103; 080704
Abstract
The aim of this paper is twofold. First, we derive a statistical-mechanics-based model of unsupervised learning defined by redundancy reduction between the output components of neural nets and entropy conservation from inputs to outputs. We obtain an approximate expression for the probability distribution of the output components for a new data point; it is essentially determined by the probability distribution given by the best network in the ensemble and by the square root of the ratio between the determinants of the Fisher information without and with the new point. Second, we pose the problem of supervised learning as an unsupervised one. The ensemble theory derived for unsupervised learning then yields one for supervised learning via the maximum-likelihood principle. An upper bound is derived for the prediction probability of a new point not included in the training data. This bound is essentially given by the ratio between the determinants of the Fisher information computed for the training set without and with the new point, and it can serve as a mechanism for deciding actively on the novelty of new data (query learning). An illustrative example is given for the case where the training error has a Gaussian distribution.
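To make the abstract's query-learning criterion concrete, here is a minimal sketch (not from the paper itself) of the Fisher-determinant-ratio novelty score, specialized, as an assumption, to the Gaussian-error case via a linear model y = w.x + noise with noise variance sigma^2, for which the Fisher information over the weights is F = X^T X / sigma^2. All function names below are hypothetical.

import numpy as np

def fisher_matrix(X, sigma2=1.0):
    # Fisher information of the weights for Gaussian-error linear regression.
    return X.T @ X / sigma2

def novelty_ratio(X_train, x_new, sigma2=1.0):
    # det F(training set) / det F(training set plus the new point).
    # Values near 1: the point adds little Fisher information (not novel).
    # Small values: the point is informative, hence a good active query.
    F_old = fisher_matrix(X_train, sigma2)
    F_new = F_old + np.outer(x_new, x_new) / sigma2
    return np.linalg.det(F_old) / np.linalg.det(F_new)

# Usage: rank candidate inputs by the ratio and query the most novel one.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(50, 3))            # 50 training inputs in R^3
candidates = 3.0 * rng.normal(size=(10, 3))   # candidate query points
scores = [novelty_ratio(X_train, x) for x in candidates]
print("most novel candidate:", int(np.argmin(scores)))

Since the update to F is a rank-one positive-semidefinite term, the ratio never exceeds 1; minimizing it over candidates is equivalent to maximizing the determinant of the updated Fisher information, i.e., D-optimal experimental design, and the paper's bound makes the same quantity govern the prediction probability of the new point.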
Pages: 6580-6587
Page count: 8