NONLINEAR HIGHER-ORDER STATISTICAL DECORRELATION BY VOLUME-CONSERVING NEURAL ARCHITECTURES

Cited: 58
Authors
Deco, G.
Brauer, W.
Affiliation
[1] Siemens AG, Corporate Research and Development, ZFE ST SN 41, Otto-Hahn-Ring 6, 81739 Munich, Germany
Keywords
nonlinear decorrelation; volume-conserving architectures; factorial learning
DOI
10.1016/0893-6080(94)00108-X
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
A neural network learning paradigm based on information theory is proposed as a way to perform, in an unsupervised fashion, redundancy reduction among the elements of the output layer without loss of information from the sensory input. The model performs nonlinear decorrelation up to higher orders of the cumulant tensors and yields probabilistically independent components at the output layer, so no Gaussianity assumption is required for either the input or the output distributions. The theory presented is related to the unsupervised learning theory of Barlow, which proposes redundancy reduction as the goal of cognition. When nonlinear units are used (sigmoid or higher-order pi-neurons), nonlinear principal component analysis is obtained; in this case, nonlinear manifolds can be reduced to manifolds of minimum dimension. If linear units are used, the network performs a generalized principal component analysis, in the sense that non-Gaussian distributions can be linearly decorrelated with higher orders of the correlation tensors also taken into account. The basic structure of the architecture is a general transformation that is volume conserving and therefore entropy preserving, yielding a map without loss of information. Minimizing the mutual information among the output neurons eliminates the redundancy between the outputs and results in statistical decorrelation of the extracted features; this is known as factorial learning. In summary, this paper presents a model of factorial learning for general nonlinear transformations of an arbitrary non-Gaussian (or Gaussian) environment with statistically nonlinearly correlated inputs. Simulations demonstrate the effectiveness of the method.
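The volume-conserving architecture the abstract describes can be sketched compactly. The following is a minimal NumPy sketch, not the authors' implementation: the polynomial feature map, the particular cumulant-based cost, the toy data, and the finite-difference training loop are all illustrative assumptions. It builds a triangular map y_i = x_i + g_i(x_0, ..., x_{i-1}), whose Jacobian determinant is identically 1, and reduces redundancy by driving measured second- and third-order cross-cumulants of the outputs toward zero.

```python
import numpy as np

rng = np.random.default_rng(0)

def features(u):
    """Higher-order ('pi-neuron'-style) features of the preceding inputs."""
    return np.hstack([u, u ** 2, u ** 3])

def forward(x, W):
    """Triangular volume-conserving map: y_0 = x_0 and, for i > 0,
    y_i = x_i + g_i(x_0, ..., x_{i-1}).  The Jacobian is lower triangular
    with unit diagonal, so det J = 1: volume (and hence entropy) is
    conserved and no information about the input is lost."""
    y = x.copy()
    for i in range(1, x.shape[1]):
        y[:, i] = x[:, i] + features(x[:, :i]) @ W[i]
    return y

def redundancy_cost(y):
    """Sum of squared second- and third-order cross-cumulants between
    output pairs; it vanishes iff these measured cumulants factorize."""
    z = y - y.mean(axis=0)
    c2 = (z.T @ z) / len(z)                     # covariance matrix
    cost = np.sum(np.triu(c2, k=1) ** 2)        # off-diagonal 2nd order
    for i in range(z.shape[1]):
        for j in range(i + 1, z.shape[1]):
            cost += np.mean(z[:, i] ** 2 * z[:, j]) ** 2  # cum(i, i, j)
            cost += np.mean(z[:, i] * z[:, j] ** 2) ** 2  # cum(i, j, j)
    return cost

# Toy data: two non-Gaussian inputs with a purely nonlinear dependency.
s = rng.uniform(-1.0, 1.0, size=(2000, 1))
x = np.hstack([s, s ** 2 + 0.1 * rng.uniform(-1.0, 1.0, size=(2000, 1))])

# Crude finite-difference gradient descent on the free triangular weights.
W = {i: np.zeros(3 * i) for i in range(1, x.shape[1])}
eps, lr = 1e-4, 2.0
print("initial cost:", redundancy_cost(forward(x, W)))
for step in range(500):
    for i in W:
        grad = np.zeros_like(W[i])
        for k in range(len(W[i])):
            W[i][k] += eps
            c_plus = redundancy_cost(forward(x, W))
            W[i][k] -= 2 * eps
            c_minus = redundancy_cost(forward(x, W))
            W[i][k] += eps
            grad[k] = (c_plus - c_minus) / (2 * eps)
        W[i] -= lr * grad
print("final cost:  ", redundancy_cost(forward(x, W)))
```

Because each y_i adds only a function of the preceding coordinates, the map is invertible coordinate by coordinate, which is what guarantees that the decorrelation is achieved without discarding information about the input.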
Pages: 525-535
Page count: 11
References
22 in total
[1] Abraham, R. (1978). Foundations of Mechanics.
[2] [Anonymous] (1991). COMMUNICATIONS SIGNA
[3] Atick, J. J., & Redlich, A. N. (1992). What does the retina know about natural scenes? Neural Computation, 4(2), 196-210.
[4] Atick, J. J., & Redlich, A. N. (1990). Towards a theory of early visual processing. Neural Computation, 2(3), 308-320.
[5] Barlow, H. (1959). In The Mechanisation of Thought Processes (National Physical Laboratory Symposium No. 10).
[6] Barlow, H. B., Kaushal, T. P., & Mitchison, G. J. (1989). Finding minimum entropy codes. Neural Computation, 1(3), 412-423.
[7] Barlow, H. B. (1989). Unsupervised learning. Neural Computation, 1(3), 295-311.
[8] Barlow, H. B. (1989). In The Computing Neuron.
[9] Bodewig, E. (1956). Matrix Calculus.
[10] Deco, G. (1993). Biological Cybernetics (unpublished).