Variational learning in nonlinear Gaussian belief networks

Cited by: 48
Authors
Frey, BJ [1]
Hinton, GE [2]
Affiliations
[1] Univ Illinois, Beckman Inst, Urbana, IL 61801 USA
[2] UCL, Gatsby Computat Neurosci Unit, London WC1N 3AR, England
DOI
10.1162/089976699300016872
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We view perceptual tasks such as vision and speech recognition as inference problems where the goal is to estimate the posterior distribution over latent variables (e.g., depth in stereo vision) given the sensory input. The recent flurry of research in independent component analysis exemplifies the importance of inferring the continuous-valued latent variables of input data. The latent variables found by this method are linearly related to the input, but perception requires nonlinear inferences such as classification and depth estimation. In this article, we present a unifying framework for stochastic neural networks with nonlinear latent variables. Nonlinear units are obtained by passing the outputs of linear Gaussian units through various nonlinearities. We present a general variational method that maximizes a lower bound on the likelihood of a training set and give results on two visual feature extraction problems. We also show how the variational method can be used for pattern classification and compare the performance of these nonlinear networks with other methods on the problem of handwritten digit recognition.
Pages: 193-213
Page count: 21
Related papers
27 in total
[1]  
AMARI S, 1996, ADV NEURAL INFORMATI
[2]  
AMARI SI, 1985, DIFFERNTIAL GEOMETRI
[3]  
[Anonymous], GRAPHICAL MODELS MAC
[4]   SELF-ORGANIZING NEURAL NETWORK THAT DISCOVERS SURFACES IN RANDOM-DOT STEREOGRAMS [J].
BECKER, S ;
HINTON, GE .
NATURE, 1992, 355 (6356) :161-163
[5]   AN INFORMATION MAXIMIZATION APPROACH TO BLIND SEPARATION AND BLIND DECONVOLUTION [J].
BELL, AJ ;
SEJNOWSKI, TJ .
NEURAL COMPUTATION, 1995, 7 (06) :1129-1159
[6]   BLIND SEPARATION OF SOURCES .2. PROBLEMS STATEMENT [J].
COMON, P ;
JUTTEN, C ;
HERAULT, J .
SIGNAL PROCESSING, 1991, 24 (01) :11-20
[7]   THE HELMHOLTZ MACHINE [J].
DAYAN, P ;
HINTON, GE ;
NEAL, RM ;
ZEMEL, RS .
NEURAL COMPUTATION, 1995, 7 (05) :889-904
[8]   COMPETITION AND MULTIPLE CAUSE MODELS [J].
DAYAN, P ;
ZEMEL, RS .
NEURAL COMPUTATION, 1995, 7 (03) :565-579
[9]  
Everitt BS., 1984, INTRO LATENT VARIABL
[10]  
Frey B. J., 1997, 6 INT WORKSH ART INT