Training restricted Boltzmann machines: An introduction

Cited by: 318
Authors
Fischer, Asja [1, 2]
Igel, Christian [2 ]
Affiliations
[1] Ruhr Univ Bochum, Inst Neuroinformat, D-44780 Bochum, Germany
[2] Univ Copenhagen, Dept Comp Sci, DK-2100 Copenhagen, Denmark
Keywords
Restricted Boltzmann machines; Markov random fields; Markov chains; Gibbs sampling; Neural networks; Contrastive divergence learning; Parallel tempering; Learning algorithm
DOI
10.1016/j.patcog.2013.05.025
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Restricted Boltzmann machines (RBMs) are probabilistic graphical models that can be interpreted as stochastic neural networks. They have attracted much attention as building blocks for the multi-layer learning systems called deep belief networks, and variants and extensions of RBMs have found application in a wide range of pattern recognition tasks. This tutorial introduces RBMs from the viewpoint of Markov random fields, starting with the required concepts of undirected graphical models. Different learning algorithms for RBMs, including contrastive divergence learning and parallel tempering, are discussed. As sampling from RBMs, and therefore also most of their learning algorithms, are based on Markov chain Monte Carlo (MCMC) methods, an introduction to Markov chains and MCMC techniques is provided. Experiments demonstrate relevant aspects of RBM training. (C) 2013 Elsevier Ltd. All rights reserved.
Pages: 25-39
Page count: 15