Bounding the Bias of Contrastive Divergence Learning

Cited by: 38
Authors
Fischer, Asja [1]
Igel, Christian [2]
Affiliations
[1] Ruhr Univ Bochum, Inst Neuroinformat, D-44780 Bochum, Germany
[2] Univ Copenhagen, Dept Comp Sci, DK-2100 Copenhagen Ø, Denmark
DOI
10.1162/NECO_a_00085
Chinese Library Classification
TP18 [Theory of Artificial Intelligence]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Optimization based on k-step contrastive divergence (CD) has become a common way to train restricted Boltzmann machines (RBMs). The k-step CD update is a biased estimator of the log-likelihood gradient that relies on Gibbs sampling. We derive a new upper bound on this bias. Its magnitude depends on k, on the number of variables in the RBM, and on the maximum change in energy that can be produced by changing a single variable. The last of these quantities reflects the dependence on the absolute values of the RBM parameters. The magnitude of the bias is also affected by the total variation distance between the modeled distribution and the starting distribution of the Gibbs chain.
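As a concrete illustration of the estimator discussed in the abstract, the following is a minimal NumPy sketch of CD-k for a binary RBM. The names here (`W` for the weight matrix, `b` and `c` for the visible and hidden biases, the function `cd_k`) are illustrative assumptions, not notation from the paper; the sketch only shows where the bias enters: the intractable model expectation in the log-likelihood gradient is replaced by a sample obtained after k steps of Gibbs sampling started at a data point.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd_k(v0, W, b, c, k=1):
    """CD-k gradient estimate for a binary RBM with energy
    E(v, h) = -b^T v - c^T h - v^T W h (illustrative notation)."""
    # Positive phase: hidden activation probabilities given the data vector v0.
    ph0 = sigmoid(c + v0 @ W)
    # Negative phase: k steps of block Gibbs sampling started at the data point.
    v = v0
    for _ in range(k):
        h = (rng.random(c.shape) < sigmoid(c + v @ W)).astype(float)
        v = (rng.random(b.shape) < sigmoid(b + W @ h)).astype(float)
    phk = sigmoid(c + v @ W)
    # CD-k replaces the intractable model expectation in the log-likelihood
    # gradient with this k-step sample; the mismatch is the bias the paper bounds.
    dW = np.outer(v0, ph0) - np.outer(v, phk)
    db = v0 - v
    dc = ph0 - phk
    return dW, db, dc

# Toy usage: 6 visible and 3 hidden units, random parameters, one data vector.
n_v, n_h = 6, 3
W = 0.1 * rng.standard_normal((n_v, n_h))
b, c = np.zeros(n_v), np.zeros(n_h)
v0 = (rng.random(n_v) < 0.5).astype(float)
dW, db, dc = cd_k(v0, W, b, c, k=5)
```

Consistent with the abstract, the gap between this estimate and the true log-likelihood gradient is governed by k, by the size of the RBM, and by how far the distribution the chain starts from is from the model distribution; as k grows, the Gibbs chain approaches the model distribution and the bias shrinks.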
Pages: 664-673
Number of pages: 10