Towards adaptive learning with improved convergence of deep belief networks on graphics processing units

Cited by: 72
Authors
Lopes, Noel [1]
Ribeiro, Bernardete [1]
Affiliations
[1] Univ Coimbra, Dept Informat Engn, P-3000 Coimbra, Portugal
Keywords
Deep learning; Deep belief networks; Restricted Boltzmann machines; Contrastive divergence; Adaptive step size; GPU computing
DOI
10.1016/j.patcog.2013.06.029
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In this paper we focus on two complementary approaches to significantly decrease the pre-training time of a deep belief network (DBN). First, we propose an adaptive step size technique to enhance the convergence of the contrastive divergence (CD) algorithm, thereby reducing the number of epochs needed to train the restricted Boltzmann machine (RBM) that supports the DBN infrastructure. Second, we present a highly scalable graphics processing unit (GPU) parallel implementation of the CD-k algorithm, which notably boosts training speed. Additionally, extensive experiments are conducted on the MNIST and the HHreco databases. The results suggest that the maximum useful depth of a DBN is related to the number and quality of the training samples. Moreover, it was found that the lower-level layer plays a fundamental role in building successful DBN models. Furthermore, the results contradict the preconceived idea that all the layers should be pre-trained. Finally, it is shown that by incorporating multiple back-propagation (MBP) layers, the DBN's generalization capability is remarkably improved. © 2013 Elsevier Ltd. All rights reserved.
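The abstract names two ingredients, CD-k training of an RBM and an adaptive step size, without giving their formulas. The sketch below illustrates the general idea only and is not the authors' method: it assumes a binary RBM, uses mean-field visible reconstructions, keeps the biases fixed for brevity, and uses a hypothetical sign-based per-weight step rule (`adaptive_update`, with illustrative factors `up` and `down`) that may differ from the rule proposed in the paper. It is plain CPU Python, not the GPU implementation the abstract describes.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sample(probs, rng):
    # Draw binary states from independent Bernoulli probabilities.
    return [1.0 if rng.random() < p else 0.0 for p in probs]


def hidden_probs(v, W, c):
    # P(h_j = 1 | v); W[i][j] links visible unit i to hidden unit j.
    return [sigmoid(c[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(c))]


def visible_probs(h, W, b):
    # P(v_i = 1 | h) for each visible unit i.
    return [sigmoid(b[i] + sum(h[j] * W[i][j] for j in range(len(h))))
            for i in range(len(b))]


def cd_k_gradient(v0, W, b, c, k, rng):
    """CD-k estimate of the weight gradient for one training sample."""
    h0 = hidden_probs(v0, W, c)
    vk, hk = v0, h0
    for _ in range(k):
        # One Gibbs step: sample hidden states, reconstruct visibles.
        vk = visible_probs(sample(hk, rng), W, b)
        hk = hidden_probs(vk, W, c)
    # Positive phase minus negative phase.
    return [[v0[i] * h0[j] - vk[i] * hk[j] for j in range(len(c))]
            for i in range(len(b))]


def adaptive_update(W, grad, steps, prev_grad, up=1.2, down=0.8):
    # Hypothetical rule (an assumption, not taken from the paper):
    # grow a weight's step while its gradient keeps the same sign,
    # shrink it when the sign flips, then apply the update in place.
    for i in range(len(W)):
        for j in range(len(W[i])):
            agreement = grad[i][j] * prev_grad[i][j]
            if agreement > 0:
                steps[i][j] *= up
            elif agreement < 0:
                steps[i][j] *= down
            W[i][j] += steps[i][j] * grad[i][j]


if __name__ == "__main__":
    rng = random.Random(0)
    n_vis, n_hid = 3, 2
    W = [[0.0] * n_hid for _ in range(n_vis)]
    b, c = [0.0] * n_vis, [0.0] * n_hid
    steps = [[0.1] * n_hid for _ in range(n_vis)]
    prev = [[0.0] * n_hid for _ in range(n_vis)]
    for _ in range(10):
        grad = cd_k_gradient([1.0, 0.0, 1.0], W, b, c, 1, rng)
        adaptive_update(W, grad, steps, prev)
        prev = grad
    print(W)
```

Keeping one step size per weight lets each connection speed up or slow down independently, which is the usual motivation for adaptive step sizes; the single-sample loop here would be replaced by mini-batch averages in any practical setting.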
Pages: 114-127
Page count: 14