Learning Deep Architectures for AI

Cited by: 5922
Authors
Bengio, Yoshua [1 ]
Affiliation
[1] Univ Montreal, Dept IRO, CP 6128, Montreal H3C 3J7, PQ, Canada
Source
FOUNDATIONS AND TRENDS IN MACHINE LEARNING | 2009 / Vol. 2 / No. 1
Funding
Natural Sciences and Engineering Research Council of Canada;
DOI
10.1561/2200000006
CLC classification
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Theoretical results suggest that in order to learn the kind of complicated functions that can represent high-level abstractions (e.g., in vision, language, and other AI-level tasks), one may need deep architectures. Deep architectures are composed of multiple levels of non-linear operations, such as in neural nets with many hidden layers or in complicated propositional formulae re-using many sub-formulae. Searching the parameter space of deep architectures is a difficult task, but learning algorithms such as those for Deep Belief Networks have recently been proposed to tackle this problem with notable success, beating the state-of-the-art in certain areas. This monograph discusses the motivations and principles regarding learning algorithms for deep architectures, in particular those exploiting as building blocks unsupervised learning of single-layer models such as Restricted Boltzmann Machines, used to construct deeper models such as Deep Belief Networks.
Pages: 1-127 (127 pages)
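The abstract describes Restricted Boltzmann Machines as the unsupervised single-layer building block used to construct Deep Belief Networks. As a minimal illustration of that building block, the sketch below trains a binary RBM with one step of contrastive divergence (CD-1) in NumPy; the class and method names, learning rate, and toy dataset are my own assumptions, not code from the monograph.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Minimal Restricted Boltzmann Machine with binary units (illustrative sketch)."""
    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = np.random.default_rng(seed)
        self.W = 0.01 * self.rng.standard_normal((n_visible, n_hidden))
        self.b = np.zeros(n_visible)  # visible biases
        self.c = np.zeros(n_hidden)   # hidden biases

    def cd1_step(self, v0, lr=0.1):
        """One contrastive-divergence (CD-1) update on a batch of visible vectors v0."""
        # Positive phase: hidden probabilities given the data, then a binary sample.
        ph0 = sigmoid(v0 @ self.W + self.c)
        h0 = (self.rng.random(ph0.shape) < ph0).astype(float)
        # Negative phase: one step of Gibbs sampling back to the visible layer.
        pv1 = sigmoid(h0 @ self.W.T + self.b)
        ph1 = sigmoid(pv1 @ self.W + self.c)
        n = v0.shape[0]
        # Approximate gradient of the log-likelihood (CD-1).
        self.W += lr * (v0.T @ ph0 - pv1.T @ ph1) / n
        self.b += lr * (v0 - pv1).mean(axis=0)
        self.c += lr * (ph0 - ph1).mean(axis=0)
        # Reconstruction error: a rough (not exact) training progress signal.
        return float(np.mean((v0 - pv1) ** 2))

# Toy usage: learn a trivial two-pattern dataset.
data = np.array([[1, 1, 0, 0], [0, 0, 1, 1]] * 8, dtype=float)
rbm = RBM(n_visible=4, n_hidden=2)
errs = [rbm.cd1_step(data) for _ in range(200)]
```

In the greedy layer-wise scheme the abstract alludes to, a trained RBM's hidden activations would become the "visible" input for the next RBM, stacking single-layer models into a Deep Belief Network.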