Online learning dynamics of multilayer perceptrons with unidentifiable parameters

Cited by: 20
Authors
Park, H
Inoue, M
Okada, M
Affiliations
[1] RIKEN, Brain Sci Inst, Lab Math Neurosci, Wako, Saitama 3510198, Japan
[2] Kyoto Univ, Grad Sch Med, Dept Otolaryngol Head & Neck Surg, Kyoto 6068507, Japan
[3] RIKEN, JST, PRESTO, BSI, Wako, Saitama 3510198, Japan
Source
JOURNAL OF PHYSICS A-MATHEMATICAL AND GENERAL | 2003 / Vol. 36 / No. 47
DOI
10.1088/0305-4470/36/47/004
Chinese Library Classification number
O4 [Physics]
Discipline classification code
0702
Abstract
In the over-realizable learning scenario of multilayer perceptrons, in which the student network has a larger number of hidden units than the true or optimal network, some of the weight parameters are unidentifiable. In this case, the teacher network consists of a union of optimal subspaces included in the parameter space. The optimal subspaces, which lead to singularities, are known to affect the estimation performance of neural networks. Using statistical mechanics, we investigate the online learning dynamics of two-layer neural networks in the over-realizable scenario with unidentifiable parameters. We show that the convergence speed strongly depends on the initial parameter conditions. We also show that there is a quasi-plateau around the optimal subspace, which differs from the well-known plateaus caused by permutation symmetry. In addition, we discuss the property of the final learning state, relating this to the singular structures.
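The over-realizable setting described in the abstract can be illustrated with a small direct simulation. The sketch below is not the statistical-mechanics analysis used in the paper; it only runs plain online gradient descent on a student with more hidden units than the teacher. The erf activation, the fixed hidden-to-output weights of 1, the 1/sqrt(N) field scaling, and all names and parameter values (`run_online_learning`, `eta`, `alpha_max`, and so on) are assumptions chosen for illustration.

```python
import math
import random


def g(h):
    """Sigmoidal activation erf(h / sqrt(2)); an assumed choice, not taken from the abstract."""
    return math.erf(h / math.sqrt(2.0))


def g_prime(h):
    """Derivative of g with respect to the local field h."""
    return math.sqrt(2.0 / math.pi) * math.exp(-0.5 * h * h)


def fields(weights, x):
    """Local fields w_k . x / sqrt(N), one per hidden unit."""
    scale = 1.0 / math.sqrt(len(x))
    return [scale * sum(wi * xi for wi, xi in zip(w, x)) for w in weights]


def net_output(weights, x):
    """Two-layer network whose hidden-to-output weights are fixed at 1 (assumption)."""
    return sum(g(h) for h in fields(weights, x))


def generalization_error(student, teacher, test_inputs):
    """Monte Carlo estimate of 0.5 * E[(student - teacher)^2] on fixed test inputs."""
    total = sum((net_output(student, x) - net_output(teacher, x)) ** 2
                for x in test_inputs)
    return 0.5 * total / len(test_inputs)


def run_online_learning(n_inputs=20, n_teacher=1, n_student=2,
                        eta=0.5, alpha_max=200, seed=0):
    """Train an over-sized student online; return (initial_error, final_error)."""
    rng = random.Random(seed)

    def gauss_vec(std):
        return [rng.gauss(0.0, std) for _ in range(n_inputs)]

    teacher = [gauss_vec(1.0) for _ in range(n_teacher)]
    # The student has more hidden units than the teacher (over-realizable case).
    student = [gauss_vec(0.1) for _ in range(n_student)]
    test_inputs = [gauss_vec(1.0) for _ in range(500)]

    initial = generalization_error(student, teacher, test_inputs)
    step = eta / math.sqrt(n_inputs)
    for _ in range(alpha_max * n_inputs):
        x = gauss_vec(1.0)  # each example is used once (online learning)
        hs = fields(student, x)
        delta = net_output(teacher, x) - sum(g(h) for h in hs)
        for w, h in zip(student, hs):
            coeff = step * delta * g_prime(h)
            for i in range(n_inputs):
                w[i] += coeff * x[i]
    final = generalization_error(student, teacher, test_inputs)
    return initial, final


if __name__ == "__main__":
    e0, e1 = run_online_learning()
    print("initial error:", e0, "final error:", e1)
```

Changing the standard deviation of the initial student weights or the seed is one way to probe the dependence on initial conditions that the abstract reports, although a finite-size simulation like this one only approximates the large-N dynamics.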
Pages: 11753-11764
Number of pages: 12