LOCAL AND GLOBAL CONVERGENCE OF ONLINE LEARNING

Cited by: 18
Authors
BARKAI, N
SEUNG, HS
SOMPOLINSKY, H
Affiliations
[1] HEBREW UNIV JERUSALEM, CTR NEURAL COMPUTAT, IL-91904 JERUSALEM, ISRAEL
[2] AT&T BELL LABS, MURRAY HILL, NJ 07974 USA
Keywords
DOI
10.1103/PhysRevLett.75.1415
CLC number
O4 [Physics];
Subject classification code
0702 ;
Abstract
We study the performance of a generalized perceptron algorithm for learning realizable dichotomies, with an error-dependent adaptive learning rate. The asymptotic scaling form of the solution to the associated Markov equations is derived, assuming certain smoothness conditions. We show that the system converges to the optimal solution and the generalization error asymptotically obeys a universal inverse power law in the number of examples. The system is capable of escaping from local minima and adapts rapidly to shifts in the target function. The general theory is illustrated for the perceptron and committee machine.
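The abstract describes online perceptron learning of a realizable dichotomy with an error-dependent adaptive learning rate. The following is a minimal illustrative sketch of that idea, not the authors' exact algorithm: a student perceptron learns a random teacher, and the learning rate is tied to a running estimate of the error rate, so it anneals automatically as performance improves. All parameter values (dimension, horizon, averaging constant) are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 50                                   # input dimension (illustrative choice)
teacher = rng.standard_normal(N)
teacher /= np.linalg.norm(teacher)       # fixed target rule: y = sign(teacher . x)
w = rng.standard_normal(N)               # student weights

err_est = 0.5                            # running estimate of the error rate
for t in range(20000):
    x = rng.standard_normal(N)
    y = np.sign(teacher @ x)             # realizable label
    mistake = float(np.sign(w @ x) != y)
    # exponential moving average of the observed error rate
    err_est = 0.999 * err_est + 0.001 * mistake
    # error-dependent learning rate: shrinks as the estimated error falls
    eta = err_est
    if mistake:
        w += eta * y * x                 # perceptron update on mistakes only

# generalization error of a perceptron = (angle between w and teacher) / pi
overlap = (w @ teacher) / np.linalg.norm(w)
gen_error = np.arccos(np.clip(overlap, -1.0, 1.0)) / np.pi
```

Because the rate tracks the error estimate rather than a fixed schedule, large rates persist while the student is far from the target (helping it move past poor configurations) and decay once errors become rare, which is the qualitative behavior the abstract attributes to the error-dependent rate.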
Pages: 1415-1418
Page count: 4
Related papers
13 records in total
[1] AMARI S, FUJITA N, SHINOMOTO S. Four types of learning curves [J]. Neural Computation, 1992, 4(4): 605-618.
[2] AMARI S. A theory of adaptive pattern classifiers [J]. IEEE Transactions on Electronic Computers, 1967, EC-16(3): 299-+.
[3] [Anonymous], 1978, STOCHASTIC APPROXIMA.
[4] BARKAI N, in press, Adv Neural, V7.
[5] BIEHL M, in press.
[6] HANSEN LK, PATHRIA R, SALAMON P. Stochastic dynamics of supervised learning [J]. Journal of Physics A: Mathematical and General, 1993, 26(1): 63-71.
[7] HESKES TM, SLIJPEN ETP, KAPPEN B. Learning in neural networks with local minima [J]. Physical Review A, 1992, 46(8): 5221-5231.
[8] HESKES TM, KAPPEN B. Learning processes in neural networks [J]. Physical Review A, 1991, 44(4): 2718-2726.
[9] KANG KJ, OH JH, KWON C, PARK Y. Generalization in a two-layer neural network [J]. Physical Review E, 1993, 48(6): 4805-4809.