A UNIVERSAL THEOREM ON LEARNING CURVES

Cited by: 73
Author
AMARI, SI
Keywords
LEARNING CURVE; GENERALIZATION ERROR; ENTROPIC ERROR; INFORMATION GAIN; UNIVERSAL THEOREM
DOI
10.1016/0893-6080(93)90013-M
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
A learning curve shows how fast a learning machine improves its behavior as the number of training examples increases. This paper proves a universal asymptotic behavior of learning curves for general noiseless dichotomy machines, or neural networks. It is proved that, irrespective of the architecture of the machine, the average predictive entropy or information gain e*(t) converges to 0 as e*(t) ≈ d/t as the number t of training examples increases, where d is the number of modifiable parameters of the machine.
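The d/t law can be checked numerically. The sketch below is a minimal illustration, not the paper's analysis: the teacher/student perceptron setup, Gaussian inputs, plain perceptron learning, and the use of test error as a proxy for the predictive entropy are all assumptions added here. For a noiseless dichotomy machine with d modifiable parameters, the measured error should shrink at the same 1/t rate as d/t, up to a constant factor.

# Minimal sketch (assumptions: teacher/student perceptrons, Gaussian inputs,
# plain perceptron learning, test error as a proxy for predictive entropy).
import numpy as np

rng = np.random.default_rng(0)
d = 20  # number of modifiable parameters

def train_perceptron(X, y, epochs=500):
    # Classic perceptron updates; the data is noiseless and realizable,
    # so the loop stops once every training example is classified correctly.
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        mistakes = 0
        for x, yi in zip(X, y):
            if yi * (x @ w) <= 0:  # misclassified -> update
                w += yi * x
                mistakes += 1
        if mistakes == 0:
            break
    return w

def error_rate(w, w_star, n_test=20000):
    # Monte Carlo estimate of the probability of disagreeing with the teacher.
    X = rng.standard_normal((n_test, len(w)))
    return np.mean(np.sign(X @ w) != np.sign(X @ w_star))

w_star = rng.standard_normal(d)  # random noiseless teacher
for t in [50, 100, 200, 400, 800]:
    X = rng.standard_normal((t, d))
    y = np.sign(X @ w_star)
    w = train_perceptron(X, y)
    print(f"t={t:4d}  error={error_rate(w, w_star):.4f}  d/t={d/t:.4f}")

Doubling t should roughly halve the measured error, mirroring the e*(t) ≈ d/t asymptotics; the constant in front depends on the learning rule, which the theorem does not fix for the error proxy used here.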
Pages: 161-166
Number of pages: 6