How initial conditions affect generalization performance in large networks

Cited by: 40
Authors
Atiya, A [1]
Ji, CY [1]
Affiliation
[1] RENSSELAER POLYTECH INST,DEPT ELECT COMP & SYST ENGN,TROY,NY 12180
Source
IEEE TRANSACTIONS ON NEURAL NETWORKS | 1997 / Vol. 8 / Issue 02
Funding
U.S. National Science Foundation;
Keywords
DOI
10.1109/72.557701
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Generalization is one of the most important problems in neural-network research. It is influenced by several factors in the network design, such as network size, weight decay factor, and others. We show here that the initial weight distribution (for gradient descent training algorithms) is another factor that influences generalization. The initial conditions guide the training algorithm to search particular regions of the weight space. For instance, small initial weights tend to result in low-complexity networks and can therefore effectively act as a regularization factor. We propose a novel network complexity measure, which helps shed light on this phenomenon and is useful for studying other aspects of generalization.
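The claim that initial conditions steer gradient descent toward particular solutions can be illustrated with a much simpler setting than the one the paper studies. The sketch below is an assumption-laden linear analogue, not the authors' experiment or their complexity measure (the abstract does not define it): it runs plain gradient descent on an underdetermined least-squares problem, where many weight vectors fit the data exactly, and uses the Euclidean weight norm as a crude stand-in for complexity. The function name `train_linear`, the problem sizes, and the learning rate are all hypothetical choices made for illustration.

```python
import math
import random


def train_linear(init_scale, steps=5000, lr=0.05, seed=0):
    """Gradient descent on an underdetermined least-squares problem.

    Illustrative linear analogue only (not the paper's networks).
    Returns (training_mse, weight_norm) after training.
    """
    rng = random.Random(seed)
    n_samples, n_features = 3, 10  # fewer samples than weights
    X = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n_samples)]
    y = [rng.gauss(0, 1) for _ in range(n_samples)]
    # Same seed gives the same random direction on every call;
    # only the scale of the initial weights changes.
    w = [init_scale * rng.gauss(0, 1) for _ in range(n_features)]

    def residuals(weights):
        return [sum(wj * xj for wj, xj in zip(weights, row)) - t
                for row, t in zip(X, y)]

    for _ in range(steps):
        r = residuals(w)
        # Gradient of (1 / (2 n)) * sum of squared residuals.
        grad = [sum(ri * row[j] for ri, row in zip(r, X)) / n_samples
                for j in range(n_features)]
        w = [wj - lr * gj for wj, gj in zip(w, grad)]

    mse = sum(ri * ri for ri in residuals(w)) / n_samples
    norm = math.sqrt(sum(wj * wj for wj in w))
    return mse, norm


if __name__ == "__main__":
    for scale in (0.01, 1.0):
        mse, norm = train_linear(scale)
        print(f"init scale {scale}: training MSE {mse:.2e}, weight norm {norm:.3f}")
```

In this toy problem both runs drive the training error to essentially zero, yet the run started from small weights ends at a lower-norm solution, because gradient descent never changes the component of the weights that is orthogonal to the training inputs. Whether weight norm tracks the complexity measure proposed in the paper is not something the abstract settles.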
Pages: 448-451
Page count: 4