Second-order learning algorithm with squared penalty term

Cited by: 39
Authors
Saito, K [1 ]
Nakano, R [1 ]
Affiliations
[1] Nippon Telegraph & Tel Corp, Commun Sci Labs, Kyoto 6190237, Japan
Keywords
DOI
10.1162/089976600300015763
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
This article compares three penalty terms with respect to the efficiency of supervised learning, using first- and second-order off-line learning algorithms and a first-order on-line algorithm. Our experiments showed that, given a reasonably adequate penalty factor, the combination of the squared penalty term and the second-order learning algorithm drastically improves convergence compared with the other combinations, while also yielding excellent generalization performance. Moreover, to clarify how differently each penalty term works, a function-surface evaluation is described. Finally, we show how cross-validation can be applied to find an optimal penalty factor.
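To make the setting concrete, the following is a minimal sketch of the combination the abstract describes: a sum-of-squares training error plus a squared penalty term (0.5 * lam * sum of squared weights), minimized by a quasi-Newton second-order optimizer. The toy data, the small 1-5-1 network, and SciPy's BFGS routine are assumptions for illustration only, not the authors' algorithm or experimental setup.

# Minimal sketch (assumed setup, not the authors' implementation):
# sum-of-squares error with a squared (L2) penalty term, minimized
# by a quasi-Newton (second-order) optimizer.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(50, 1))      # toy inputs
y = np.sin(np.pi * X[:, 0])               # toy targets

H = 5                                     # hidden units of a 1-H-1 MLP
n_params = 3 * H + 1                      # W1 (H), b1 (H), W2 (H), b2 (1)

def unpack(w):
    W1 = w[:H].reshape(H, 1)              # input-to-hidden weights
    b1 = w[H:2 * H]                       # hidden biases
    W2 = w[2 * H:3 * H]                   # hidden-to-output weights
    b2 = w[3 * H]                         # output bias
    return W1, b1, W2, b2

def objective(w, lam):
    W1, b1, W2, b2 = unpack(w)
    h = np.tanh(X @ W1.T + b1)            # hidden activations
    out = h @ W2 + b2                     # network output
    err = 0.5 * np.sum((out - y) ** 2)    # sum-of-squares error
    penalty = 0.5 * lam * np.sum(w ** 2)  # squared penalty term
    return err + penalty

lam = 1e-3                                # penalty factor (illustrative value)
w0 = rng.normal(scale=0.1, size=n_params)
res = minimize(objective, w0, args=(lam,), method="BFGS")
print(f"final penalized error: {res.fun:.4f}")

Following the article's final point, the penalty factor lam would in practice be selected by comparing validation error over a grid of candidate values via cross-validation, rather than being fixed a priori as above.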
Pages: 709-729
Page count: 21