Regularization tools for training large feed-forward neural networks using automatic differentiation

Cited by: 6
Authors
Eriksson, J [1]
Gulliksson, M [1]
Lindström, P [1]
Wedin, PA [1]
Affiliation
[1] Umea Univ, Dept Comp Sci, S-90187 Umea, Sweden
Keywords
neural network training; Tikhonov regularization; automatic differentiation; large-scale problems
DOI
10.1080/10556789808805701
Chinese Library Classification (CLC)
TP31 [Computer Software]
Discipline codes
081202 [Computer Software and Theory]; 0835 [Software Engineering]
Abstract
We describe regularization tools for training large-scale artificial feed-forward neural networks. We propose algorithms that explicitly solve a sequence of Tikhonov-regularized nonlinear least squares problems. For large-scale problems, new special-purpose automatic differentiation methods are used within a conjugate gradient method to compute a truncated Gauss-Newton search direction. The algorithms developed exploit the structure of the problem in different ways and perform much better than a Polak-Ribière-based method. All algorithms are tested on benchmark problems, following the guidelines of Lutz Prechelt's Proben1 package. All software is written in Matlab and gathered in a toolbox.
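The core computation described in the abstract can be sketched in a few lines: for the Tikhonov-regularized nonlinear least squares problem min ||r(x)||² + λ||x||², a Gauss-Newton search direction solves (JᵀJ + λI)p = -(Jᵀr + λx), and truncating the conjugate gradient iteration on this system gives the truncated Gauss-Newton direction. This is a minimal NumPy sketch, not the paper's Matlab toolbox; the finite-difference Jacobian stands in for the paper's special-purpose automatic differentiation, and all function names here are illustrative.

```python
import numpy as np

def jacobian_fd(r, x, eps=1e-6):
    """Dense Jacobian of the residual r at x by forward differences
    (a cheap stand-in for the paper's automatic differentiation)."""
    r0 = r(x)
    J = np.empty((r0.size, x.size))
    for j in range(x.size):
        xp = x.copy()
        xp[j] += eps
        J[:, j] = (r(xp) - r0) / eps
    return J

def cg(A_mv, b, iters=50, tol=1e-12):
    """Conjugate gradient for A p = b, with A given as a mat-vec.
    Stopping after few iterations yields a *truncated* direction."""
    p = np.zeros_like(b)
    res = b.copy()          # residual b - A p (p starts at zero)
    d = res.copy()          # search direction
    rs = res @ res
    for _ in range(iters):
        Ad = A_mv(d)
        alpha = rs / (d @ Ad)
        p += alpha * d
        res -= alpha * Ad
        rs_new = res @ res
        if rs_new < tol:
            break
        d = res + (rs_new / rs) * d
        rs = rs_new
    return p

def regularized_gn_step(r, x, lam, cg_iters=20):
    """Truncated Gauss-Newton direction for min ||r(x)||^2 + lam*||x||^2:
    approximately solve (J^T J + lam*I) p = -(J^T r + lam*x) by CG."""
    J = jacobian_fd(r, x)
    g = J.T @ r(x) + lam * x                    # gradient of the objective (up to factor 2)
    A_mv = lambda v: J.T @ (J @ v) + lam * v    # SPD system matrix, applied matrix-free
    return cg(A_mv, -g, iters=cg_iters)
```

Because the system matrix JᵀJ + λI is symmetric positive definite for λ > 0, the (even truncated) CG solution is always a descent direction for the regularized objective, which is what makes the combination robust for ill-conditioned training problems.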
Pages: 49-69
Page count: 21