Regularization tools for training large feed-forward neural networks using automatic differentiation

Cited by: 6
Authors
Eriksson, J [1]
Gulliksson, M [1]
Lindström, P [1]
Wedin, PA [1]
Affiliation
[1] Umea Univ, Dept Comp Sci, S-90187 Umea, Sweden
Keywords
neural network training; Tikhonov regularization; automatic differentiation; large-scale problems
DOI
10.1080/10556789808805701
Chinese Library Classification (CLC)
TP31 [Computer Software]
Discipline codes
081202 [Computer Software and Theory]; 0835 [Software Engineering]
Abstract
We describe regularization tools for training large-scale artificial feed-forward neural networks. We propose algorithms that explicitly solve a sequence of Tikhonov-regularized nonlinear least squares problems. For large-scale problems, new special-purpose automatic differentiation methods are used within a conjugate gradient method to compute a truncated Gauss-Newton search direction. The algorithms developed exploit the structure of the problem in different ways and perform much better than a Polak-Ribière-based method. All algorithms are tested on benchmark problems, following the guidelines of Lutz Prechelt's Proben1 package. All software is written in Matlab and gathered in a toolbox.
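The core computation described in the abstract can be sketched in a few lines: for the Tikhonov-regularized nonlinear least squares problem min ||r(x)||² + λ||x||², a Gauss-Newton search direction solves (JᵀJ + λI)p = -(Jᵀr + λx), and truncating the conjugate gradient iteration on this system gives the truncated Gauss-Newton direction. This is a minimal NumPy sketch, not the paper's Matlab toolbox; the finite-difference Jacobian stands in for the paper's special-purpose automatic differentiation, and all function names here are illustrative.

```python
import numpy as np

def jacobian_fd(r, x, eps=1e-6):
    """Dense Jacobian of the residual r at x by forward differences
    (a cheap stand-in for the paper's automatic differentiation)."""
    r0 = r(x)
    J = np.empty((r0.size, x.size))
    for j in range(x.size):
        xp = x.copy()
        xp[j] += eps
        J[:, j] = (r(xp) - r0) / eps
    return J

def cg(A_mv, b, iters=50, tol=1e-12):
    """Conjugate gradient for A p = b, with A given as a mat-vec.
    Stopping after few iterations yields a *truncated* direction."""
    p = np.zeros_like(b)
    res = b.copy()          # residual b - A p (p starts at zero)
    d = res.copy()          # search direction
    rs = res @ res
    for _ in range(iters):
        Ad = A_mv(d)
        alpha = rs / (d @ Ad)
        p += alpha * d
        res -= alpha * Ad
        rs_new = res @ res
        if rs_new < tol:
            break
        d = res + (rs_new / rs) * d
        rs = rs_new
    return p

def regularized_gn_step(r, x, lam, cg_iters=20):
    """Truncated Gauss-Newton direction for min ||r(x)||^2 + lam*||x||^2:
    approximately solve (J^T J + lam*I) p = -(J^T r + lam*x) by CG."""
    J = jacobian_fd(r, x)
    g = J.T @ r(x) + lam * x                    # gradient of the objective (up to factor 2)
    A_mv = lambda v: J.T @ (J @ v) + lam * v    # SPD system matrix, applied matrix-free
    return cg(A_mv, -g, iters=cg_iters)
```

Because the system matrix JᵀJ + λI is symmetric positive definite for λ > 0, the (even truncated) CG solution is always a descent direction for the regularized objective, which is what makes the combination robust for ill-conditioned training problems.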
Pages: 49-69
Page count: 21