A generalized learning paradigm exploiting the structure of feedforward neural networks

Cited by: 54
Authors
Parisi, R [1 ]
Di Claudio, ED [1 ]
Orlandi, G [1 ]
Rao, BD [1 ]
Affiliation
[1] UNIV CALIF SAN DIEGO, DEPT ELECT & COMP ENGN, LA JOLLA, CA 92093 USA
Source
IEEE TRANSACTIONS ON NEURAL NETWORKS | 1996 / Vol. 7 / No. 6
Funding
U.S. National Science Foundation;
DOI
10.1109/72.548172
CLC Classification Code
TP18 [Theory of Artificial Intelligence];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
In this paper, a general class of fast learning algorithms for feedforward neural networks is introduced and described. The approach exploits the separability of each layer into a linear and a nonlinear block and consists of two steps. The first step is the descent of the error functional in the space of the outputs of the linear blocks (descent in the neuron space), which can be performed using any preferred optimization strategy. In the second step, each linear block is optimized separately by using a least squares (LS) criterion. To demonstrate the effectiveness of the new approach, a detailed treatment of a gradient descent in the neuron space is conducted. The main properties of this approach are a higher speed of convergence than methods that employ an ordinary gradient descent in the weight space, such as backpropagation (BP); better numerical conditioning; and a lower computational cost compared to techniques based on the Hessian matrix. Numerical stability is assured by the use of robust LS linear system solvers, operating directly on the input data of each layer. Experimental results obtained on three problems are described, which confirm the effectiveness of the new method.
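The abstract describes the two-step scheme only at a high level. The NumPy sketch below illustrates one plausible reading of it: a gradient step taken in the space of each layer's linear-block outputs (the "neuron space"), followed by a per-layer least-squares fit of the weights against that layer's own input data. It assumes a single hidden layer, tanh activations, and a mean-square error loss; all identifiers (with_bias, eta, T1, T2, ...) are illustrative choices of ours, not names from the paper, and the paper's exact update rules and solvers may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data: 2-D inputs, 1-D targets (illustrative only).
X = rng.standard_normal((100, 2))
Y = np.tanh(X @ np.array([[1.0], [-2.0]]))

def with_bias(A):
    # Fold the bias into the weights via an appended column of ones.
    return np.hstack([A, np.ones((A.shape[0], 1))])

W1 = rng.standard_normal((3, 5)) * 0.1   # (2 inputs + bias) -> 5 hidden units
W2 = rng.standard_normal((6, 1)) * 0.1   # (5 hidden + bias) -> 1 output

eta = 0.1  # step size for the descent in the neuron (linear-output) space

for epoch in range(200):
    # Forward pass, keeping the outputs of the linear blocks.
    A0 = with_bias(X)
    S1 = A0 @ W1                 # linear-block outputs, layer 1
    A1 = with_bias(np.tanh(S1))
    S2 = A1 @ W2                 # linear-block outputs, layer 2
    out = np.tanh(S2)

    # Step 1: gradient descent in the space of linear-block outputs.
    G2 = (out - Y) * (1.0 - out**2)                    # dE/dS2 for MSE + tanh
    G1 = (G2 @ W2[:-1, :].T) * (1.0 - np.tanh(S1)**2)  # dE/dS1 (bias row dropped)
    T2 = S2 - eta * G2           # target linear outputs, layer 2
    T1 = S1 - eta * G1           # target linear outputs, layer 1

    # Step 2: each layer's weights solved by LS against that layer's inputs.
    W1 = np.linalg.lstsq(A0, T1, rcond=None)[0]
    A1 = with_bias(np.tanh(A0 @ W1))   # refresh activations with the new W1
    W2 = np.linalg.lstsq(A1, T2, rcond=None)[0]

mse = np.mean((np.tanh(with_bias(np.tanh(with_bias(X) @ W1)) @ W2) - Y)**2)
print(f"final MSE: {mse:.4f}")
```

Note how np.linalg.lstsq stands in for the robust LS linear system solvers the abstract mentions, operating directly on each layer's input matrix, and how the activation derivative enters only in the neuron-space gradient step, not in the weight solve itself.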
Pages: 1450 - 1460
Page count: 11