A neural network training algorithm utilizing multiple sets of linear equations

Cited by: 45
Authors
Chen, HH
Manry, MT
Chandrasekaran, H
Affiliations
[1] CYTEL Syst Inc, Hudson, MA 01749 USA
[2] Univ Texas, Dept Elect Engn, Arlington, TX 76019 USA
Funding
U.S. National Aeronautics and Space Administration (NASA);
Keywords
multilayer perceptron; fast training; hidden weight optimization; second-order methods; conjugate gradient method; Levenberg-Marquardt algorithm; learning factor calculation; backpropagation; output weight optimization;
DOI
10.1016/S0925-2312(98)00109-X
CLC number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
A fast algorithm is presented for training multilayer perceptron neural networks; it uses a separate error function for each hidden unit and solves multiple sets of linear equations. The algorithm builds upon two previously described techniques. In each training iteration, output weight optimization (OWO) solves linear equations to optimize the output weights, i.e., those connecting to the output layer net functions. The method of hidden weight optimization (HWO) develops desired hidden unit net signals from delta functions; the resulting hidden unit error functions are then minimized with respect to the hidden weights, i.e., those feeding into the hidden unit net functions. An algorithm is described for calculating the learning factor for the hidden weights. We show that the combined technique, OWO-HWO, converges faster than standard OWO-BP (output weight optimization-backpropagation), which uses OWO to update output weights and backpropagation to update hidden weights. We also show that OWO-HWO usually converges to about the same training error as the Levenberg-Marquardt algorithm in an order of magnitude less time. (C) 1999 Published by Elsevier Science B.V. All rights reserved.
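The abstract describes the per-iteration structure of OWO-HWO precisely enough to sketch. The following minimal NumPy example shows one iteration for a single-hidden-layer MLP with sigmoid hidden units and bypass (input-to-output) connections. It is an illustrative sketch under assumptions, not the authors' implementation: the learning factor z is held fixed here, whereas the paper calculates it, and all names (owo_hwo_iteration, W_hid, W_out) are hypothetical.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def owo_hwo_iteration(X, T, W_hid, z=0.1):
    """One illustrative OWO-HWO iteration (a sketch; names are assumptions).

    X     : (N, n_in+1) input patterns, last column all ones (bias)
    T     : (N, n_out)  desired outputs
    W_hid : (n_hid, n_in+1) hidden weights
    z     : learning factor (fixed here; the paper derives a calculated value)
    """
    O = sigmoid(X @ W_hid.T)            # hidden unit activations
    A = np.hstack([O, X])               # output-layer basis: hidden units + bypass

    # OWO: output weights from one set of linear equations (least squares A w ~ T)
    W_out = np.linalg.lstsq(A, T, rcond=None)[0].T

    # Backpropagated delta functions for the hidden units
    E = T - A @ W_out.T
    delta = (E @ W_out[:, :O.shape[1]]) * O * (1.0 - O)

    # HWO: another set of linear equations, R e = C, where R is the input
    # autocorrelation and C the delta/input cross-correlation
    R = X.T @ X
    C = X.T @ delta
    e = np.linalg.lstsq(R, C, rcond=None)[0].T

    return W_hid + z * e, W_out
```

A typical use would initialize W_hid to small random values and call W_hid, W_out = owo_hwo_iteration(X, T, W_hid) in a loop until the training error stops decreasing.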
Pages: 55-72
Number of pages: 18