Linear-least-squares initialization of multilayer perceptrons through backpropagation of the desired response

Cited by: 47
Authors
Erdogmus, D [1]
Fontenla-Romero, O
Principe, JC
Alonso-Betanzos, A
Castillo, E
Affiliations
[1] Oregon Hlth & Sci Univ, Oregon Grad Inst, Dept Comp Sci & Engn, Portland, OR 97006 USA
[2] Univ A Coruna, Dept Comp Sci, Lab Invest & Desarrollo Inteligen Artificial, La Coruna 15071, Spain
[3] Univ Florida, Computat Neuroengn Lab, Dept Elect & Comp Engn, Gainesville, FL 32611 USA
[4] Univ Cantabria, Dept Appl Math & Comp Sci, E-39005 Santander, Spain
Source
IEEE TRANSACTIONS ON NEURAL NETWORKS | 2005, Vol. 16, Issue 2
Funding
U.S. National Science Foundation;
Keywords
approximate least-squares training of multilayer perceptrons (MLPs); backpropagation (BP) of desired response; neural network initialization;
DOI
10.1109/TNN.2004.841777
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 [Pattern Recognition and Intelligent Systems]; 0812 [Computer Science and Technology]; 0835 [Software Engineering]; 1405 [Intelligent Science and Technology];
Abstract
Training multilayer neural networks is typically carried out using descent techniques such as the gradient-based backpropagation (BP) of error or quasi-Newton approaches, including the Levenberg-Marquardt algorithm. This is mainly because there are no analytical methods to find the optimal weights, so iterative local or global optimization techniques are necessary. The success of these iterative procedures depends strongly on the initial conditions; therefore, in this paper, we devise a principled novel method of backpropagating the desired response through the layers of a multilayer perceptron (MLP), which enables us to accurately initialize these neural networks in the minimum mean-square-error sense, using the analytic linear least-squares solution. The generated solution can be used as an initial condition for standard iterative optimization algorithms. However, simulations demonstrate that in most cases, the performance achieved through the proposed initialization scheme leaves little room for further improvement in the mean-square error (MSE) over the training set. In addition, the performance of the network optimized with the proposed approach also generalizes well to testing data. A rigorous derivation of the initialization algorithm is presented, and its high performance is verified with a number of benchmark training problems including chaotic time-series prediction, classification, and nonlinear system identification with MLPs.
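To make the initialization idea concrete, the sketch below (Python/NumPy, not taken from the paper) shows one way such a scheme can be organized for a single-hidden-layer MLP with tanh hidden units and a linear output layer: the desired response is mapped back through the current output-layer weights to obtain targets for the hidden activations, the hidden nonlinearity is inverted on those (clipped) targets, and each layer's weights are then obtained from an ordinary linear least-squares fit. All function and variable names, the random output-layer weights used for the backward mapping, and the clipping margin are illustrative assumptions; the paper derives the general multilayer procedure rigorously.

import numpy as np

def ls_fit(inputs, targets):
    """Solve [W | b] = argmin || [inputs, 1] [W; b]^T - targets ||^2 (linear least squares)."""
    X = np.hstack([inputs, np.ones((inputs.shape[0], 1))])   # append bias column
    sol, *_ = np.linalg.lstsq(X, targets, rcond=None)         # shape (d_in + 1, d_out)
    return sol[:-1].T, sol[-1]                                 # W: (d_out, d_in), b: (d_out,)

def init_mlp_by_desired_response(x, d, n_hidden, margin=0.95, seed=0):
    """Illustrative least-squares initialization via backpropagation of the desired response."""
    rng = np.random.default_rng(seed)
    n_out = d.shape[1]
    # Small random output-layer weights, used only to map the desired response backward.
    W2 = rng.normal(scale=0.1, size=(n_out, n_hidden))
    b2 = np.zeros(n_out)

    # Linear output layer: its desired pre-activation is the desired response itself.
    z2_d = d

    # Backpropagate the desired response: hidden-activation targets solve
    # W2 y1_d + b2 ≈ z2_d in the least-squares sense.
    y1_d = (z2_d - b2) @ np.linalg.pinv(W2).T
    # Keep targets strictly inside the invertible range of tanh before inverting it.
    y1_d = np.clip(y1_d, -margin, margin)
    z1_d = np.arctanh(y1_d)                  # desired hidden pre-activations

    # Layer 1: linear least squares from inputs to the desired hidden pre-activations.
    W1, b1 = ls_fit(x, z1_d)

    # Layer 2: forward the data through the new layer 1, then solve linear least
    # squares from the actual hidden outputs to the desired output.
    y1 = np.tanh(x @ W1.T + b1)
    W2, b2 = ls_fit(y1, z2_d)
    return (W1, b1), (W2, b2)

# Usage on a toy regression set (hypothetical data, for illustration only).
x = np.random.default_rng(1).uniform(-1, 1, size=(200, 3))
d = np.sin(x.sum(axis=1, keepdims=True))
(W1, b1), (W2, b2) = init_mlp_by_desired_response(x, d, n_hidden=10)
y = np.tanh(x @ W1.T + b1) @ W2.T + b2
print("training MSE after initialization:", float(np.mean((y - d) ** 2)))

The resulting weights can then be handed to a standard BP or Levenberg-Marquardt optimizer as the initial condition, which is how the abstract proposes using the analytic least-squares solution.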
Pages: 325-337
Number of pages: 13