Sufficient conditions for error backflow convergence in dynamical recurrent neural networks

Cited by: 5
Author
Aussem, A [1]
Affiliation
[1] Univ Clermont Ferrand, LIMOS, CNRS, FRE 2239, F-63173 Aubiere, France
Keywords
DOI
10.1162/089976602760128063
CLC number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This article extends previous analysis of the gradient decay to a class of discrete-time fully recurrent networks, called dynamical recurrent neural networks, obtained by modeling synapses as finite impulse response (FIR) filters instead of multiplicative scalars. Using elementary matrix manipulations, we provide an upper bound on the norm of the weight matrix that ensures the gradient vector, when propagated backward in time through the error-propagation network, decays exponentially to zero. This bound applies to all proposed recurrent FIR architectures as well as to fixed-point recurrent networks, regardless of delay and connectivity. In addition, we show that the computational overhead of the learning algorithm can be reduced drastically by taking advantage of the exponential decay of the gradient.
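To make the exponential-decay argument concrete, the following minimal numerical sketch (Python/NumPy) illustrates the idea behind the bound. It is not the paper's FIR-synapse DRNN: it uses a plain tanh recurrence, and the rescaling factor 0.7, the tolerance, and names such as truncation_depth are arbitrary illustrative choices. With tanh the activation slope is at most 1, so keeping the spectral norm of the weight matrix below 1 forces the back-propagated error norm to shrink at least geometrically, which is what justifies truncating the backward pass.

# Minimal sketch (assumed setup, not the paper's model): a plain tanh
# recurrence x_t = tanh(W x_{t-1} + b).  If the spectral norm ||W||_2 < 1,
# the back-propagated error delta obeys ||delta_{t-k}|| <= ||W||_2^k ||delta_t||,
# so it decays exponentially and the backward pass can be truncated early.
import numpy as np

rng = np.random.default_rng(0)
n, T = 8, 60

W = rng.normal(size=(n, n))
W *= 0.7 / np.linalg.norm(W, 2)        # rescale so ||W||_2 = 0.7 < 1
b = rng.normal(size=n)

# Forward pass: store pre-activations to evaluate f'(a_t) on the way back.
x = np.zeros(n)
pre = []
for _ in range(T):
    a = W @ x + b
    pre.append(a)
    x = np.tanh(a)

# Backward pass: propagate an error vector through the error-propagation
# network, delta <- diag(f'(a)) W^T delta, and record its norm.
delta = rng.normal(size=n)
norms = []
for a in reversed(pre):
    delta = (1.0 - np.tanh(a) ** 2) * (W.T @ delta)
    norms.append(np.linalg.norm(delta))

# Exponential decay lets us stop once the contribution is negligible,
# which is the source of the reduced computational overhead.
tol = 1e-8 * norms[0]
truncation_depth = next((k for k, v in enumerate(norms) if v < tol), T)
print("gradient norm after 10 backward steps:", norms[9])
print("steps kept before truncation:", truncation_depth)

Running the sketch shows the recorded norms falling by roughly the factor 0.7 per step or faster, so only a few dozen backward steps carry any usable gradient signal.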
Pages: 1907-1927
Page count: 21