A recurrent neural network that learns to count

Cited by: 150
Authors
Rodriguez, P [1]
Wiles, J
Elman, JL
Affiliations
[1] Univ Calif San Diego, Dept Cognit Sci, La Jolla, CA 92093 USA
[2] Univ Queensland, Dept Comp Sci, St Lucia, Qld 4072, Australia
[3] Univ Queensland, Dept Psychol, St Lucia, Qld 4072, Australia
Keywords
recurrent neural network; dynamical systems; context-free languages
DOI
10.1080/095400999116340
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
Parallel distributed processing (PDP) architectures demonstrate a potentially radical alternative to the traditional theories of language processing that are based on serial computational models. However, learning complex structural relationships in temporal data presents a serious challenge to PDP systems. For example, automata theory dictates that processing strings from a context-free language (CFL) requires a stack or counter memory device. While some PDP models have been hand-crafted to emulate such a device, it is not clear how a neural network might develop such a device when learning a CFL. This research employs standard backpropagation training techniques for a recurrent neural network (RNN) in the task of learning to predict the next character in a simple deterministic CFL (DCFL). We show that an RNN can learn to recognize the structure of a simple DCFL. We use dynamical systems theory to identify how network states reflect that structure by building counters in phase space. The work is an empirical investigation which is complementary to theoretical analyses of network capabilities, yet original in its specific configuration of dynamics involved. The application of dynamical systems theory helps us relate the simulation results to theoretical results, and the learning task enables us to highlight some issues for understanding dynamical systems that process language with counters.
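The counting idea in the abstract can be made concrete with a small sketch. This is an illustrative, hand-set counter for the canonical DCFL a^n b^n (the function name and structure are our own, not the paper's trained network): a single integer state is incremented on each `a` and decremented on each `b`, which is exactly the kind of up/down counting dynamics the authors report an RNN discovering in phase space. It shows why next-character prediction on this language demands counter memory rather than finite state.

```python
def predict_next(prefix):
    """Predict the set of possible next symbols after a prefix of an
    a^n b^n string, using a single counter as the only memory.

    Illustrative sketch only: the paper's RNN learns analogous counting
    dynamics in its hidden-state phase space via backpropagation; here
    the 'dynamics' are fixed by hand to expose the principle.
    """
    count = 0          # plays the role of the network's counter state
    seen_b = False     # tracks whether the b-phase has begun
    for ch in prefix:
        if ch == 'a':
            count += 1
        else:
            seen_b = True
            count -= 1
    if not seen_b:
        return {'a', 'b'}   # a-phase: either another a or the first b may follow
    if count > 0:
        return {'b'}        # b-phase: exactly `count` more b's are determined
    return {'end'}          # counter back at zero: the string is complete
```

During the a-phase the next symbol is genuinely unpredictable, but once the first `b` arrives, the counter fully determines the remainder of the string; a finite-state device cannot track `count` for unbounded n, which is the challenge the paper's RNN meets.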
Pages: 5-40
Page count: 36