Graded State Machines: The Representation of Temporal Contingencies in Simple Recurrent Networks

Cited: 94
Authors
Servan-Schreiber, D. [1]
Cleeremans, A. [1]
McClelland, J. L. [1]
Affiliation
[1] Carnegie Mellon University, Department of Psychology, Pittsburgh, PA 15213
Keywords
graded state machines; finite state automata; recurrent networks; temporal contingencies; prediction task
DOI
10.1007/BF00114843
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
We explore a network architecture introduced by Elman (1990) for predicting successive elements of a sequence. The network uses the pattern of activation over a set of hidden units from time-step t-1, together with element t, to predict element t+1. When the network is trained with strings from a particular finite-state grammar, it can learn to be a perfect finite-state recognizer for the grammar. When the network has a minimal number of hidden units, patterns on the hidden units come to correspond to the nodes of the grammar; however, this correspondence is not necessary for the network to act as a perfect finite-state recognizer. Next, we provide a detailed analysis of how the network acquires its internal representations; using a probabilistic analysis, we show that the network progressively encodes more and more temporal context. Finally, we explore the conditions under which the network can carry information about distant sequential contingencies across intervening elements. Such information is maintained with relative ease if it is relevant at each intermediate step; it tends to be lost when intervening elements do not depend on it. At first glance this may suggest that such networks are not relevant to natural language, in which dependencies may span indefinite distances. However, embeddings in natural language are not completely independent of earlier information. The final simulation shows that long-distance sequential contingencies can be encoded by the network even if only subtle statistical properties of embedded strings depend on the early information. The network encodes long-distance dependencies by shading the internal representations that are responsible for processing common embeddings in otherwise different sequences. This ability to represent similarities and differences between several sequences simultaneously relies on the graded nature of the representations used by the network, which contrasts with the finite states of traditional automata. For this reason, the network and other similar architectures may be called Graded State Machines.
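To make the architecture described above concrete, the following is a minimal sketch of an Elman-style simple recurrent network forward pass in Python/NumPy: the hidden state copied from time-step t-1 is combined with a one-hot coding of element t to produce a distribution over possible elements at t+1. The layer sizes, weight initializations, and function names are illustrative assumptions, not details taken from the paper, and no training procedure is shown.

```python
# Minimal sketch of an Elman-style simple recurrent network (SRN) forward pass.
# Assumptions: one-hot coded sequence elements, tanh hidden units, softmax output
# predicting the next element; sizes and names are illustrative only.
import numpy as np

rng = np.random.default_rng(0)

n_symbols, n_hidden = 7, 3                          # e.g. alphabet of a small finite-state grammar
W_xh = rng.normal(0, 0.5, (n_hidden, n_symbols))    # input (element t) -> hidden
W_hh = rng.normal(0, 0.5, (n_hidden, n_hidden))     # context (hidden state at t-1) -> hidden
W_hy = rng.normal(0, 0.5, (n_symbols, n_hidden))    # hidden -> output (prediction of element t+1)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def srn_step(x_t, h_prev):
    """Combine element t with the copied hidden state from t-1 to predict element t+1."""
    h_t = np.tanh(W_xh @ x_t + W_hh @ h_prev)       # graded internal state
    y_t = softmax(W_hy @ h_t)                       # distribution over possible successors
    return h_t, y_t

# Run an (untrained) pass over a toy sequence of symbol indices.
sequence = [0, 3, 5, 1]
h = np.zeros(n_hidden)
for t, sym in enumerate(sequence):
    x = np.zeros(n_symbols)
    x[sym] = 1.0
    h, pred = srn_step(x, h)
    print(f"t={t}: predicted distribution over element t+1 = {np.round(pred, 2)}")
```

After training on strings from a finite-state grammar, the hidden vector h plays the role of the graded internal state discussed in the abstract: with few hidden units its activation patterns can come to correspond to grammar nodes, while its continuous values allow similar embeddings in different sequences to be represented by slightly "shaded" variants of the same state.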
Pages: 161-193 (33 pages)
References (20 in total; first 10 shown)
[1] Allen, R. (1990). Technical report TRAR90402, Bell Communications Research.
[2] Allen, R. B. (1989). In Connectionism in Perspective.
[3] Allen, R. B. (1988). Proceedings of the 10th Annual Conference of the Cognitive Science Society.
[4] Cleeremans, A., Servan-Schreiber, D., & McClelland, J. L. (1989). Finite state automata and simple recurrent networks. Neural Computation, 1(3), 372-381.
[5] Cottrell, G. W. (1985). Proceedings of the 7th Annual Conference of the Cognitive Science Society.
[6] Elman, J. (1990). In Cognitive Models of Speech Processing.
[7] Elman, J. L. (1990). Finding structure in time. Cognitive Science, 14(2), 179-211.
[8] Fanty, M. (1985). Technical report TR-174, University of Rochester, Computer Science Department.
[9] Hanson, S. (1987). Proceedings of the 9th Annual Conference of the Cognitive Science Society.
[10] Jordan, M. I. (1986). Proceedings of the 8th Annual Conference of the Cognitive Science Society.