Learning of sequential movements by neural network model with dopamine-like reinforcement signal

Cited by: 139
Authors
Suri, RE
Schultz, W [1]
Affiliations
[1] Univ Fribourg, Inst Physiol, CH-1700 Fribourg, Switzerland
[2] Univ So Calif, Brain Project, Los Angeles, CA 90089 USA
Keywords
basal ganglia; teaching signal; temporal difference; synaptic plasticity; eligibility
DOI
10.1007/s002210050467
Chinese Library Classification number
Q189 [Neuroscience]
Discipline classification code
071006
Abstract
Dopamine neurons appear to code an error in the prediction of reward. They are activated by unpredicted rewards, are not influenced by predicted rewards, and are depressed when a predicted reward is omitted. After conditioning, they respond to reward-predicting stimuli in a similar manner. With these characteristics, the dopamine response strongly resembles the predictive reinforcement teaching signal of neural network models implementing the temporal difference learning algorithm. This study explored a neural network model that used a reward-prediction error signal strongly resembling dopamine responses for learning movement sequences. A different stimulus was presented in each step of the sequence and required a different movement reaction, and reward occurred at the end of the correctly performed sequence. The dopamine-like predictive reinforcement signal efficiently allowed the model to learn long sequences. By contrast, learning with an unconditional reinforcement signal required synaptic eligibility traces of longer and biologically less-plausible durations for obtaining satisfactory performance. Thus, dopamine-like neuronal signals constitute excellent teaching signals for learning sequential behavior.
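The three response properties listed in the abstract (activation by unpredicted reward, no response to predicted reward, depression when a predicted reward is omitted) follow directly from the temporal difference error. The sketch below is a minimal illustration only and is not the authors' network: the one-weight-per-time-step stimulus representation, the trial timing, and the parameters `gamma` and `lr` are all assumptions made for the example.

```python
def run_trial(w, stim_t, reward_t, n_steps, rewarded=True,
              gamma=1.0, lr=0.5, learn=True):
    """Run one trial and return the TD error at each time step.

    w holds one prediction weight per time step between stimulus onset
    and reward (an assumed tapped-delay-line representation). Time steps
    outside that window carry no prediction (value 0).
    """
    def value(t):
        return w[t - stim_t] if stim_t <= t < reward_t else 0.0

    delta = [0.0] * n_steps
    for t in range(n_steps - 1):
        # Reward arrives on the transition into time step reward_t.
        r = 1.0 if (rewarded and t + 1 == reward_t) else 0.0
        # TD error: reward plus discounted next prediction minus current one.
        d = r + gamma * value(t + 1) - value(t)
        delta[t + 1] = d
        # Only time steps inside the stimulus window have a weight to adapt.
        if learn and stim_t <= t < reward_t:
            w[t - stim_t] += lr * d
    return delta


STIM_T, REWARD_T, N_STEPS = 2, 6, 10  # hypothetical trial timing
weights = [0.0] * (REWARD_T - STIM_T)

# Naive model: the reward is unpredicted.
before = run_trial(weights, STIM_T, REWARD_T, N_STEPS, learn=False)

# Conditioning: repeated stimulus-reward pairings.
for _ in range(200):
    run_trial(weights, STIM_T, REWARD_T, N_STEPS)

# Trained model: the reward is predicted, then omitted.
after = run_trial(weights, STIM_T, REWARD_T, N_STEPS, learn=False)
omitted = run_trial(weights, STIM_T, REWARD_T, N_STEPS,
                    rewarded=False, learn=False)

for name, d in (("before", before), ("after", after), ("omitted", omitted)):
    print(name, "stimulus:", round(d[STIM_T], 3), "reward:", round(d[REWARD_T], 3))
```

In this toy setting the error at reward time is positive before training, close to zero after training, and negative on omission, while a positive error appears at stimulus onset after training. The transfer of the response to the stimulus depends on the assumption that the stimulus itself arrives unpredicted, which the sketch encodes by giving pre-stimulus time steps no adaptable weight.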
Pages: 350-354
Page count: 5
Related papers
31 items in total
[1]  
[Anonymous], MODELS INFORM PROCES
[2]  
[Anonymous], 1995, MODELS INFORM PROCES
[3]  
APICELLA P, 1991, EXP BRAIN RES, V85, P491
[4]   NEURONLIKE ADAPTIVE ELEMENTS THAT CAN SOLVE DIFFICULT LEARNING CONTROL-PROBLEMS [J].
BARTO, AG ;
SUTTON, RS ;
ANDERSON, CW .
IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS, 1983, 13 (05) :834-846
[5]  
Barto AG., 1995, Models of information processing in the basal ganglia, P215
[6]   DISTURBANCE OF SEQUENTIAL MOVEMENTS IN PATIENTS WITH PARKINSONS-DISEASE [J].
BENECKE, R ;
ROTHWELL, JC ;
DICK, JPR ;
DAY, BL ;
MARSDEN, CD .
BRAIN, 1987, 110 :361-379
[7]   LONG-TERM POTENTIATION IN THE STRIATUM IS UNMASKED BY REMOVING THE VOLTAGE-DEPENDENT MAGNESIUM BLOCK OF NMDA RECEPTOR CHANNELS [J].
CALABRESI, P ;
PISANI, A ;
MERCURI, NB ;
BERNARDI, G .
EUROPEAN JOURNAL OF NEUROSCIENCE, 1992, 4 (10) :929-935
[8]  
Calabresi P, 1997, J NEUROSCI, V17, P4536
[9]  
Dickinson A., 1980, CONT ANIMAL LEARNING
[10]   A MODEL OF CORTICOSTRIATAL PLASTICITY FOR LEARNING OCULOMOTOR ASSOCIATIONS AND SEQUENCES [J].
DOMINEY, P ;
ARBIB, M ;
JOSEPH, JP .
JOURNAL OF COGNITIVE NEUROSCIENCE, 1995, 7 (03) :311-336