Dopamine neurons report an error in the temporal prediction of reward during learning

Cited by: 791
Authors
Hollerman, JR [1]
Schultz, W [1]
Affiliation
[1] Univ Fribourg, Inst Physiol, CH-1700 Fribourg, Switzerland
Keywords
DOI
10.1038/1124
Chinese Library Classification number
Q189 [Neuroscience]
Discipline classification code
071006
Abstract
Many behaviors are affected by rewards, undergoing long-term changes when rewards are different than predicted but remaining unchanged when rewards occur exactly as predicted. The discrepancy between reward occurrence and reward prediction is termed an 'error in reward prediction'. Dopamine neurons in the substantia nigra and the ventral tegmental area are believed to be involved in reward-dependent behaviors. Consistent with this role, they are activated by rewards, and because they are activated more strongly by unpredicted than by predicted rewards they may play a role in learning. The present study investigated whether monkey dopamine neurons code an error in reward prediction during the course of learning. Dopamine neuron responses reflected the changes in reward prediction during individual learning episodes; dopamine neurons were activated by rewards during early trials, when errors were frequent and rewards unpredictable, but activation was progressively reduced as performance was consolidated and rewards became more predictable. These neurons were also activated when rewards occurred at unpredicted times and were depressed when rewards were omitted at the predicted times. Thus, dopamine neurons code errors in the prediction of both the occurrence and the time of rewards. In this respect, their responses resemble the teaching signals that have been employed in particularly efficient computational learning models.
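The error signal described in the abstract can be illustrated with a temporal-difference (TD) update, the kind of teaching signal the last sentence alludes to. The following is a minimal sketch and not the authors' model: the step count, learning rate, discount factor, reward size, and the names `run_trial` and `values` are all assumptions chosen for illustration. Each trial is split into discrete time steps after a cue, and `values[t]` holds the reward still predicted at step `t`.

```python
# Illustrative TD(0) sketch; all parameters and names are hypothetical.

def run_trial(values, reward_step, alpha=0.2, gamma=1.0, learn=True):
    """Run one trial and return the prediction error at each time step.

    values[t]   -- predicted future reward at step t after the cue
    reward_step -- step at which reward is delivered, or None if omitted
    """
    n = len(values)
    errors = []
    for t in range(n):
        reward = 1.0 if t == reward_step else 0.0
        next_value = values[t + 1] if t + 1 < n else 0.0
        # Prediction error: what happened minus what was predicted.
        delta = reward + gamma * next_value - values[t]
        errors.append(delta)
        if learn:
            values[t] += alpha * delta
    return errors


N_STEPS = 10      # assumed number of time steps per trial
USUAL_STEP = 5    # assumed usual time of reward

values = [0.0] * N_STEPS

# First trial: the reward is unpredicted, so the error at its time is large.
first = run_trial(values, USUAL_STEP)

# Repeated trials: the prediction builds up and the error shrinks.
for _ in range(200):
    trained = run_trial(values, USUAL_STEP)

# Probe trials without further learning.
omitted = run_trial(values, None, learn=False)   # reward omitted
delayed = run_trial(values, 7, learn=False)      # reward later than usual

print("first trial, reward time:  ", round(first[USUAL_STEP], 3))
print("after training, reward time:", round(trained[USUAL_STEP], 3))
print("omission, usual time:       ", round(omitted[USUAL_STEP], 3))
print("delayed, usual / new time:  ", round(delayed[USUAL_STEP], 3),
      "/", round(delayed[7], 3))
```

Under these assumptions the error at the reward time starts positive, falls toward zero as the reward becomes predicted, turns negative when the reward is omitted at the usual time, and is positive again when the reward arrives at an unpredicted time. This mirrors the qualitative pattern the abstract reports, but it makes no claim about the magnitudes or time courses of the recorded responses.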
Pages: 304-309
Page count: 6