Predictive reward signal of dopamine neurons

Cited by: 3049
Author
Schultz, W [1]
Affiliations
[1] Univ Fribourg, Inst Physiol, CH-1700 Fribourg, Switzerland
[2] Univ Fribourg, Program Neurosci, CH-1700 Fribourg, Switzerland
Keywords
DOI
10.1152/jn.1998.80.1.1
Chinese Library Classification number
Q189 [Neuroscience];
Discipline classification code
071006;
Abstract
The effects of lesions, receptor blocking, electrical self-stimulation, and drugs of abuse suggest that midbrain dopamine systems are involved in processing reward information and learning approach behavior. Most dopamine neurons show phasic activations after primary liquid and food rewards and conditioned, reward-predicting visual and auditory stimuli. They show biphasic, activation-depression responses after stimuli that resemble reward-predicting stimuli or are novel or particularly salient. However, only few phasic activations follow aversive stimuli. Thus dopamine neurons label environmental stimuli with appetitive value, predict and detect rewards and signal alerting and motivating events. By failing to discriminate between different rewards, dopamine neurons appear to emit an alerting message about the surprising presence or absence of rewards. All responses to rewards and reward-predicting stimuli depend on event predictability. Dopamine neurons are activated by rewarding events that are better than predicted, remain uninfluenced by events that are as good as predicted, and are depressed by events that are worse than predicted. By signaling rewards according to a prediction error, dopamine responses have the formal characteristics of a teaching signal postulated by reinforcement learning theories. Dopamine responses transfer during learning from primary rewards to reward-predicting stimuli. This may contribute to neuronal mechanisms underlying the retrograde action of rewards, one of the main puzzles in reinforcement learning. The impulse response releases a short pulse of dopamine onto many dendrites, thus broadcasting a rather global reinforcement signal to postsynaptic neurons. This signal may improve approach behavior by providing advance reward information before the behavior occurs, and may contribute to learning by modifying synaptic transmission. 
The dopamine reward signal is supplemented by activity in neurons in striatum, frontal cortex, and amygdala, which process specific reward information but do not emit a global reward prediction error signal. A cooperation between the different reward signals may assure the use of specific rewards for selectively reinforcing behaviors. Among the other projection systems, noradrenaline neurons predominantly serve attentional mechanisms and nucleus basalis neurons code rewards heterogeneously. Cerebellar climbing fibers signal errors in motor performance or errors in the prediction of aversive events to cerebellar Purkinje cells. Most deficits following dopamine-depleting lesions are not easily explained by a defective reward signal but may reflect the absence of a general enabling function of tonic levels of extracellular dopamine. Thus dopamine systems may have two functions, the phasic transmission of reward information and the tonic enabling of postsynaptic neurons.
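The prediction-error account in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the paper's own model: it assumes a temporal-difference (TD) learner, one common reinforcement learning formulation, in which each time step after cue onset has its own value estimate and the cue itself arrives unpredicted. The function name `run_trial`, the learning rate `alpha`, the discount factor `gamma`, and the five-step trial length are all illustrative assumptions.

```python
def run_trial(values, reward, alpha=0.2, gamma=1.0):
    """Run one cue -> delay -> outcome trial and update `values` in place.

    values[i] is the predicted future reward i steps after cue onset.
    Returns the prediction errors: index 0 is cue onset, the last index
    is the moment the reward is due.
    """
    n = len(values)
    # The cue arrives unpredicted, so the pre-cue prediction is taken as 0
    # and is not learned (an assumption of this sketch).
    deltas = [gamma * values[0]]
    for t in range(1, n + 1):
        r = reward if t == n else 0.0            # outcome only at the last step
        v_next = values[t] if t < n else 0.0     # nothing is predicted after the trial
        delta = r + gamma * v_next - values[t - 1]   # TD prediction error
        values[t - 1] += alpha * delta               # move prediction toward outcome
        deltas.append(delta)
    return deltas


values = [0.0] * 5                       # hypothetical 5-step cue-reward interval

first = run_trial(values, reward=1.0)    # naive: reward is better than predicted
for _ in range(500):                     # repeated cue-reward pairings
    trained = run_trial(values, reward=1.0)
omitted = run_trial(values, reward=0.0)  # predicted reward withheld

print("first trial   cue/reward error:", first[0], first[-1])
print("after training cue/reward error:", round(trained[0], 3), round(trained[-1], 3))
print("omission      reward-time error:", round(omitted[-1], 3))
```

Under these assumptions the error is positive at the reward on the first trial, moves to the cue after training while vanishing at the fully predicted reward, and turns negative at the expected reward time when the reward is omitted. This mirrors the activation, absence of response, and depression described in the abstract.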
Pages: 1-27
Page count: 27
Related papers
303 in total
[81]   PREDICTIVE CONTROL OF EYE-MOVEMENTS IN PARKINSON DISEASE [J].
FLOWERS, KA ;
DOWNING, AC .
ANNALS OF NEUROLOGY, 1978, 4 (01) :63-66
[82]   IMPULSE ACTIVITY OF LOCUS COERULEUS NEURONS IN AWAKE RATS AND MONKEYS IS A FUNCTION OF SENSORY STIMULATION AND AROUSAL [J].
FOOTE, SL ;
ASTON-JONES, G ;
BLOOM, FE .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA-BIOLOGICAL SCIENCES, 1980, 77 (05) :3033-3037
[83]   PIMOZIDE-INDUCED EXTINCTION OF INTRACRANIAL SELF-STIMULATION - RESPONSE PATTERNS RULE OUT MOTOR OR PERFORMANCE DEFICITS [J].
FOURIEZOS, G ;
WISE, RA .
BRAIN RESEARCH, 1976, 103 (02) :377-380
[84]   TYROSINE HYDROXYLASE IMMUNOREACTIVE BOUTONS IN SYNAPTIC CONTACT WITH IDENTIFIED STRIATONIGRAL NEURONS, WITH PARTICULAR REFERENCE TO DENDRITIC SPINES [J].
FREUND, TF ;
POWELL, JF ;
SMITH, AD .
NEUROSCIENCE, 1984, 13 (04) :1189-1215
[85]   DOPAMINERGIC ANTAGONISTS PREVENT LONG-TERM MAINTENANCE OF POSTTETANIC LTP IN THE CA1 REGION OF RAT HIPPOCAMPAL SLICES [J].
FREY, U ;
SCHROEDER, H ;
MATTHIES, H .
BRAIN RESEARCH, 1990, 522 (01) :69-75
[86]   VALUE-DEPENDENT SELECTION IN THE BRAIN - SIMULATION IN A SYNTHETIC NEURAL MODEL [J].
FRISTON, KJ ;
TONONI, G ;
REEKE, GN ;
SPORNS, O ;
EDELMAN, GM .
NEUROSCIENCE, 1994, 59 (02) :229-243
[87]   SPECIES RECOGNITION BY 5 MACAQUE MONKEYS [J].
FUJITA, K .
PRIMATES, 1987, 28 (03) :353-366
[88]   GLUTAMATERGIC AND CHOLINERGIC INPUTS FROM THE PEDUNCULOPONTINE TEGMENTAL NUCLEUS TO DOPAMINE NEURONS IN THE SUBSTANTIA-NIGRA PARS COMPACTA [J].
FUTAMI, T ;
TAKAKUSAKI, K ;
KITAI, ST .
NEUROSCIENCE RESEARCH, 1995, 21 (04) :331-342
[89]   Gallistel C. R., 1990, ORG LEARNING
[90]   MODEL PREDICTIVE CONTROL - THEORY AND PRACTICE - A SURVEY [J].
GARCIA, CE ;
PRETT, DM ;
MORARI, M .
AUTOMATICA, 1989, 25 (03) :335-348