Instrumental vigour in punishment and reward

Cited by: 55
Author
Dayan, Peter [1]
Affiliation
[1] UCL, Gatsby Computat Neurosci Unit, London WC1N 3AR, England
Keywords
dopamine; reinforcement learning; safety; serotonin; two-factor theory; DYNAMIC BEHAVIORAL-CHANGES; NUCLEUS-ACCUMBENS DOPAMINE; TEMPORAL DIFFERENCE MODELS; RAPHE SEROTONIN NEURONS; BASAL GANGLIA; REINFORCEMENT; PREDICTION; DORSAL; MODULATION; ACTIVATION;
DOI
10.1111/j.1460-9568.2012.08026.x
Chinese Library Classification number
Q189 [Neuroscience];
Discipline classification code
071006 ;
Abstract
Recent notions about the vigour of responding in operant conditioning suggest that the long-run average rate of reward should control the alacrity of action in cases in which the actual cost of speed is balanced against the opportunity cost of sloth. The average reward rate is suggested as being reported by tonic activity in the dopamine system and thereby influencing all actions, including ones that do not themselves lead directly to the rewards. This idea is syntactically problematical for the case of punishment. Here, we broaden the scope of the original suggestion, providing a two-factor analysis of obviated punishment in a variety of operant circumstances. We also consider the effects of stochastically successful actions, which turn out to differ rather markedly between appetitive and aversive cases. Finally, we study how to fit these ideas into nascent treatments that extend concepts of opponency between dopamine and serotonin from valence to invigoration.
Pages: 1152-1168
Number of pages: 17