Instrumental vigour in punishment and reward

Cited by: 55
Authors
Dayan, Peter [1]
Affiliations
[1] UCL, Gatsby Computat Neurosci Unit, London WC1N 3AR, England
Keywords
dopamine; reinforcement learning; safety; serotonin; two-factor theory; DYNAMIC BEHAVIORAL-CHANGES; NUCLEUS-ACCUMBENS DOPAMINE; TEMPORAL DIFFERENCE MODELS; RAPHE SEROTONIN NEURONS; BASAL GANGLIA; REINFORCEMENT; PREDICTION; DORSAL; MODULATION; ACTIVATION;
DOI
10.1111/j.1460-9568.2012.08026.x
Chinese Library Classification (CLC) number
Q189 [Neuroscience];
Discipline classification code
071006;
Abstract
Recent notions about the vigour of responding in operant conditioning suggest that the long-run average rate of reward should control the alacrity of action in cases in which the actual cost of speed is balanced against the opportunity cost of sloth. The average reward rate is suggested as being reported by tonic activity in the dopamine system and thereby influencing all actions, including ones that do not themselves lead directly to the rewards. This idea is syntactically problematical for the case of punishment. Here, we broaden the scope of the original suggestion, providing a two-factor analysis of obviated punishment in a variety of operant circumstances. We also consider the effects of stochastically successful actions, which turn out to differ rather markedly between appetitive and aversive cases. Finally, we study how to fit these ideas into nascent treatments that extend concepts of opponency between dopamine and serotonin from valence to invigoration.
Pages: 1152-1168
Number of pages: 17