Instrumental vigour in punishment and reward

Cited by: 55
Author
Dayan, Peter [1]
Affiliation
[1] UCL, Gatsby Computat Neurosci Unit, London WC1N 3AR, England
Keywords
dopamine; reinforcement learning; safety; serotonin; two-factor theory; DYNAMIC BEHAVIORAL-CHANGES; NUCLEUS-ACCUMBENS DOPAMINE; TEMPORAL DIFFERENCE MODELS; RAPHE SEROTONIN NEURONS; BASAL GANGLIA; REINFORCEMENT; PREDICTION; DORSAL; MODULATION; ACTIVATION
DOI
10.1111/j.1460-9568.2012.08026.x
Chinese Library Classification number
Q189 [Neuroscience]
Discipline classification code
071006
Abstract
Recent notions about the vigour of responding in operant conditioning suggest that the long-run average rate of reward should control the alacrity of action in cases in which the actual cost of speed is balanced against the opportunity cost of sloth. The average reward rate is suggested as being reported by tonic activity in the dopamine system and thereby influencing all actions, including ones that do not themselves lead directly to the rewards. This idea is syntactically problematical for the case of punishment. Here, we broaden the scope of the original suggestion, providing a two-factor analysis of obviated punishment in a variety of operant circumstances. We also consider the effects of stochastically successful actions, which turn out to differ rather markedly between appetitive and aversive cases. Finally, we study how to fit these ideas into nascent treatments that extend concepts of opponency between dopamine and serotonin from valence to invigoration.
Pages: 1152-1168
Page count: 17