Dopamine: generalization and bonuses

Cited by: 309
Authors
Kakade, S [1]
Dayan, P [1]
Affiliation
[1] UCL, Gatsby Computat Neurosci Unit, London WC1N 3AR, England
Funding
National Science Foundation (USA);
Keywords
dopamine; reinforcement learning; exploration; temporal difference; generalization;
DOI
10.1016/S0893-6080(02)00048-5
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In the temporal difference model of primate dopamine neurons, their phasic activity reports a prediction error for future reward. This model is supported by a wealth of experimental data. However, in certain circumstances, the activity of the dopamine cells seems anomalous under the model, as they respond in particular ways to stimuli that are not obviously related to predictions of reward. In this paper, we address two important sets of anomalies, those having to do with generalization and novelty. Generalization responses are treated as the natural consequence of partial information; novelty responses are treated by the suggestion that dopamine cells multiplex information about reward bonuses, including exploration bonuses and shaping bonuses. We interpret this additional role for dopamine in terms of the mechanistic attentional and psychomotor effects of dopamine, having the computational role of guiding exploration. (C) 2002 Published by Elsevier Science Ltd.
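The abstract's central idea, a temporal difference prediction error to which a bonus is added, can be illustrated with a minimal Python sketch. This is not the authors' model: the count-based form of `novelty_bonus`, the chain task in `run_chain`, and every parameter value are assumptions chosen only for illustration. The sketch shows how an additive bonus can produce a positive error signal on first exposure to a stimulus even when no reward has been delivered.

```python
from collections import defaultdict


def td_error(reward, value_next, value_now, gamma=0.9, bonus=0.0):
    """TD(0) prediction error with an optional additive bonus on the reward."""
    return reward + bonus + gamma * value_next - value_now


def novelty_bonus(visits, scale=1.0):
    """Hypothetical count-based bonus that shrinks as a state becomes familiar."""
    return scale / (1.0 + visits)


def run_chain(n_states=3, episodes=200, alpha=0.1, gamma=0.9, use_bonus=True):
    """Walk a deterministic chain 0 -> 1 -> ... -> terminal, reward 1 at the end.

    Returns the learned values and the TD error observed at state 0
    on every episode.
    """
    values = defaultdict(float)
    visits = defaultdict(int)
    start_errors = []
    for _ in range(episodes):
        for s in range(n_states):
            s_next = s + 1
            terminal = s_next == n_states
            reward = 1.0 if terminal else 0.0
            # Assumption: the bonus depends on how often the next state was seen.
            bonus = novelty_bonus(visits[s_next]) if use_bonus else 0.0
            v_next = 0.0 if terminal else values[s_next]
            delta = td_error(reward, v_next, values[s], gamma, bonus)
            values[s] += alpha * delta
            visits[s_next] += 1
            if s == 0:
                start_errors.append(delta)
    return values, start_errors


if __name__ == "__main__":
    _, with_bonus = run_chain(use_bonus=True)
    _, without_bonus = run_chain(use_bonus=False)
    print("first-episode error at state 0 with bonus:", with_bonus[0])
    print("first-episode error at state 0 without bonus:", without_bonus[0])
```

In this sketch the bonus decays with repeated visits, so its contribution to the error fades as states become familiar. That decay is a property of the assumed bonus function, not a claim about the paper's results.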
Pages: 549-559
Page count: 11