Midbrain dopamine neurons encode a quantitative reward prediction error signal

Cited by: 857
Authors
Bayer, HM
Glimcher, PW [1]
Affiliations
[1] NYU, Ctr Neural Sci, New York, NY 10003 USA
[2] Columbia Univ, Ctr Decis Sci, New York, NY 10027 USA
DOI
10.1016/j.neuron.2005.05.020
Chinese Library Classification number
Q189 [Neuroscience];
Discipline classification code
071006;
Abstract
The midbrain dopamine neurons are hypothesized to provide a physiological correlate of the reward prediction error signal required by current models of reinforcement learning. We examined the activity of single dopamine neurons during a task in which subjects learned by trial and error when to make an eye movement for a juice reward. We found that these neurons encoded the difference between the current reward and a weighted average of previous rewards, a reward prediction error, but only for outcomes that were better than expected. Thus, the firing rate of midbrain dopamine neurons is quantitatively predicted by theoretical descriptions of the reward prediction error signal used in reinforcement learning models for circumstances in which this signal has a positive value. We also found that the dopamine system continued to compute the reward prediction error even when the behavioral policy of the animal was only weakly influenced by this computation.
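As a rough illustration of the computation described in the abstract, the sketch below takes the reward prediction error to be the current reward minus a weighted average of previous rewards, then keeps only the positive part, since the abstract reports quantitative encoding only for better-than-expected outcomes. The exponential weighting, the learning rate `alpha`, the starting expectation, and the function name are assumptions made for illustration; they are not taken from the paper.

```python
def reward_prediction_errors(rewards, alpha=0.5, initial_expectation=0.0):
    """Return (errors, rectified) for a sequence of rewards.

    errors[i]    = rewards[i] minus a weighted average of earlier rewards
    rectified[i] = max(0, errors[i]), i.e. only the positive errors
    """
    expectation = initial_expectation  # assumed starting expectation
    errors = []
    for reward in rewards:
        error = reward - expectation
        errors.append(error)
        # Exponentially weighted average: recent rewards count more
        # (an assumed weighting scheme, chosen for simplicity).
        expectation += alpha * error
    rectified = [max(0.0, e) for e in errors]
    return errors, rectified


errors, rectified = reward_prediction_errors([1.0, 1.0, 0.0], alpha=0.5)
print(errors)
print(rectified)
```

In this toy run the third trial delivers less than expected, so its signed error is negative while its rectified value is zero.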
Pages: 129-141
Page count: 13