Neural Basis of Reinforcement Learning and Decision Making

Cited: 313
Authors
Lee, Daeyeol [1,2]
Seo, Hyojung [1]
Jung, Min Whan [3]
Affiliations
[1] Yale Univ, Sch Med, Dept Neurobiol, Kavli Inst Neurosci, New Haven, CT 06510 USA
[2] Yale Univ, Dept Psychol, New Haven, CT 06520 USA
[3] Ajou Univ, Neurosci Lab, Inst Med Sci, Sch Med, Suwon 443721, South Korea
Source
ANNUAL REVIEW OF NEUROSCIENCE, VOL 35 | 2012 / Vol. 35
Keywords
prefrontal cortex; neuroeconomics; reward; striatum; uncertainty; LATERAL INTRAPARIETAL CORTEX; TEMPORALLY DISCOUNTED VALUES; POSTERIOR PARIETAL CORTEX; SACCADIC EYE-MOVEMENTS; ORBITOFRONTAL CORTEX; PREFRONTAL CORTEX; BASAL GANGLIA; REWARD SIGNALS; PREDICTION ERRORS; NEURONAL-ACTIVITY;
DOI
10.1146/annurev-neuro-062111-150512
Chinese Library Classification
Q189 [Neuroscience];
Subject Classification Code
071006;
Abstract
Reinforcement learning is an adaptive process in which an animal utilizes its previous experience to improve the outcomes of future choices. Computational theories of reinforcement learning play a central role in the newly emerging areas of neuroeconomics and decision neuroscience. In this framework, actions are chosen according to their value functions, which describe how much future reward is expected from each action. Value functions can be adjusted not only through reward and penalty, but also by the animal's knowledge of its current environment. Studies have revealed that a large proportion of the brain is involved in representing and updating value functions and using them to choose an action. However, how the nature of a behavioral task affects the neural mechanisms of reinforcement learning remains incompletely understood. Future studies should uncover the principles by which different computational elements of reinforcement learning are dynamically coordinated across the entire brain.
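The framework summarized in the abstract — actions chosen according to value functions that estimate expected future reward, with values updated by reward and penalty — can be illustrated with a minimal action-value learner. This is a generic sketch, not code from the review; the learning rate `alpha` and inverse temperature `beta` are illustrative parameters:

```python
import math
import random

def softmax_choice(values, beta=1.0):
    # Softmax (logit) choice rule: action i is chosen with probability
    # exp(beta * values[i]) / sum_j exp(beta * values[j]).
    weights = [math.exp(beta * v) for v in values]
    r = random.random() * sum(weights)
    cumulative = 0.0
    for i, w in enumerate(weights):
        cumulative += w
        if r <= cumulative:
            return i
    return len(values) - 1  # guard against floating-point round-off

def update_value(values, action, reward, alpha=0.1):
    # Reward prediction error: actual outcome minus the current estimate.
    rpe = reward - values[action]
    # Simple Rescorla-Wagner / Q-learning style incremental update.
    values[action] += alpha * rpe
    return rpe
```

In a simulated two-armed bandit, calling `softmax_choice` each trial and then `update_value` with the delivered reward gradually shifts choices toward the richer option, mirroring the value-update-then-choose loop described above.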
Pages: 287-308
Page count: 22
Related Papers
149 records in total
  • [1] Distributed Coding of Actual and Hypothetical Outcomes in the Orbital and Dorsolateral Prefrontal Cortex
    Abe, Hiroshi
    Lee, Daeyeol
    [J]. NEURON, 2011, 70 (04) : 731 - 741
  • [2] ANDERSEN RA, 1987, EXP BRAIN RES, V67, P316
  • [3] [Anonymous], 1968, INFORM THEORY CHOICE, DOI 10.1002/BS.3830140408
  • [4] Encoding of Both Positive and Negative Reward Prediction Errors by Neurons of the Primate Lateral Prefrontal Cortex and Caudate Nucleus
    Asaad, Wael F.
    Eskandar, Emad N.
    [J]. JOURNAL OF NEUROSCIENCE, 2011, 31 (49) : 17772 - 17787
  • [5] Goal-directed instrumental action: contingency and incentive learning and their cortical substrates
    Balleine, BW
    Dickinson, A
    [J]. NEUROPHARMACOLOGY, 1998, 37 (4-5) : 407 - 419
  • [6] Prefrontal cortex and decision making in a mixed-strategy game
    Barraclough, DJ
    Conroy, ML
    Lee, D
    [J]. NATURE NEUROSCIENCE, 2004, 7 (04) : 404 - 410
  • [7] Probabilistic Population Codes for Bayesian Decision Making
    Beck, Jeffrey M.
    Ma, Wei Ji
    Kiani, Roozbeh
    Hanks, Tim
    Churchland, Anne K.
    Roitman, Jamie
    Shadlen, Michael N.
    Latham, Peter E.
    Pouget, Alexandre
    [J]. NEURON, 2008, 60 (06) : 1142 - 1152
  • [8] Associative learning of social value
    Behrens, Timothy E. J.
    Hunt, Laurence T.
    Woolrich, Mark W.
    Rushworth, Matthew F. S.
    [J]. NATURE, 2008, 456 (7219) : 245 - U45
  • [9] Learning the value of information in an uncertain world
    Behrens, Timothy E. J.
    Woolrich, Mark W.
    Walton, Mark E.
    Rushworth, Matthew F. S.
    [J]. NATURE NEUROSCIENCE, 2007, 10 (09) : 1214 - 1221
  • [10] Moment-to-moment tracking of state value in the amygdala
    Belova, Marina A.
    Paton, Joseph J.
    Salzman, C. Daniel
    [J]. JOURNAL OF NEUROSCIENCE, 2008, 28 (40) : 10023 - 10030