Decision theory, reinforcement learning, and the brain

Cited by: 360
Authors
Dayan, Peter [1]
Daw, Nathaniel D. [2]
Affiliations
[1] UCL, Gatsby Computat Neurosci Unit, London WC1N 3AR, England
[2] NYU, New York, NY USA
DOI
10.3758/CABN.8.4.429
Chinese Library Classification
B84 [Psychology]; C [Social Sciences, General]; Q98 [Anthropology]
Discipline classification codes
03; 0303; 030303; 04; 0402
Abstract
Decision making is a core competence for animals and humans acting and surviving in environments they only partially comprehend, gaining rewards and punishments for their troubles. Decision-theoretic concepts permeate experiments and computational models in ethology, psychology, and neuroscience. Here, we review a well-known, coherent Bayesian approach to decision making, showing how it unifies issues in Markovian decision problems, signal detection psychophysics, sequential sampling, and optimal exploration, and we discuss paradigmatic psychological and neural examples of each problem. We discuss computational issues concerning what subjects know about their task and how ambitious they are in seeking optimal solutions; we address algorithmic topics concerning model-based and model-free methods for making choices; and we highlight key aspects of the neural implementation of decision making.
Pages: 429-453
Page count: 25