Cortical substrates for exploratory decisions in humans

Cited by: 1464
Authors
Daw, Nathaniel D.
O'Doherty, John P.
Dayan, Peter
Seymour, Ben
Dolan, Raymond J.
Affiliations
[1] UCL, Gatsby Computat Neurosci Unit, London WC1N 3AR, England
[2] UCL, Wellcome Dept Imaging Neurosci, London WC1N 3BG, England
Funding
Wellcome Trust (UK)
DOI
10.1038/nature04766
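Chinese Library Classification (CLC)
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07 [Science]; 0710 [Biology]; 09 [Agronomy]
Abstract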
Decision making in an uncertain environment poses a conflict between the opposing demands of gathering and exploiting information. In a classic illustration of this 'exploration-exploitation' dilemma(1), a gambler choosing between multiple slot machines balances the desire to select what seems, on the basis of accumulated experience, the richest option, against the desire to choose a less familiar option that might turn out more advantageous (and thereby provide information for improving future decisions). Far from representing idle curiosity, such exploration is often critical for organisms to discover how best to harvest resources such as food and water. In appetitive choice, substantial experimental evidence, underpinned by computational reinforcement learning(2) (RL) theory, indicates that a dopaminergic(3,4), striatal(5-9) and medial prefrontal network mediates learning to exploit. In contrast, although exploration has been well studied from both theoretical(1) and ethological(10) perspectives, its neural substrates are much less clear. Here we show, in a gambling task, that human subjects' choices can be characterized by a computationally well-regarded strategy for addressing the explore/exploit dilemma. Furthermore, using this characterization to classify decisions as exploratory or exploitative, we employ functional magnetic resonance imaging to show that the frontopolar cortex and intraparietal sulcus are preferentially active during exploratory decisions. In contrast, regions of striatum and ventromedial prefrontal cortex exhibit activity characteristic of an involvement in value-based exploitative decision making. The results suggest a model of action selection under uncertainty that involves switching between exploratory and exploitative behavioural modes, and provide a computationally precise characterization of the contribution of key decision-related brain systems to each of these functions.
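The slot-machine dilemma described in the abstract can be illustrated with a small simulation. The abstract does not name the strategy that was fitted to subjects' choices, so the sketch below assumes a softmax choice rule over running value estimates, which is one common approach to the explore/exploit trade-off. Every name and number in it (`softmax_probs`, `choose`, `run_bandit`, `beta`, `alpha`, the payoff means and noise) is hypothetical and is not taken from the paper. A choice is labelled "exploit" when it picks the option with the highest current estimate and "explore" otherwise.

```python
import math
import random


def softmax_probs(values, beta):
    """Softmax choice probabilities over estimated values (assumed rule)."""
    m = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(beta * (v - m)) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def choose(values, beta, rng):
    """Sample one option and label the choice as exploit or explore."""
    probs = softmax_probs(values, beta)
    arm = rng.choices(range(len(values)), weights=probs, k=1)[0]
    greedy = max(range(len(values)), key=lambda i: values[i])
    label = "exploit" if arm == greedy else "explore"
    return arm, label


def run_bandit(true_means, n_trials=200, beta=0.2, alpha=0.1, seed=0):
    """Simulate a bandit task with hypothetical Gaussian payoffs."""
    rng = random.Random(seed)
    values = [50.0] * len(true_means)  # arbitrary initial estimates
    history = []
    for _ in range(n_trials):
        arm, label = choose(values, beta, rng)
        reward = rng.gauss(true_means[arm], 4.0)  # assumed payoff noise
        # Simple delta-rule update of the chosen option's estimate.
        values[arm] += alpha * (reward - values[arm])
        history.append((arm, label, reward))
    return values, history


if __name__ == "__main__":
    _, history = run_bandit([40.0, 55.0, 60.0, 45.0])
    n_explore = sum(1 for _, label, _ in history if label == "explore")
    print(f"exploratory choices: {n_explore} of {len(history)}")
```

A larger `beta` makes choices more strictly value-driven, while a smaller `beta` yields more exploratory choices. The trial-by-trial labels are the kind of classification that could then be related to imaging data.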
Pages: 876-879
Page count: 4
Related papers
30 in total
[1]
Midbrain dopamine neurons encode a quantitative reward prediction error signal [J].
Bayer, HM ;
Glimcher, PW .
NEURON, 2005, 47 (01) :129-141
[2]
The role of frontopolar cortex in subgoal processing during working memory [J].
Braver, TS ;
Bongiolatti, SR .
NEUROIMAGE, 2002, 15 (03) :523-536
[3]
The cognitive and neuroanatomical correlates of multitasking [J].
Burgess, PW ;
Veitch, E ;
Costello, AD ;
Shallice, T .
NEUROPSYCHOLOGIA, 2000, 38 (06) :848-863
[4]
Optimal foraging, marginal value theorem [J].
Charnov, EL .
THEORETICAL POPULATION BIOLOGY, 1976, 9 (02) :129-136
[5]
Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control [J].
Daw, ND ;
Niv, Y ;
Dayan, P .
NATURE NEUROSCIENCE, 2005, 8 (12) :1704-1711
[6]
Tracking the hemodynamic responses to reward and punishment in the striatum [J].
Delgado, MR ;
Nystrom, LE ;
Fissell, C ;
Noll, DC ;
Fiez, JA .
JOURNAL OF NEUROPHYSIOLOGY, 2000, 84 (06) :3072-3077
[7]
Activity in posterior parietal cortex is correlated with the relative subjective desirability of action [J].
Dorris, MC ;
Glimcher, PW .
NEURON, 2004, 44 (02) :365-378
[8]
Metalearning and neuromodulation [J].
Doya, K .
NEURAL NETWORKS, 2002, 15 (4-6) :495-506
[9]
Gittins, John. PROGR STAT, 1974: 241-266
[10]
Encoding predictive reward value in human amygdala and orbitofrontal cortex [J].
Gottfried, JA ;
O'Doherty, J ;
Dolan, RJ .
SCIENCE, 2003, 301 (5636) :1104-1107