Should I stay or should I go? How the human brain manages the trade-off between exploitation and exploration

Cited by: 727
Authors
Cohen, Jonathan D. [1]
McClure, Samuel M.
Yu, Angela J.
Affiliations
[1] Princeton Univ, Dept Psychol, Princeton, NJ 08540 USA
[2] Princeton Univ, Ctr Study Brain Mind & Behav, Princeton, NJ 08540 USA
[3] Univ Pittsburgh, Dept Psychiat, Pittsburgh, PA 15213 USA
Keywords
exploration; uncertainty; learning; neurotransmitters; prefrontal cortex; decision making
DOI
10.1098/rstb.2007.2098
Chinese Library Classification number
Q [Biological sciences]
Discipline classification code
07; 0710; 09
Abstract
Many large and small decisions we make in our daily lives - which ice cream to choose, what research projects to pursue, which partner to marry - require an exploration of alternatives before committing to and exploiting the benefits of a particular choice. Furthermore, many decisions require re-evaluation, and further exploration of alternatives, in the face of changing needs or circumstances. That is, often our decisions depend on a higher level choice: whether to exploit well known but possibly suboptimal alternatives or to explore risky but potentially more profitable ones. How adaptive agents choose between exploitation and exploration remains an important and open question that has received relatively limited attention in the behavioural and brain sciences. The choice could depend on a number of factors, including the familiarity of the environment, how quickly the environment is likely to change and the relative value of exploiting known sources of reward versus the cost of reducing uncertainty through exploration. There is no known generally optimal solution to the exploration versus exploitation problem, and a solution to the general case may indeed not be possible. However, there have been formal analyses of the optimal policy under constrained circumstances. There have also been specific suggestions of how humans and animals may respond to this problem under particular experimental conditions as well as proposals about the brain mechanisms involved. Here, we provide a brief review of this work, discuss how exploration and exploitation may be mediated in the brain and highlight some promising future directions for research.
Pages: 933-942
Page count: 10