Relative entropy in sequential decision problems

Cited by: 18
Authors
Lehrer, E [1]
Smorodinsky, R
Affiliations
[1] Tel Aviv Univ, Raymond & Beverly Sackler Fac Exact Sci, Sch Math, IL-69978 Tel Aviv, Israel
[2] Technion Israel Inst Technol, IL-32000 Haifa, Israel
Funding
National Science Foundation (USA)
Keywords
relative entropy; sequential decision problems; optimization
DOI
10.1016/S0304-4068(99)00027-0
Chinese Library Classification (CLC) number
F [Economics]
Subject classification code
02
Abstract
Consider an agent who faces a sequential decision problem. At each stage the agent takes an action and observes a stochastic outcome (e.g., daily prices, weather conditions, opponents' actions in a repeated game, etc.). The agent's stage-utility depends on his action, the observed outcome and on previous outcomes. We assume the agent is Bayesian and is endowed with a subjective belief over the distribution of outcomes. The agent's initial belief is typically inaccurate. Therefore, his subjectively optimal strategy is initially suboptimal. As time passes information about the true dynamics is accumulated and, depending on the compatibility of the belief with respect to the truth, the agent may eventually learn to optimize. We introduce the notion of relative entropy, which is a natural adaptation of the entropy of a stochastic process to the subjective set-up. We present conditions, expressed in terms of relative entropy, that determine whether the agent will eventually learn to optimize. It is shown that low entropy yields asymptotic optimal behavior. In addition, we present a notion of pointwise merging and link it with relative entropy. (C) 2000 Elsevier Science S.A. All rights reserved.
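The learning dynamic described in the abstract can be illustrated with a small sketch. It is not the paper's construction: it assumes a deliberately simple setting (i.i.d. binary outcomes, a Beta prior) and uses the standard Kullback-Leibler divergence between the true one-step distribution and the agent's one-step prediction as a stand-in for a relative-entropy measure. The names `bernoulli_kl` and `simulate`, and all parameter values, are hypothetical.

```python
import math
import random


def bernoulli_kl(p, q):
    """Standard KL divergence D(Bernoulli(p) || Bernoulli(q)), in nats.

    Assumes 0 < q < 1 so that both logarithms are finite.
    """
    total = 0.0
    for pi, qi in ((p, q), (1.0 - p, 1.0 - q)):
        if pi > 0.0:
            total += pi * math.log(pi / qi)
    return total


def simulate(true_p=0.7, prior_heads=1.0, prior_tails=1.0, stages=2000, seed=0):
    """Track the per-stage divergence between truth and a Bayesian prediction.

    The agent holds a Beta(prior_heads, prior_tails) belief over the unknown
    success probability and predicts each outcome with the posterior mean.
    """
    rng = random.Random(seed)
    heads, tails = prior_heads, prior_tails
    divergences = []
    for _ in range(stages):
        belief = heads / (heads + tails)  # one-step predictive probability
        divergences.append(bernoulli_kl(true_p, belief))
        # Observe the outcome drawn from the true process and update the belief.
        if rng.random() < true_p:
            heads += 1.0
        else:
            tails += 1.0
    return divergences


divergences = simulate()
first_avg = sum(divergences[:10]) / 10
last_avg = sum(divergences[-10:]) / 10
print("mean divergence, first 10 stages:", first_avg)
print("mean divergence, last 10 stages:", last_avg)
```

In this toy case the prior assigns positive weight near the true parameter, so the one-step predictions approach the truth and the divergence shrinks. The paper's conditions concern general stochastic processes and are not reproduced here.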
Pages: 425-439
Page count: 15