Reinforcement learning rules in a repeated game

Cited by: 13
Author
Bell A.M. [1]
Affiliation
[1] Orbital Sciences, NASA Ames Research Center, Mail Stop 239, Moffett Field
Keywords
Complex systems; Learning in games; Reinforcement learning
DOI
10.1023/A:1013818611576
Abstract
This paper examines the performance of simple reinforcement learning algorithms in a stationary environment and in a repeated game where the environment evolves endogenously based on the actions of other agents. Some types of reinforcement learning rules can be extremely sensitive to small changes in the initial conditions; consequently, events early in a simulation can affect the performance of the rule over a relatively long time horizon. However, when multiple adaptive agents interact, algorithms that performed poorly in a stationary environment often converge rapidly to stable aggregate behavior despite the slow and erratic behavior of individual learners. Algorithms that are robust in stationary environments can exhibit slow convergence in an evolving environment.
Pages: 89-110
Page count: 21