On the convergence of reinforcement learning

Cited by: 115

Authors:

Beggs, AW [1]

Affiliation:

[1] Univ Oxford Wadham Coll, Oxford OX1 3PN, England

Source:

JOURNAL OF ECONOMIC THEORY | 2005 / Vol. 122 / Issue 01

Keywords:

reinforcement learning; games;

DOI:

10.1016/j.jet.2004.03.008

Chinese Library Classification:

F [Economics];

Subject classification code:

02 ;

Abstract:

This paper examines the convergence of payoffs and strategies in Erev and Roth's model of reinforcement learning. When all players use this rule, it eliminates iteratively dominated strategies, and in two-person constant-sum games average payoffs converge to the value of the game. Strategies converge in constant-sum games with unique equilibria if they are pure or if they are mixed and the game is 2 × 2. The long-run behaviour of the learning rule is governed by equations related to Maynard Smith's version of the replicator dynamic. Properties of the learning rule against general opponents are also studied. © 2004 Elsevier Inc. All rights reserved.


Pages: 1-36

Page count: 36
