Why imitate, and if so, how? A boundedly rational approach to multi-armed bandits

Cited by: 464

Author:

Schlag, KH [1]

Affiliation:

[1] Univ Bonn, D-53113 Bonn, Germany

Source:

JOURNAL OF ECONOMIC THEORY | 1998 / Vol. 78 / Issue 01

DOI:

10.1006/jeth.1997.2347

Chinese Library Classification (CLC) number:

F [Economics];

Subject classification code:

02;

Abstract:

Individuals in a finite population repeatedly choose among actions yielding uncertain payoffs. Between choices, each individual observes the action and realized outcome of one other individual. We restrict our search to learning rules with limited memory that increase expected payoffs regardless of the distribution underlying their realizations. It is shown that the rule that outperforms all others is that which imitates the action of an observed individual (whose realized outcome is better than self) with a probability proportional to the difference in these realizations. When each individual uses this best rule, the aggregate population behavior is approximated by the replicator dynamic. © 1998 Academic Press.
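
The imitation rule described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names, the bounds `low` and `high`, and the choice of `1 / (high - low)` as the proportionality factor are assumptions made for illustration. The sketch assumes that payoffs always lie in a known interval `[low, high]`, so that the switching probability stays between 0 and 1.

```python
import random


def switch_probability(own_payoff, observed_payoff, low, high):
    """Probability of adopting the observed action.

    Assumption: payoffs lie in [low, high]; the factor 1 / (high - low)
    is one possible proportionality constant, chosen for illustration.
    """
    if observed_payoff <= own_payoff:
        return 0.0  # never imitate someone who did no better
    return (observed_payoff - own_payoff) / (high - low)


def imitation_round(actions, draw_payoff, low, high, rng):
    """Run one round: everyone plays, observes one other individual, maybe switches.

    actions     -- list with the current action of each individual
    draw_payoff -- hypothetical callback (action, rng) -> realized payoff
    """
    n = len(actions)
    payoffs = [draw_payoff(a, rng) for a in actions]
    new_actions = list(actions)
    for i in range(n):
        # Sample one *other* individual uniformly at random.
        j = rng.randrange(n - 1)
        if j >= i:
            j += 1
        p = switch_probability(payoffs[i], payoffs[j], low, high)
        if rng.random() < p:
            new_actions[i] = actions[j]
    return new_actions


if __name__ == "__main__":
    # Hypothetical two-armed bandit: arm 1 pays 1 more often than arm 0.
    success = {0: 0.3, 1: 0.7}

    def bernoulli(action, rng):
        return 1.0 if rng.random() < success[action] else 0.0

    rng = random.Random(0)
    population = [0] * 100 + [1] * 100
    for _ in range(50):
        population = imitation_round(population, bernoulli, 0.0, 1.0, rng)
    print("share playing arm 1:", sum(population) / len(population))
```

Each individual keeps only its own latest payoff and the single observation from the current round, which matches the limited-memory setting the abstract describes. All switches within a round are computed from the actions held at the start of that round.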

Pages: 130-156

Number of pages: 27

References: 26 in total

[1] A SIMPLE-MODEL OF HERD BEHAVIOR [J].