5 entries in total
- [1] Agrawal R. (1995) Sample mean based index policies with O(log n) regret for the multi-armed bandit problem. Advances in Applied Probability 27, 1054-1078
- [2] Burnetas A., Katehakis M. (1996) Optimal adaptive policies for sequential allocation problems. Advances in Applied Mathematics 17, 122-142
- [3] Ishikida T., Varaiya P. (1994) Multi-armed bandit problem revisited. Journal of Optimization Theory and Applications 83, 113-154
- [4] Lai T., Robbins H. (1985) Asymptotically efficient adaptive allocation rules. Advances in Applied Mathematics 6, 4-22
- [5] Yakowitz S., Lowe W. (1991) Nonparametric bandit methods. Annals of Operations Research 28, 297-312