32 entries in total
- [3] Learning from uniformly ergodic Markov chains [J]. JOURNAL OF COMPLEXITY, 2009, 25 (02) : 188 - 200
- [7] Technical update: Least-squares temporal difference learning [J]. MACHINE LEARNING, 2002, 49 (2-3) : 233 - 246
- [9] Elevator Group Control Using Multiple Reinforcement Learning Agents [J]. Robert H. Crites; Andrew G. Barto. Machine Learning, 1998, 2
- [10] Incremental multi-step Q-learning [J]. Jing Peng; Ronald J. Williams. Machine Learning, 1996, 1