10 entries in total
[1]
Reinforcement Learning Theory and Applications (强化学习理论及应用)[M]. Zhang Rubo (ed.). Harbin Engineering University Press. 2001
[2]
Recent Advances in Hierarchical Reinforcement Learning[J]. Andrew G. Barto, Sridhar Mahadevan. Discrete Event Dynamic Systems. 2003 (1)
[4]
Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning[J]. Richard S. Sutton, Doina Precup, Satinder Singh. Artificial Intelligence. 1999 (1)
[6]
A feedback control structure for on-line learning tasks[J]. Manfred Huber, Roderic A. Grupen. Robotics and Autonomous Systems. 1997 (3)
[8]
Q-learning[J]. Christopher J. C. H. Watkins, Peter Dayan. Machine Learning. 1992 (3)
[9]
Learning to predict by the methods of temporal differences[J]. Richard S. Sutton. Machine Learning. 1988 (1)
[10]
The Complexity of Decentralized Control of Markov Decision Processes[C]. Bernstein D, Zilberstein S, Immerman N. Proc of the 16th Conference on Uncertainty in Artificial Intelligence. 2000