2 entries in total
[1]
Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning[J]. Richard S. Sutton, Doina Precup, Satinder Singh. Artificial Intelligence. 1999 (1)
[2]
A Heuristic Approach to the Discovery of Macro-Operators[J]. Glenn A. Iba. Machine Learning. 1989 (4)