2 entries in total
[1]
Technical Note: Q-Learning [J]. Christopher J.C.H. Watkins, Peter Dayan. Machine Learning. 1992 (3)
[2]
Learning to predict by the methods of temporal differences [J]. Richard S. Sutton. Machine Learning. 1988 (1)