6 entries in total
[2]
Technical Note: Q-Learning [J]. Christopher J.C.H. Watkins; Peter Dayan. Machine Learning. 1992, 3
[3]
Learning to predict by the methods of temporal differences [J]. Richard S. Sutton. Machine Learning. 1988, 1
[4]
[5]
[6]

