19 entries in total
[1]
[2]
[3]
Exploring Deep Reinforcement Learning with Multi Q-Learning[J] Ethan Duryea;Michael Ganger;Wei Hu Intelligent Control and Automation 2016,
[4]
Real-time reinforcement learning by sequential Actor–Critics and experience replay[J] Neural Networks 2009,
[5]
Long Short-Term Memory[J] Sepp Hochreiter;Jürgen Schmidhuber Neural Computation 1997,
[6]
Simple statistical gradient-following algorithms for connectionist reinforcement learning[J] Ronald J. Williams Machine Learning 1992,
[7]
Self-Improving Reactive Agents Based on Reinforcement Learning, Planning and Teaching[J] Long-Ji Lin Machine Learning 1992,
[8]
Q-learning[J] Christopher J. C. H. Watkins;Peter Dayan Machine Learning 1992,
[9]
Learning to Predict by the Methods of Temporal Differences[J] Richard S. Sutton Machine Learning 1988,
[10]
An information-theoretic optimality principle for deep reinforcement learning Leibfried F;Grau-Moya J;Bou-Ammar H; 2017,

