10 entries in total
- [4] A fast learning algorithm for deep belief nets [J]. Neural Computation, 2006, 18(7): 1527-1554
- [6] Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning [J]. Ronald J. Williams. Machine Learning. 1992 (3)
- [7] Prioritized experience replay. Schaul T, Quan J, Antonoglou I, Silver D. Proceedings of the 4th International Conference on Learning Representations. 2016
- [8] End-to-end training of deep visuomotor policies. Levine S, Finn C, Darrell T, et al. Journal of Machine Learning Research. 2016
- [9] Reinforcement learning with unsupervised auxiliary tasks. Jaderberg M, Mnih V, Czarnecki W, et al. https://arxiv.org/abs/1611.05397
- [10] Deep reinforcement learning for dialogue generation. Li J, Monroe W, Ritter A, et al. https://arxiv.org/abs/1707.06347