9 entries in total
[3]
DeepStack: Expert-level artificial intelligence in heads-up no-limit poker[J]. Matej Moravčík, Martin Schmid, Neil Burch, Viliam Lisý, Dustin Morrill, Nolan Bard, Trevor Davis, Kevin Waugh, Michael Johanson, Michael Bowling. Science. 2017(6337)
[4]
Self-teaching adaptive dynamic programming for Gomoku[J]. Dongbin Zhao, Zhen Zhang, Yujie Dai. Neurocomputing. 2011(1)
[5]
DeepMind Lab. Beattie C, Leibo J Z, Teplyashin D, et al. 2016
[6]
Rainbow: Combining improvements in deep reinforcement learning. Hessel M, Modayil J, van Hasselt H, et al. 2017
[7]
Prioritized experience replay. Schaul T, Quan J, Antonoglou I, Silver D. Proceedings of the 4th International Conference on Learning Representations. 2016
[8]
Learning continuous control policies by stochastic value gradients. Heess N, Wayne G, Silver D, et al. Advances in Neural Information Processing Systems. 2015
[9]
PathNet: Evolution channels gradient descent in super neural networks. Fernando C, Banarse D, Blundell C, et al. 2017