10 entries in total
[1]
Approximate policy iteration: a survey and some new methods[J]. Dimitri P. Bertsekas. Journal of Control Theory and Applications. 2011 (03)
[2]
An iterative adaptive dynamic programming algorithm for optimal control of unknown discrete-time nonlinear systems with constrained inputs[J]. Derong Liu, Ding Wang, Xiong Yang. Information Sciences. 2013
[6]
Least Squares Policy Evaluation Algorithms with Linear Function Approximation[J]. Discrete Event Dynamic Systems. 2003 (1)
[7]
Kernel-Based Reinforcement Learning[J]. Machine Learning. 2002 (2)
[8]
Technical Update: Least-Squares Temporal Difference Learning[J]. Justin A. Boyan. Machine Learning. 2002 (2)
[9]
Linear Least-Squares Algorithms for Temporal Difference Learning[J]. Steven J. Bradtke, Andrew G. Barto. Machine Learning. 1996 (1)
[10]
Sparse temporal difference learning using LASSO. M. Loth, M. Davy, P. Preux. IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning. 2007