42 entries in total
[1]
[Anonymous], APPL MATH STOCHASTIC
[2]
[Anonymous], P 10 YAL WORKSH AD L
[3]
[Anonymous], 2003, J MACH LEARN RES, DOI 10.1162/JMLR.2003.4.6.1107
[4]
Anthony M., 1999, Neural Network Learning: Theoretical Foundations, V9
[5]
ANTOS A, 2007, IN PRESS ADV NEURAL
[6]
ANTOS A, 2007, IEEE S APPR DYN PROG, P330
[7]
Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path [J]. LEARNING THEORY, PROCEEDINGS, 2006, 4005: 574-588
[8]
Baraud Y, 2001, ANN STAT, V29, P839
[9]
Bellman R., 1959, Mathematics of Computation, V13, P247
[10]
Bertsekas D. P., 1996, Neuro-dynamic programming