Dynamic programming and suboptimal control: A survey from ADP to MPC

Times cited: 165
Authors
Bertsekas, DP [1]
Affiliations
[1] MIT, Dept Elect Engn & Comp Sci, Cambridge, MA 02139 USA
Keywords
dynamic programming; stochastic optimal control; model predictive control; rollout algorithm
DOI
10.3166/ejc.11.310-334
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
We survey some recent research directions within the field of approximate dynamic programming, with a particular emphasis on rollout algorithms and model predictive control (MPC). We argue that while they are motivated by different concerns, these two methodologies are closely connected, and the mathematical essence of their desirable properties (cost improvement and stability, respectively) rests on the central dynamic programming idea of policy iteration. In particular, we show that the most common MPC schemes can be viewed as rollout algorithms and are related to policy iteration methods. Furthermore, we embed rollout and MPC within a new unifying suboptimal control framework, based on a concept of restricted or constrained structure policies, which contains these schemes as special cases.
Pages: 310-334
Page count: 25