Model-free approximate dynamic programming schemes for linear systems

Cited by: 6
Authors
Al-Tamimi, Asma [1]
Vrabie, Draguna [1]
Abu-Khalaf, Murad [2]
Lewis, Frank L. [1]
Affiliations
[1] Univ Texas Arlington, Automat & Robot Res Inst, 7300 Jack Newell Blvd S, Ft Worth, TX 76118 USA
[2] Univ Texas Arlington, Dept Elect Engn, Arlington, TX 76019 USA
Source
2007 IEEE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1-6 | 2007
Funding
National Science Foundation (USA);
Keywords
DOI
10.1109/IJCNN.2007.4370985
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we present online model-free adaptive critic (AC) schemes based on approximate dynamic programming (ADP) to solve optimal control problems in both discrete-time and continuous-time domains for linear systems with unknown dynamics. In the discrete-time case, it is shown that the proposed ADP algorithm is in fact solving the underlying Generalized Algebraic Riccati Equation (GARE) of the corresponding optimal control problem or zero-sum game. In the continuous-time domain, an ADP scheme is introduced to solve for the underlying ARE of the optimal control problem. It is shown that this continuous-time ADP scheme is in fact a Quasi-Newton method to solve the ARE. In both time domains, the adaptive critic algorithms are easy to initialize since initial policies are not required to be stabilizing. It is also shown, on a power system control example, that both discrete-time and continuous-time approaches to ADP converge to the same continuous-time optimal control solution provided that the utility function is appropriately chosen.
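The abstract's point that the iteration needs no stabilizing initial policy can be illustrated with a minimal model-based sketch. This is not the paper's model-free scheme: it assumes the system matrices are known and handles only the plain linear-quadratic regulator case (no zero-sum game term), iterating the discrete-time Riccati recursion from P = 0 on an open-loop unstable system. The matrices A, B, Q, R and the function name riccati_value_iteration are hypothetical, chosen only for illustration.

```python
import numpy as np

def riccati_value_iteration(A, B, Q, R, iters=500, tol=1e-10):
    """Iterate P <- Q + A'PA - A'PB (R + B'PB)^-1 B'PA, starting from P = 0."""
    n = A.shape[0]
    P = np.zeros((n, n))  # zero start: no stabilizing policy is supplied
    for _ in range(iters):
        # Greedy feedback gain for the current value matrix P
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P_next = Q + A.T @ P @ A - A.T @ P @ B @ K
        converged = np.max(np.abs(P_next - P)) < tol
        P = P_next
        if converged:
            break
    # Gain associated with the final P (control law u = -K x)
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return P, K

# Hypothetical example: open-loop unstable (eigenvalue 1.1) but controllable
A = np.array([[1.1, 0.3],
              [0.0, 0.9]])
B = np.array([[0.0],
              [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

P, K = riccati_value_iteration(A, B, Q, R)

# Residual of the algebraic Riccati equation at the returned P
residual = Q + A.T @ P @ A - A.T @ P @ B @ K - P
print("max |ARE residual|:", np.max(np.abs(residual)))
print("closed-loop |eigenvalues|:", np.abs(np.linalg.eigvals(A - B @ K)))
```

Under these assumptions the recursion settles on a fixed point of the Riccati equation whose gain stabilizes the closed loop, even though the starting point corresponds to no stabilizing controller. A model-free variant would replace the explicit use of A and B with quantities estimated from measured data.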
Pages: 371 / +
Page count: 3