ALGORITHMS FOR SINGULARLY PERTURBED LIMITING AVERAGE MARKOV CONTROL-PROBLEMS

Cited by: 39
Authors
ABBAD, M
FILAR, JA
BIELECKI, TR
Affiliations
[1] UNIV MARYLAND, DEPT MATH & STAT, CATONSVILLE, MD 21228
[2] UNIV MARYLAND, DEPT MATH, CATONSVILLE, MD 21228
Keywords
Aggregation-disaggregation algorithm; Limit control principle; Limiting average reward criterion; Markov control problems; Markov decision process; Wolfe-Dantzig structure
DOI
10.1109/9.159585
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
We consider a singularly perturbed Markov decision process with the limiting average reward criterion. We assume that the underlying process is composed of n separate irreducible processes, and that the small perturbation is such that it "unites" these processes into a single irreducible process. We present two algorithms for the solution of the underlying "limit Markov control problem." The first of these is a linear program possessing the Wolfe-Dantzig structure inherited from the ergodic "nearly-decomposable" assumption in the model. The second is an aggregation-disaggregation policy improvement algorithm.
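The "nearly-decomposable" structure described in the abstract can be illustrated numerically for a single fixed policy. The sketch below is not either of the paper's algorithms: it only shows, under assumed data, how a small perturbation unites two separate irreducible chains and how the stationary distribution of the united chain is approximated by aggregating each block into one state. The matrices `P0` and `D`, the value `eps = 1e-4`, and all function names are hypothetical choices made for this illustration.

```python
def stationary(P):
    """Solve pi P = pi with sum(pi) = 1 by Gaussian elimination (partial pivoting)."""
    n = len(P)
    # Row i encodes sum_j pi_j P[j][i] - pi_i = 0; one equation is redundant,
    # so the last one is replaced by the normalization sum(pi) = 1.
    A = [[P[j][i] - (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    A[-1] = [1.0] * n
    b = [0.0] * (n - 1) + [1.0]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (b[r] - sum(A[r][c] * x[c] for c in range(r + 1, n))) / A[r][r]
    return x


# Assumed unperturbed chain: two irreducible blocks, states {0, 1} and {2, 3}.
P0 = [[0.5, 0.5, 0.0, 0.0],
      [0.2, 0.8, 0.0, 0.0],
      [0.0, 0.0, 0.9, 0.1],
      [0.0, 0.0, 0.6, 0.4]]
# Assumed perturbation direction (rows sum to zero); it links the two blocks.
D = [[-1.0, 0.0, 1.0, 0.0],
     [0.0, -1.0, 0.0, 1.0],
     [1.0, 0.0, -1.0, 0.0],
     [0.0, 0.0, 0.0, 0.0]]


def perturbed(eps):
    """Transition matrix P0 + eps * D of the united chain."""
    return [[P0[i][j] + eps * D[i][j] for j in range(4)] for i in range(4)]


# Disaggregated part: stationary distribution inside each block.
pi_A = stationary([row[:2] for row in P0[:2]])
pi_B = stationary([row[2:] for row in P0[2:]])

# Aggregated part: average rate of leaving each block under the perturbation.
a_to_b = sum(pi_A[i] * sum(D[i][2:]) for i in range(2))
b_to_a = sum(pi_B[i] * sum(D[2 + i][:2]) for i in range(2))
w_A = b_to_a / (a_to_b + b_to_a)  # long-run weight of block A
w_B = 1.0 - w_A

# Aggregate weights times within-block distributions approximate pi_eps.
limit_pi = [w_A * p for p in pi_A] + [w_B * p for p in pi_B]
pi_eps = stationary(perturbed(1e-4))
gap = max(abs(x - y) for x, y in zip(pi_eps, limit_pi))
print(limit_pi, pi_eps, gap)
```

For small `eps`, `gap` should be of the order of `eps`. A control problem would additionally let the rows of `P0` and `D` depend on the chosen action in each state, which this sketch leaves out.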
Pages: 1421-1425
Number of pages: 5
Related papers
17 in total
[1] ABBAD M, IN PRESS IEEE T AUTO
[2] ABBAD M, 1991, THESIS U MARYLAND BA
[3] ALDHAHERI R, 1989, 28TH P CDC IEEE, P1277
[4] [Anonymous], 2012, DYNAMIC PROGRAMMING
[5] BIELECKI TR, IN PRESS ANN OR
[6] CORDECH M, 1983, IEEE T AUTOMATIC CON, V28, P1017
[7] DELEBECQUE, F; QUADRAT, JP. OPTIMAL-CONTROL OF MARKOV-CHAINS ADMITTING STRONG AND WEAK-INTERACTIONS [J]. AUTOMATICA, 1981, 17 (02): 281-296
[8] DELEBECQUE F, 1983, SIAM J APPL MATH, V48, P325
[9] Derman C, 1970, FINITE STATE MARKOVI
[10] Kallenberg Lodewijk C.M., 1983, MATH CTR TRACTS, V148