ALGORITHMS FOR SINGULARLY PERTURBED LIMITING AVERAGE MARKOV CONTROL-PROBLEMS

Cited by: 39

Authors:

ABBAD, M

FILAR, JA

BIELECKI, TR

Affiliations:

[1] UNIV MARYLAND, DEPT MATH & STAT, CATONSVILLE, MD 21228

[2] UNIV MARYLAND, DEPT MATH, CATONSVILLE, MD 21228

Source:

IEEE TRANSACTIONS ON AUTOMATIC CONTROL | 1992 / Vol. 37 / No. 9

Keywords:

Aggregation-disaggregation algorithm - Limit control principle - Limiting average reward criterion - Markov control problems - Markov decision process - Wolfe-Dantzig structure

DOI:

10.1109/9.159585

Chinese Library Classification:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

We consider a singularly perturbed Markov decision process with the limiting average reward criterion. We assume that the underlying process is composed of n separate irreducible processes, and that the small perturbation is such that it "unites" these processes into a single irreducible process. We present two algorithms for the solution of the underlying "limit Markov control problem." The first of these is a linear program possessing the Wolfe-Dantzig structure inherited from the ergodic "nearly-decomposable" assumption in the model. The second is an aggregation-disaggregation policy improvement algorithm.
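The "nearly decomposable" structure described in the abstract can be illustrated with a small sketch. The 4-state chain below, the matrices `P0` and `D`, and the helper names are hypothetical and are not taken from the paper. The sketch fixes a single policy, so it shows only the perturbed Markov chain, not either of the paper's two algorithms. It assumes the perturbed transition matrix has the form `P0 + eps * D`, where `P0` consists of two separate irreducible blocks and the rows of `D` sum to zero.

```python
from fractions import Fraction as F

# Hypothetical unperturbed chain: two irreducible 2-state blocks
# (states 0-1 and states 2-3) with no transitions between them.
P0 = [
    [F(1, 2), F(1, 2), F(0), F(0)],
    [F(1, 4), F(3, 4), F(0), F(0)],
    [F(0), F(0), F(1, 3), F(2, 3)],
    [F(0), F(0), F(1, 2), F(1, 2)],
]
# Hypothetical perturbation: each row sums to zero, so P0 + eps*D
# remains a stochastic matrix for 0 < eps <= 1/3.
D = [
    [F(-1), F(0), F(1), F(0)],
    [F(0), F(0), F(0), F(0)],
    [F(1), F(0), F(-1), F(0)],
    [F(0), F(0), F(0), F(0)],
]


def perturbed(eps):
    """Return the perturbed transition matrix P0 + eps * D."""
    return [[p + eps * d for p, d in zip(pr, dr)] for pr, dr in zip(P0, D)]


def stationary(P):
    """Solve pi P = pi with sum(pi) = 1 by exact Gauss-Jordan elimination.

    Assumes P is irreducible, so the solution is unique.
    """
    n = len(P)
    # Rows of (P^T - I); the last balance equation is redundant and is
    # replaced by the normalization constraint sum(pi) = 1.
    A = [[P[j][i] - (1 if i == j else 0) for j in range(n)] for i in range(n)]
    A[-1] = [F(1)] * n
    b = [F(0)] * (n - 1) + [F(1)]
    for col in range(n):
        # Find a row with a nonzero pivot and swap it into place.
        piv = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        inv = 1 / A[col][col]
        A[col] = [v * inv for v in A[col]]
        b[col] *= inv
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and A[r][col] != 0:
                f = A[r][col]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[col])]
                b[r] -= f * b[col]
    return b


pi = stationary(perturbed(F(1, 100)))
print(pi)
```

For any `eps` in (0, 1/3] the perturbed chain is a single irreducible process and `stationary` returns its unique invariant distribution. At `eps = 0` the chain splits into two blocks, the linear system becomes singular, and the helper should not be called.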


Pages: 1421-1425

Page count: 5
