Semi-infinite Markov decision processes

Cited by: 7

Authors:

Chen, M

Filar, JA

Liu, K

Affiliations:

[1] Univ Maryland Baltimore Cty, Dept Math & Stat, Baltimore, MD 21250 USA

[2] Univ S Australia, Sch Math, The Levels, SA 5095, Australia

[3] Acad Sinica, Inst Appl Math, Beijing 100080, Peoples R China

Source:

MATHEMATICAL METHODS OF OPERATIONS RESEARCH | 2000 / Vol. 51 / Issue 01

Keywords:

semi-infinite Markov decision processes; optimal strategy; epsilon-optimal;

DOI:

10.1007/s001860050006

Chinese Library Classification (CLC) numbers:

C93 [Management Science]; O22 [Operations Research];

Discipline classification codes:

070105 ; 12 ; 1201 ; 1202 ; 120202 ;

Abstract:

This paper discusses discounted and average-reward Markov decision processes with a finite state space and a countable action set (semi-infinite MDPs for short). For the discounted case, without the usual continuity and compactness conditions, we show, by exploiting the results on semi-infinite linear programming due to Tijs [20], that a semi-infinite discounted MDP can be approximated by a sequence of finite discounted MDPs, and that even in a semi-infinite discounted MDP it is sufficient to restrict attention to the class of deterministic stationary strategies. For the average-reward case, we prove that under some conditions the supremum over the class of general strategies equals the supremum over the class of deterministic stationary strategies. A counterexample shows that these conditions cannot be easily relaxed.
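The approximation idea in the abstract can be illustrated with a minimal sketch. It is not the paper's construction (which relies on semi-infinite linear programming); it only assumes a hypothetical two-state toy model with actions a = 1, 2, 3, ..., truncates the action set to the first n actions, and solves each resulting finite discounted MDP by ordinary value iteration. The function name `solve_truncated`, the rewards, the transitions, and the discount factor 0.9 are all assumptions made for illustration.

```python
def solve_truncated(n_actions, beta=0.9, tol=1e-10):
    """Value iteration on the finite MDP that keeps only actions 1..n_actions.

    Hypothetical toy model (not taken from the paper):
      state 0: action a pays 1 - 1/a, then the process stays in state 0;
      state 1: every action pays 0, then the process moves to state 0.
    """
    def reward(state, action):
        return 1.0 - 1.0 / action if state == 0 else 0.0

    def transition(state, action):
        # Probability vector over next states; here always state 0.
        return [1.0, 0.0]

    values = [0.0, 0.0]
    while True:
        new_values = [
            max(
                reward(s, a)
                + beta * sum(p * values[t] for t, p in enumerate(transition(s, a)))
                for a in range(1, n_actions + 1)
            )
            for s in range(2)
        ]
        # Stop once successive iterates agree to within tol (sup norm).
        if max(abs(x - y) for x, y in zip(new_values, values)) < tol:
            return new_values
        values = new_values


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        print(n, solve_truncated(n))
```

In this toy model the value of state 0 in the n-action truncation is (1 - 1/n) / (1 - beta), which increases toward the supremum 1 / (1 - beta) = 10 as n grows but never reaches it. No optimal strategy exists in the full model, yet every epsilon > 0 admits an epsilon-optimal deterministic stationary strategy that uses a single action with a large enough index.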


Pages: 115-137

Page count: 23

References (21 in total):

[1] ALTMAN E, 1991, SIAM J CONTROL OPTIM, V37, P1415
[2] BLACKWELL, D. DISCRETE DYNAMIC-PROGRAMMING [J]. ANNALS OF MATHEMATICAL STATISTICS, 1962, 33 (02): 719 - &
[3] CAVAZOSCADENA R, 1990, ANN OR, V28, P3
[4] DERMAN, C; STRAUCH, RE. A NOTE ON MEMORYLESS RULES FOR CONTROLLING SEQUENTIAL CONTROL PROCESSES [J]. ANNALS OF MATHEMATICAL STATISTICS, 1966, 37 (01): 276 - &
[5] DERMAN C, 1970, FINITE STATE MARKOVI
[6] DONG Z, 1986, J GRADUATE SCH USTC, V3, P49
[7] Dynkin E.B., 1979, Grundlehren der Mathematischen Wissenschaften, V235
[8] FAINBERG EA, 1983, LECT NOTES MATH, V1021, P111
[9] FILAR, JA; SCHULTZ, TA. COMMUNICATING MDPS - EQUIVALENCE AND LP PROPERTIES [J]. OPERATIONS RESEARCH LETTERS, 1988, 7 (06): 303 - 307
[10] GUBENKO LG, 1975, THEOR PROBAB MATH ST, V7, P47
