Semi-infinite Markov decision processes

Cited by: 7
Authors
Chen, M
Filar, JA
Liu, K
Affiliations
[1] Univ Maryland Baltimore Cty, Dept Math & Stat, Baltimore, MD 21250 USA
[2] Univ S Australia, Sch Math, The Levels, SA 5095, Australia
[3] Acad Sinica, Inst Appl Math, Beijing 100080, Peoples R China
Keywords
semi-infinite Markov decision processes; optimal strategy; epsilon-optimal
DOI
10.1007/s001860050006
Chinese Library Classification
C93 [Management Science]; O22 [Operations Research]
Discipline classification codes
070105; 12; 1201; 1202; 120202
Abstract
This paper studies discounted and average-reward Markov decision processes with a finite state space and a countable action set (semi-infinite MDPs for short). For the discounted case, without the usual continuity and compactness conditions, we use results on semi-infinite linear programming due to Tijs [20] to show that a semi-infinite discounted MDP can be approximated by a sequence of finite discounted MDPs, and that even in a semi-infinite discounted MDP it suffices to restrict attention to the class of deterministic stationary strategies. For the average-reward case, we prove that, under some conditions, the supremum over the class of general strategies equals the supremum over the class of deterministic stationary strategies. A counterexample shows that these conditions cannot easily be relaxed.
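The approximation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' construction (the paper works through semi-infinite linear programming); it assumes plain value iteration on the finite MDP obtained by keeping only the first n actions. The function `truncated_value` and the toy model (`toy_reward`, `toy_trans`) are hypothetical names introduced only for this illustration. In the toy model the supremum 1/(1 - beta) in state 0 is approached but never attained, so only epsilon-optimal deterministic stationary strategies exist.

```python
import numpy as np


def truncated_value(reward, trans, n_states, n_actions, beta, tol=1e-10):
    """Value iteration on the finite MDP that keeps only actions 1..n_actions.

    reward(s, a) returns a float; trans(s, a) returns a probability vector
    over states. Both are hypothetical callables used for illustration.
    """
    actions = range(1, n_actions + 1)
    r = np.array([[reward(s, a) for a in actions] for s in range(n_states)])
    p = np.array([[trans(s, a) for a in actions] for s in range(n_states)])
    v = np.zeros(n_states)
    while True:
        q = r + beta * (p @ v)          # shape (states, actions)
        v_new = q.max(axis=1)
        if np.max(np.abs(v_new - v)) < tol:
            # Greedy deterministic stationary strategy (1-based action index).
            return v_new, q.argmax(axis=1) + 1
        v = v_new


# Toy semi-infinite MDP (an assumption, not taken from the paper):
# two states, actions a = 1, 2, 3, ...
def toy_reward(s, a):
    # State 0 pays 1 - 1/a, so the supremum reward 1 is never attained.
    return 1.0 - 1.0 / a if s == 0 else 0.0


def toy_trans(s, a):
    # Every action leads to state 0 with probability 1.
    return [1.0, 0.0]


if __name__ == "__main__":
    beta = 0.9
    for n in (1, 10, 100):
        v, policy = truncated_value(toy_reward, toy_trans, 2, n, beta)
        print(n, v, policy)
```

As n grows, the value of the truncated model in state 0 is (1 - 1/n)/(1 - beta), which increases toward the supremum 1/(1 - beta) without reaching it for any finite n.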
Pages: 115-137
Number of pages: 23
Related papers
21 in total
  • [1] ALTMAN E, 1991, SIAM J CONTROL OPTIM, V37, P1415
  • [2] DISCRETE DYNAMIC-PROGRAMMING
    BLACKWELL, D
    [J]. ANNALS OF MATHEMATICAL STATISTICS, 1962, 33 (02): 719 - &
  • [3] CAVAZOSCADENA R, 1990, ANN OR, V28, P3
  • [4] A NOTE ON MEMORYLESS RULES FOR CONTROLLING SEQUENTIAL CONTROL PROCESSES
    DERMAN, C
    STRAUCH, RE
    [J]. ANNALS OF MATHEMATICAL STATISTICS, 1966, 37 (01): 276 - &
  • [5] DERMAN C, 1970, FINITE STATE MARKOVI
  • [6] DONG Z, 1986, J GRADUATE SCH USTC, V3, P49
  • [7] Dynkin E.B., 1979, Grundlehren der Mathematischen Wissenschaften, V235
  • [8] FAINBERG EA, 1983, LECT NOTES MATH, V1021, P111
  • [9] COMMUNICATING MDPS - EQUIVALENCE AND LP PROPERTIES
    FILAR, JA
    SCHULTZ, TA
    [J]. OPERATIONS RESEARCH LETTERS, 1988, 7 (06): 303 - 307
  • [10] GUBENKO LG, 1975, THEOR PROBAB MATH ST, V7, P47