CONSTRAINED MARKOV DECISION-MODELS WITH WEIGHTED DISCOUNTED REWARDS

被引：48

作者：

FEINBERG, EA ^{[1
]}

SHWARTZ, A ^{[1
]}

机构：

[1] TECHNION ISRAEL INST TECHNOL,DEPT ELECT ENGN,IL-32000 HAIFA,ISRAEL

来源：

MATHEMATICS OF OPERATIONS RESEARCH | 1995年 / 20卷 / 02期

关键词：

MARKOV DECISION PROCESSES; ADDITIONAL CONSTRAINTS; SEVERAL DISCOUNT FACTORS;

D O I：

10.1287/moor.20.2.302

中图分类号：

C93 [管理学]; O22 [运筹学];

学科分类号：

070105 ; 12 ; 1201 ; 1202 ; 120202 ;

摘要：

This paper deals with constrained optimization of Markov Decision Processes. Both objective function and constraints are sums of standard discounted rewards, but each with a different discount factor. Such models arise, e.g., in production and in applications involving multiple time scales. We prove that if a feasible policy exists, then there exists an optimal policy which is (i) stationary (nonrandomized) from some step onward, (ii) randomized Markov before this step, but the total number of actions which are added by randomization is bounded by the number of constraints. Optimality of such policies for multi-criteria problems is also established. These new policies have the pleasing aesthetic property that the amount of randomization they require over any trajectory is restricted by the number of constraints. This result is new even for constrained optimization with a single discount factor, where the optimality of randomized stationary policies is known. However, a randomized stationary policy may require an infinite number of randomizations over time. We also formulate a linear programming algorithm for approximate solutions of constrained weighted discounted models.

引用

页码：302 / 320

页数：19

共 27 条

[1]

Altman E., 1993, ZOR, Methods and Models of Operations Research, V37, P151, DOI 10.1007/BF01414154

[2] DENUMERABLE CONSTRAINED MARKOV DECISION-PROCESSES AND FINITE APPROXIMATIONS [J].

ALTMAN, E .

MATHEMATICS OF OPERATIONS RESEARCH, 1994, 19 (01) :169-191

[3]

Altman E., 1991, Annals of Operations Research, V32, P1, DOI 10.1007/BF02204825

[4] ADAPTIVE CONTROL OF CONSTRAINED MARKOV CHAINS: CRITERIA AND POLICIES [J].

Altman, Eitan ;

Shwartz, Adam .

ANNALS OF OPERATIONS RESEARCH, 1991, 28 (01) :101-134

[5]

[Anonymous], MATH CTR TRACTS

[6] A NOTE ON MEMORYLESS RULES FOR CONTROLLING SEQUENTIAL CONTROL PROCESSES [J].

DERMAN, C ;

STRAUCH, RE .

ANNALS OF MATHEMATICAL STATISTICS, 1966, 37 (01) :276-&

[7] SOME REMARKS ON FINITE HORIZON MARKOVIAN DECISION MODELS [J].

DERMAN, C ;

KLEIN, M .

OPERATIONS RESEARCH, 1965, 13 (02) :272-&

[8]

Feinberg E. A, 1982, TH PROBABILITY ITS A, P486

[9]

FENBERG EA, 1994, MATH OPER RES, V19, P152

[10]

FERNANDEZGAUCHE.E, 1994, Z OPER RES, V39, P131

← 1 2 3 →