TARGET-LEVEL CRITERION IN MARKOV DECISION-PROCESSES

Cited by: 38

Authors:

BOUAKIZ, M

KEBIR, Y

Affiliations:

[1] LOYOLA UNIV,DEPT MANAGEMENT SCI,CHICAGO,IL 60611

[2] LOYOLA UNIV,DEPT MATH SCI,CHICAGO,IL 60611

Source:

JOURNAL OF OPTIMIZATION THEORY AND APPLICATIONS | 1995 / Vol. 86 / No. 1

Keywords:

MARKOV DECISION PROCESSES; TARGET-LEVEL CRITERION; FIXED POINTS; DYNAMIC PROGRAMMING; SUCCESSIVE APPROXIMATIONS

DOI:

10.1007/BF02193458

Chinese Library Classification (CLC):

C93 [Management Science]; O22 [Operations Research]

Discipline classification codes:

070105 ; 12 ; 1201 ; 1202 ; 120202 ;

Abstract:

The Markov decision process is studied under the criterion of maximizing the probability that the total discounted reward exceeds a target level. We focus on the dynamic programming equations of the model. We give various properties of the optimal return operator and, for the infinite planning-horizon model, we characterize the optimal value function as a maximal fixed point of this operator. Various turnpike results relating the finite- and infinite-horizon models are also given.
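
To make the target-level criterion concrete, here is a minimal finite-horizon sketch. It is not taken from the paper. It assumes one common way to write such a recursion: the state is augmented with the remaining target, and after a reward r is collected the target x becomes (x - r) / beta. The toy model `MODEL`, the discount factor `BETA`, and the function name `value` are all hypothetical and chosen for illustration only.

```python
from fractions import Fraction
from functools import lru_cache

BETA = Fraction(1, 2)  # discount factor (assumed for this toy example)

# Hypothetical toy model: MODEL[state][action] = [(prob, next_state, reward), ...]
MODEL = {
    0: {
        "safe": [(Fraction(1), 0, Fraction(1))],
        "risky": [
            (Fraction(1, 2), 0, Fraction(0)),
            (Fraction(1, 2), 0, Fraction(3)),
        ],
    }
}


@lru_cache(maxsize=None)
def value(n, state, target):
    """Maximal probability that the n-stage discounted reward strictly exceeds target."""
    if n == 0:
        # No stages left: the accumulated reward is 0, which exceeds target iff target < 0.
        return Fraction(1) if target < 0 else Fraction(0)
    # Apply the one-step operator: choose the action with the best
    # expected success probability from the rescaled target onward.
    return max(
        sum(p * value(n - 1, s2, (target - r) / BETA) for p, s2, r in outcomes)
        for outcomes in MODEL[state].values()
    )


print(value(2, 0, Fraction(1)))  # two "safe" steps yield 1.5 > 1 with certainty
print(value(2, 0, Fraction(2)))  # exceeding 2 requires one lucky "risky" draw
```

Exact `Fraction` arithmetic keeps the rescaled targets free of rounding error, which matters because the terminal condition is a strict inequality. Iterating the same one-step operator over increasing horizons is the successive-approximation idea that the keywords refer to, although the paper's own formulation may differ in detail.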

Pages: 1-15

Page count: 15
