STRUCTURAL RESULTS FOR PARTIALLY OBSERVABLE MARKOV DECISION-PROCESSES

Cited by: 40
Author
ALBRIGHT, SC
DOI
10.1287/opre.27.5.1041
Chinese Library Classification number
C93 [Management];
Discipline classification code
12 ; 1201 ; 1202 ; 120202 ;
Abstract
We examine monotonicity results for a fairly general class of partially observable Markov decision processes. When there are only two actual states in the system and when the actions taken are primarily intended to improve the system, rather than to inspect it, we give reasonable conditions that ensure that the optimal reward function and the optimal action are both monotone in the current state of information. Examples of maintenance systems and advertising systems for which our results hold are given. We also examine the case where there are three or more actual states and indicate the difficulties encountered when attempting to extend the monotonicity results to this situation.
Pages: 1041-1053
Page count: 13