STATE-OF-THE-ART - A SURVEY OF PARTIALLY OBSERVABLE MARKOV DECISION-PROCESSES - THEORY, MODELS, AND ALGORITHMS

Cited by: 508
Author
MONAHAN, GE
Affiliation
Keywords
COMPUTER PROGRAMMING - Subroutines;
DOI
10.1287/mnsc.28.1.1
Chinese Library Classification number
C93 [Management];
Discipline classification code
12 ; 1201 ; 1202 ; 120202 ;
Abstract
This study surveys models and algorithms dealing with partially observable Markov decision processes. A partially observable Markov decision process (POMDP) is a generalization of a Markov decision process which permits uncertainty regarding the state of a Markov process and allows for state information acquisition. A general framework for finite state and action POMDPs is presented. There is also a brief discussion of the development of POMDPs and their relationship with other decision processes. A wide range of models in such areas as quality control, machine maintenance, internal auditing, learning, and optimal stopping are discussed within the POMDP framework. Algorithms for computing optimal solutions to POMDPs are presented.
Pages: 1-16
Number of pages: 16
Related papers
87 in total
[1]   STRUCTURAL RESULTS FOR PARTIALLY OBSERVABLE MARKOV DECISION-PROCESSES [J].
ALBRIGHT, SC .
OPERATIONS RESEARCH, 1979, 27 (05) :1041-1053
[2]  
Anderson R. F., 1977, Mathematics of Operations Research, V2, P155, DOI 10.1287/moor.2.2.155
[3]  
Anderson R. F., 1978, Mathematics of Operations Research, V3, P67, DOI 10.1287/moor.3.1.67
[4]   OPTIMAL CONTROL OF PARTIALLY OBSERVABLE MARKOVIAN SYSTEMS [J].
AOKI, M .
JOURNAL OF THE FRANKLIN INSTITUTE-ENGINEERING AND APPLIED MATHEMATICS, 1965, 280 (05) :367-&
[5]  
Aoki M., 1967, OPTIMIZATION STOCHAS
[7]   OPTIMAL CONTROL OF MARKOV PROCESSES WITH INCOMPLETE STATE-INFORMATION .2. CONVEXITY OF LOSSFUNCTION [J].
ASTROM, KJ .
JOURNAL OF MATHEMATICAL ANALYSIS AND APPLICATIONS, 1969, 26 (02) :403-&
[8]  
BLACKWELL D, 1957, INFORMATION THEORY S
[9]  
BREIMAN L, 1964, APPLIED COMBINATORIA, pCH10
[10]   MARKOV DECISION PROCESSES WITH STATE INFORMATION LAG [J].
BROOKS, DM ;
LEONDES, CT .
OPERATIONS RESEARCH, 1972, 20 (04) :904-&