Finding the K best policies in a finite-horizon Markov decision process

被引:13
作者
Nielsen, Lars Relund [1 ]
Kristensen, Anders Ringgaard [1 ]
机构
[1] Royal Vet & Agr Univ, Dept Large Anim Sci, DK-1870 Frederiksberg C, Denmark
关键词
finite-horizon Markov decision processes; stochastic dynamic programming; directed hypergraphs; hyperpaths; K best policies;
D O I
10.1016/j.ejor.2005.06.011
中图分类号
C93 [管理学];
学科分类号
12 ; 1201 ; 1202 ; 120202 ;
摘要
Directed hypergraphs represent a general modelling and algorithmic tool, which have been successfully used in many different research areas such as artificial intelligence, database systems, fuzzy systems, propositional logic and transportation networks. However, modelling Markov decision processes using directed hypergraphs has not yet been considered. In this paper we consider finite-horizon Markov decision processes (MDPs) with finite state and action space and present an algorithm for finding the K best deterministic Markov policies. That is, we are interested in ranking the first K deterministic Markov policies in non-decreasing order using an additive criterion of optimality. The algorithm uses a directed hypergraph to model the finite-horizon MDP. It is shown that the problem of finding the optimal policy can be formulated as a minimum weight hyperpath problem and be solved in linear time, with respect to the input data representing the MDP, using different additive optimality criteria. (c) 2005 Elsevier B.V. All rights reserved.
引用
收藏
页码:1164 / 1179
页数:16
相关论文
共 28 条
[1]  
Altman E., 1999, STOCH MODEL SER
[2]   ADAPTIVE CONTROL OF CONSTRAINED MARKOV CHAINS: CRITERIA AND POLICIES [J].
Altman, Eitan ;
Shwartz, Adam .
ANNALS OF OPERATIONS RESEARCH, 1991, 28 (01) :101-134
[3]  
Ausiello G, 1998, LECT NOTES COMPUT SC, V1450, P1, DOI 10.1007/BFb0055754
[4]   MINIMAL REPRESENTATION OF DIRECTED HYPERGRAPHS [J].
AUSIELLO, G ;
DATRI, A ;
SACCA, D .
SIAM JOURNAL ON COMPUTING, 1986, 15 (02) :418-431
[5]  
Ausiello G., 2001, IT C THEOR COMP SCI, P312
[6]   DYNAMIC PROGRAMMING [J].
BELLMAN, R .
SCIENCE, 1966, 153 (3731) :34-&
[7]  
Berge C, 1973, GRAPHS HYPERGRAPHS
[8]   CONSTRAINED MARKOV DECISION CHAINS [J].
DERMAN, C ;
VEINOTT, AF .
MANAGEMENT SCIENCE SERIES B-APPLICATION, 1972, 19 (04) :389-390
[9]   SOME REMARKS ON FINITE HORIZON MARKOVIAN DECISION MODELS [J].
DERMAN, C ;
KLEIN, M .
OPERATIONS RESEARCH, 1965, 13 (02) :272-&
[10]   DIRECTED HYPERGRAPHS AND APPLICATIONS [J].
GALLO, G ;
LONGO, G ;
PALLOTTINO, S ;
NGUYEN, S .
DISCRETE APPLIED MATHEMATICS, 1993, 42 (2-3) :177-201