Planning and acting in partially observable stochastic domains

Cited by: 2065
Authors
Kaelbling, LP
Littman, ML
Cassandra, AR
Affiliations
[1] Brown Univ, Dept Comp Sci, Providence, RI 02912 USA
[2] Duke Univ, Dept Comp Sci, Durham, NC 27708 USA
[3] Microelect & Comp Technol Corp, Austin, TX 78759 USA
Keywords
planning; uncertainty; partially observable Markov decision processes
DOI
10.1016/S0004-3702(98)00023-X
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we bring techniques from operations research to bear on the problem of choosing optimal actions in partially observable stochastic domains. We begin by introducing the theory of Markov decision processes (MDPs) and partially observable MDPs (POMDPs). We then outline a novel algorithm for solving POMDPs off line and show how, in some cases, a finite-memory controller can be extracted from the solution to a POMDP. We conclude with a discussion of how our approach relates to previous work, the complexity of finding exact solutions to POMDPs, and some possibilities for finding approximate solutions. (C) 1998 Elsevier Science B.V. All rights reserved.
Pages: 99-134
Page count: 36
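The abstract's core object is the belief state: since the agent cannot observe the true state of a POMDP, it maintains a probability distribution over states and updates it by Bayes' rule after each action and observation. The sketch below illustrates that update on a two-state "tiger" domain; the specific state names and the 0.85 listening accuracy are illustrative assumptions, not values taken from this record.

```python
# Minimal sketch of the POMDP belief update:
#   b'(s') ∝ O(o | s', a) * sum_s T(s' | s, a) * b(s)
# Two-state toy domain (hypothetical numbers): the tiger is behind the
# left or right door, and the "listen" action yields a noisy observation.

STATES = ["tiger-left", "tiger-right"]

def transition(s_next, s, action):
    """T(s' | s, a): listening does not move the tiger (identity)."""
    return 1.0 if s_next == s else 0.0

def observation(obs, s_next, action):
    """O(o | s', a): listening is assumed 85% accurate."""
    correct = (obs == "hear-left") == (s_next == "tiger-left")
    return 0.85 if correct else 0.15

def belief_update(belief, action, obs):
    """Bayes-filter update of the belief after taking `action` and seeing `obs`."""
    new_belief = {}
    for s_next in STATES:
        predicted = sum(transition(s_next, s, action) * belief[s] for s in STATES)
        new_belief[s_next] = observation(obs, s_next, action) * predicted
    norm = sum(new_belief.values())  # this is P(obs | belief, action)
    return {s: p / norm for s, p in new_belief.items()}

b0 = {"tiger-left": 0.5, "tiger-right": 0.5}
b1 = belief_update(b0, "listen", "hear-left")
print(b1["tiger-left"])  # 0.85 after one informative observation
```

Because this update is Markov in the belief, a POMDP can be recast as a fully observable MDP over belief states, which is the starting point for the exact off-line solution methods the abstract outlines.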