Rollout Algorithms for Combinatorial Optimization

被引:153
作者
Bertsekas D.P. [1 ]
Tsitsiklis J.N. [1 ]
Wu C. [1 ]
机构
[1] Dept. of Elec. Eng. and Comp. Sci., M.I.T., Cambridge
基金
美国国家科学基金会;
关键词
Destination Node; Heuristic Algorithm; Spare Part; Typical Step; Policy Iteration;
D O I
10.1023/A:1009635226865
中图分类号
学科分类号
摘要
We consider the approximate solution of discrete optimization problems using procedures that are capable of magnifying the effectiveness of any given heuristic algorithm through sequential application. In particular, we embed the problem within a dynamic programming framework, and we introduce several types of rollout algorithms, which are related to notions of policy iteration. We provide conditions guaranteeing that the rollout algorithm improves the performance of the original heuristic algorithm. The method is illustrated in the context of a machine maintenance and repair problem.
引用
收藏
页码:245 / 262
页数:17
相关论文
共 6 条
  • [1] Barto A.G., Bradtke S.J., Singh S.P., Learning to Act Using Real-Time Dynamic Programming, Artificial Intelligence, 72, pp. 81-138, (1995)
  • [2] Barto A.S., Sutton R., Reinforcement Learning, (1997)
  • [3] Bertsekas D.P., Tsitsiklis J.N., Neuro-Dynamic Programming, (1996)
  • [4] Glover F., Taillard E., De Werra D., A User's Guide to Tabu Search, Annals of Operations Research, 41, pp. 3-28, (1993)
  • [5] Pattipati K.R., Alexandridis M.G., Application of Heuristic Search and Information Theory to Sequential Fault Diagnosis, IEEE Transactions on Systems, Man, and Cybernetics, 20, pp. 872-887, (1990)
  • [6] Tesauro G., Galperin G.R., On-Line Policy Improvement Using Monte Carlo Search, (1996)