Restless bandits, linear programming relaxations, and a primal-dual index heuristic

被引:134
作者
Bertsimas, D [1 ]
Niño-Mora, J
机构
[1] MIT, Alfred P Sloan Sch Management, Cambridge, MA 02139 USA
[2] MIT, Ctr Operat Res, Cambridge, MA 02139 USA
[3] Univ Pompeu Fabra, Dept Econ & Business, E-08005 Barcelona, Spain
关键词
D O I
10.1287/opre.48.1.80.12444
中图分类号
C93 [管理学];
学科分类号
12 ; 1201 ; 1202 ; 120202 ;
摘要
We develop a mathematical programming approach for the classical PSPACE-hard restless bandit problem in stochastic optimization. We introduce a hierarchy of N (where N is the number of bandits) increasingly stronger linear programming relaxations, the last of which is exact and corresponds to the (exponential size) formulation of the problem as a Markov decision chain, while the other relaxations provide bounds and are efficiently computed. We also propose a priority-index heuristic scheduling policy from the solution to the first-order relaxation, where the indices are defined in terms of optimal dual variables. In this way we propose a policy and a suboptimality guarantee. We report results of computational experiments that suggest that the proposed heuristic policy is nearly optimal, Moreover, the second-order relaxation is found to provide strong bounds on the optimal value.
引用
收藏
页码:80 / 90
页数:11
相关论文
共 16 条
[1]   Conservation laws, extended polymatroids and multiarmed bandit problems; A polyhedral approach to indexable systems [J].
Bertsimas, D ;
Nino-Mora, J .
MATHEMATICS OF OPERATIONS RESEARCH, 1996, 21 (02) :257-306
[2]   OPTIMIZATION OF MULTICLASS QUEUEING NETWORKS: POLYHEDRAL AND NONLINEAR CHARACTERIZATIONS OF ACHIEVABLE PERFORMANCE [J].
Bertsimas, Dimitris ;
Paschalidis, Ioannis Ch. ;
Tsitsiklis, John N. .
ANNALS OF APPLIED PROBABILITY, 1994, 4 (01) :43-75
[3]   A CHARACTERIZATION OF WAITING TIME PERFORMANCE REALIZABLE BY SINGLE-SERVER QUEUES [J].
COFFMAN, EG ;
MITRANI, I .
OPERATIONS RESEARCH, 1980, 28 (03) :810-821
[4]   A PROBABILISTIC PRODUCTION AND INVENTORY PROBLEM [J].
DEPENOUX, F .
MANAGEMENT SCIENCE, 1963, 10 (01) :98-108
[5]   CHARACTERIZATION AND OPTIMIZATION OF ACHIEVABLE PERFORMANCE IN GENERAL QUEUING-SYSTEMS [J].
FEDERGRUEN, A ;
GROENEVELT, H .
OPERATIONS RESEARCH, 1988, 36 (05) :733-741
[6]  
Gittins J.C., 1989, MULTIARMED BANDIT AL
[7]  
GITTINS JC, 1974, PROGR STATISTICS, V1, P241
[8]   CONES OF MATRICES AND SET-FUNCTIONS AND 0-1 OPTIMIZATION [J].
Lovasz, L. ;
Schrijver, A. .
SIAM JOURNAL ON OPTIMIZATION, 1991, 1 (02) :166-190
[9]  
Murty KG, 1983, LINEAR PROGRAMMING
[10]   The complexity of optimal queuing network control [J].
Papadimitriou, CH ;
Tsitsiklis, JN .
MATHEMATICS OF OPERATIONS RESEARCH, 1999, 24 (02) :293-305