Learning policies for single machine job dispatching

Cited by: 6

Authors:

Wang, YC

Usher, JM [1]

Affiliations:

[1] Kun Shan Univ Technol, Dept Informat Management, Tainan 710, Taiwan

[2] Mississippi State Univ, Dept Ind Engn, Mississippi State, MS 39762 USA

Source:

ROBOTICS AND COMPUTER-INTEGRATED MANUFACTURING | 2004 / Vol. 20 / No. 06

Keywords:

reinforcement learning; Q-learning algorithm; dispatching rule selection

DOI:

10.1016/j.rcim.2004.07.003

Chinese Library Classification (CLC) number:

TP39 [Computer applications]

Discipline classification codes:

081203; 0835

Abstract:

Reinforcement learning (RL) has received some attention in recent years from agent-based researchers because it deals with the problem of how an autonomous agent can learn to select proper actions for achieving its goals through interacting with its environment. Each time after an agent performs an action, the environment's response, as indicated by its new state, is used by the agent to reward or penalize its action. The agent's goal is to maximize the total amount of reward it receives over the long run. Although there have been several successful examples demonstrating the usefulness of RL, its application to manufacturing systems has not been fully explored. In this study, a single machine agent employs the Q-learning algorithm to develop a decision-making policy on selecting the appropriate dispatching rule from among three given dispatching rules. The system objective is to minimize mean tardiness. This paper presents a factorial experiment design for studying the settings used to apply Q-learning to the single machine dispatching rule selection problem. The factors considered in this study include two related to the agent's policy table design and three for developing its reward function. This study not only investigates the main effects of this Q-learning application but also provides recommendations for factor settings and useful guidelines for future applications of Q-learning to agent-based production scheduling. (C) 2004 Elsevier Ltd. All rights reserved.
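
Illustrative sketch (not from the paper): the setup described in the abstract, a single machine agent using Q-learning to choose among three dispatching rules with a tardiness-based reward, might look roughly like the Python below. Everything specific here is an assumption made for illustration, since the abstract does not give these details: the rule set (EDD, SPT, FIFO), the state definition (capped queue length), the reward (negative tardiness of the job just completed), and the values of `alpha`, `gamma`, and `epsilon`.

```python
import random

# Assumed rule set; the abstract only says "three given dispatching rules".
RULES = ("EDD", "SPT", "FIFO")
N_STATES = 5  # hypothetical size of the policy (Q) table


def pick_job(queue, rule):
    """Return the index of the queued job that the given rule would run next."""
    field = {"EDD": "due", "SPT": "proc", "FIFO": "arrival"}[rule]
    return min(range(len(queue)), key=lambda i: queue[i][field])


def state_of(queue):
    """Hypothetical state: number of waiting jobs, capped to fit the table."""
    return min(len(queue), N_STATES - 1)


def q_update(q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Standard one-step Q-learning update of Q(s, a)."""
    q[s][a] += alpha * (r + gamma * max(q[s_next]) - q[s][a])


def run_episode(q, jobs, epsilon=0.1, rng=random):
    """Simulate one pass over `jobs`, learning in place; return mean tardiness."""
    pending = sorted(jobs, key=lambda j: j["arrival"])
    queue, t, nxt, total = [], 0.0, 0, 0.0

    def admit():
        nonlocal t, nxt
        if not queue and nxt < len(pending):
            t = max(t, pending[nxt]["arrival"])  # machine idles until next arrival
        while nxt < len(pending) and pending[nxt]["arrival"] <= t:
            queue.append(pending[nxt])
            nxt += 1

    admit()
    while queue:
        s = state_of(queue)
        if rng.random() < epsilon:  # explore: random rule
            a = rng.randrange(len(RULES))
        else:  # exploit: rule with the highest Q-value in this state
            a = max(range(len(RULES)), key=lambda k: q[s][k])
        job = queue.pop(pick_job(queue, RULES[a]))
        t += job["proc"]
        tardiness = max(0.0, t - job["due"])
        total += tardiness
        admit()
        # Assumed reward: penalize the tardiness of the job just completed.
        q_update(q, s, a, -tardiness, state_of(queue))
    return total / len(jobs)
```

To try it, initialize `q = [[0.0] * len(RULES) for _ in range(N_STATES)]`, build a list of jobs as dictionaries with `arrival`, `proc`, and `due` keys, and call `run_episode(q, jobs)` repeatedly. The paper's factorial experiment concerns exactly the kind of choices hard-coded above (policy table design and reward function), so these settings should be read as placeholders rather than recommended values.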

Pages: 553-562

Page count: 10
