Distributed reinforcement learning control for batch sequencing and sizing in Just-In-Time manufacturing systems

Cited: 28
Authors
Hong, JK [1 ]
Prabhu, VV [1 ]
Affiliation
[1] Penn State Univ, Dept Ind & Mfg Engn, University Pk, PA 16802 USA
Funding
U.S. National Science Foundation;
Keywords
machine learning; scheduling; Just-In-Time production;
DOI
10.1023/B:APIN.0000011143.95085.74
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
This paper presents an approach to multi-objective scheduling problems in dynamically changing shop floor environments that is suitable for Just-In-Time (JIT) production. The proposed distributed learning and control (DLC) approach integrates part-driven distributed arrival time control (DATC) with machine-driven control based on distributed reinforcement learning. With DATC, part controllers adjust the arrival times of their associated parts to minimize due-date deviation. Within the resulting restricted pattern of arrivals, machine controllers concurrently search for optimal dispatching policies. The machine control problem is modeled as a semi-Markov decision process (SMDP) and solved using Q-learning. The DLC algorithms are evaluated by simulation for two types of manufacturing systems: family scheduling and dynamic batch sizing. Results show that the DLC algorithms achieve significant performance improvements over common dispatching rules in complex real-time shop floor control problems for JIT production.
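The abstract describes two interacting control layers: part controllers that iteratively shift arrival times toward due-date compliance (DATC), and machine controllers that learn dispatching policies with Q-learning over an SMDP. The Python sketch below is only an illustration of how such a DLC scheme could be organized; the class names, the feedback gain k, the state tuple, the action set, and the earliness/tardiness reward are assumptions made for this example, not the formulation used in the paper.

```python
import random
from collections import defaultdict

class PartController:
    """DATC-style part controller (illustrative): nudges its part's arrival
    time so that the predicted completion time approaches the due date."""

    def __init__(self, due_date, arrival_time=0.0, k=0.5):
        self.due_date = due_date      # target completion time
        self.arrival_time = arrival_time
        self.k = k                    # assumed feedback gain

    def update(self, predicted_completion):
        # Move arrival earlier if the part is predicted late, later if early.
        deviation = predicted_completion - self.due_date
        self.arrival_time = max(0.0, self.arrival_time - self.k * deviation)
        return self.arrival_time


class MachineController:
    """Q-learning machine controller (illustrative): picks a dispatching
    action, e.g. which part family or batch size to serve next."""

    def __init__(self, actions, alpha=0.1, gamma=0.95, epsilon=0.1):
        self.q = defaultdict(float)   # Q[(state, action)], default 0.0
        self.actions = actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def choose(self, state):
        if random.random() < self.epsilon:          # explore
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])  # exploit

    def learn(self, state, action, reward, next_state):
        # Standard one-step Q-learning update.
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        td_target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (td_target - self.q[(state, action)])


# Illustrative use: the state encodes the current setup family and queue
# length, and the reward is the (negative) earliness/tardiness accrued over
# the decision epoch -- all placeholder assumptions for this sketch.
machine = MachineController(actions=["family_A", "family_B"])
action = machine.choose(state=("family_A", 3))
machine.learn(("family_A", 3), action, reward=-2.0, next_state=("family_B", 2))
```

In the setting described by the abstract, the reward signal would reflect due-date deviation accumulated over a semi-Markov decision epoch rather than the fixed placeholder value used above.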
Pages: 71-87
Page count: 17