A reinforcement learning model for supply chain ordering management: An application to the beer game

Cited by: 87
Authors
Chaharsooghi, S. Kamal [1]
Heydari, Jafar [1]
Zegordi, S. Hessameddin [1]
Affiliations
[1] Tarbiat Modares Univ, Sch Engn, Dept Ind Engn, Tehran, Iran
Keywords
Supply chain; Ordering policy; Multi-agent systems; Beer game; Reinforcement learning;
DOI
10.1016/j.dss.2008.03.007
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A major challenge in supply chain ordering management is the coordination of ordering policies adopted by each level of the chain, so as to minimize inventory costs. This paper describes a new approach to decide on ordering policies of supply chain members in an integrated manner. In the first step supply chain ordering management has been considered as a multi-agent system and formulated as a reinforcement learning (RL) model. In the final step a Q-learning algorithm is proposed to solve the RL model. Results show that the reinforcement learning ordering mechanism (RLOM) is better than two other known algorithms. (C) 2008 Elsevier B.V. All rights reserved.
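The Q-learning step named in the abstract can be illustrated with a generic tabular update. This is a minimal sketch of textbook Q-learning written for cost minimization, not the paper's RLOM: this record does not give the paper's state, action, or cost definitions, so the function names (`q_update`, `choose_action`), the integer states and actions, and the `alpha`, `gamma`, and `epsilon` values are all illustrative assumptions.

```python
import random
from collections import defaultdict


def choose_action(q, state, actions, epsilon=0.1):
    """Epsilon-greedy choice: explore with probability epsilon,
    otherwise pick the action with the lowest estimated cost."""
    if random.random() < epsilon:
        return random.choice(actions)
    return min(actions, key=lambda a: q[(state, a)])


def q_update(q, state, action, cost, next_state, actions,
             alpha=0.1, gamma=0.9):
    """One tabular Q-learning step for a cost-minimizing agent.

    q maps (state, action) pairs to estimated long-run cost.
    The target uses the cheapest action in the next state.
    """
    best_next = min(q[(next_state, a)] for a in actions)
    target = cost + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


# Hypothetical usage: states and actions are plain integers here.
q_table = defaultdict(float)          # unseen pairs start at cost 0.0
order_sizes = [0, 1]                  # assumed action set
action = choose_action(q_table, 0, order_sizes)
q_update(q_table, 0, action, cost=2.0, next_state=1, actions=order_sizes)
```

In a multi-agent setting such as the beer game, one option is to give each supply chain level its own table; whether the paper does this is not stated in this record.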
Pages: 949-959
Number of pages: 11