Talk title | Multi-Agent Deep Reinforcement Learning for Multi-Echelon Inventory Management: Reducing Costs and Alleviating Bullwhip Effect |
Speaker (affiliation) | Peng Yijie (彭一杰) (Peking University) |
Hosts (affiliation) | Li Sijie (李四杰) and Chen Jing (陈静) (Southeast University) |
Time and venue | Friday, October 21, 2022, 1:00 PM; Tencent Meeting: 998-196-697 |
Abstract and content: Problem definition: We apply Multi-Agent Deep Reinforcement Learning (MADRL) to inventory management problems with multiple echelons and evaluate MADRL’s performance in minimizing the overall costs of a supply chain. We also examine whether the upfront-only information sharing mechanism used in MADRL helps alleviate the bullwhip effect in a supply chain. Methodology/results: We apply Heterogeneous-Agent Proximal Policy Optimization (HAPPO) to multi-echelon inventory management problems in both a serial supply chain and a supply chain network. Our results show that policies constructed by HAPPO achieve lower overall costs than policies constructed by single-agent deep reinforcement learning and other heuristic policies. Applying HAPPO also results in a less significant bullwhip effect than policies constructed by single-agent deep reinforcement learning, where information is not shared among actors. In addition, when applying HAPPO, the system achieves the lowest overall costs when the minimization target for each actor is a combination of its own costs and the overall costs of the system. Managerial implications: Our results buttress the empirical finding that information sharing inside the supply chain helps alleviate the bullwhip effect, even when decisions are made not by human beings but by policies constructed by MADRL. Our results also show that a certain level of coordination among actors is essential for improving a supply chain’s overall performance: neither fully self-interested actors nor fully system-focused actors lead to the optimal performance of the system. Our results verify MADRL’s potential for solving various multi-echelon inventory management problems with complex supply chain structures and non-stationary environments. | |
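The speaker is an associate professor and doctoral supervisor at the Guanghua School of Management, Peking University, and an adjunct researcher at Peking University's Institute for Artificial Intelligence and National Institute of Health Data Science. The speaker received a bachelor's degree from the School of Mathematics and Statistics, Wuhan University, and a Ph.D. from the School of Management, Fudan University, and subsequently worked in the United States as a postdoctoral researcher at the University of Maryland and as an assistant professor at George Mason University. Main research interests include simulation modeling and optimization, financial engineering and risk management, artificial intelligence, and healthcare. The speaker has led several funded research projects, including a National Science Fund for Excellent Young Scholars project, a National Young Scientists Fund project, and a Beijing young backbone talent individual project, and has published more than 30 papers in leading journals such as Operations Research, INFORMS Journal on Computing, and IEEE Transactions on Automatic Control. Honors include the 2019 INFORMS Outstanding Simulation Publication Award, 2020 Winter Simulation Conference Best Theory Paper Finalist, and 2017 IEEE Robotics and Automation Society Best Paper Award Finalist. Current service roles: associate editor of the Asia-Pacific Journal of Operational Research; area editor of the Journal of Systems & Management (《系统管理学报》); member of the IEEE Control Systems Society Conference Editorial Board; executive council member of the Financial Engineering and Financial Risk Management Branch of the Operations Research Society of China; member of the Artificial Society Technical Committee of the China Simulation Association; council member of the Social Computing Branch of the Chinese Association for Artificial Intelligence; member of the Risk Management Committee of the Chinese Society for Management Modernization; and deputy secretary-general of the Beijing Operations Research Society. |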