Constrained Reinforcement Learning for Dynamic Material Handling

Chengpeng HU, Ziming WANG, Jialin LIU, Junyi WEN, Bifei MAO, Xin YAO

Research output: Book Chapters | Papers in Conference Proceedings › Conference paper (refereed) › Research › peer-review


As one of the core parts of flexible manufacturing systems, material handling involves the storage and transportation of materials between workstations with automated vehicles. Improvements in material handling can boost the overall efficiency of the manufacturing system. However, dynamic events that occur while task arrangements are being optimised pose a challenge that demands both adaptability and effectiveness. In this paper, we address the scheduling of automated guided vehicles for dynamic material handling. Motivated by real-world scenarios, unknown new tasks and unexpected vehicle breakdowns are regarded as dynamic events in our problem. We formulate the problem as a constrained Markov decision process which takes into account tardiness and available vehicles as cumulative and instantaneous constraints, respectively. An adaptive constrained reinforcement learning algorithm that combines Lagrangian relaxation and invalid action masking, named RCPOM, is proposed to address the problem with these two hybrid constraints. Moreover, a gym-like dynamic material handling simulator, named DMH-GYM, is developed and equipped with diverse problem instances, which can be used as benchmarks for dynamic material handling. Experimental results on the problem instances demonstrate the outstanding performance of our proposed approach compared with eight state-of-the-art constrained and non-constrained reinforcement learning algorithms, and with widely used dispatching rules for material handling. © 2023 IEEE.
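The two mechanisms named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' RCPOM implementation: the function names, the learning rate, and the pairing of masking with the instantaneous (vehicle availability) constraint and of the Lagrange multiplier with the cumulative (tardiness) constraint are assumptions made for illustration only.

```python
import math

def masked_softmax(logits, valid_mask):
    """Softmax restricted to valid actions; invalid actions get probability 0.

    Hypothetical helper: valid_mask[i] is False when action i is not allowed,
    e.g. (by assumption) when the corresponding vehicle is unavailable.
    """
    if not any(valid_mask):
        raise ValueError("at least one action must be valid")
    # Invalid actions receive a logit of -inf, so exp() maps them to 0.0.
    masked = [l if ok else -math.inf for l, ok in zip(logits, valid_mask)]
    peak = max(masked)  # subtract the maximum for numerical stability
    exps = [math.exp(m - peak) for m in masked]
    total = sum(exps)
    return [e / total for e in exps]

def update_multiplier(lam, avg_cost, cost_limit, lr=0.05):
    """One projected gradient-ascent step on the Lagrange multiplier.

    The multiplier grows while the average cumulative cost (assumed here to be
    tardiness) exceeds cost_limit, and is clipped so it never goes below zero.
    """
    return max(0.0, lam + lr * (avg_cost - cost_limit))

def penalised_reward(reward, cost, lam):
    """Reward seen by the policy under Lagrangian relaxation."""
    return reward - lam * cost

if __name__ == "__main__":
    # Three candidate actions; the second is masked out as invalid.
    probs = masked_softmax([1.0, 2.0, 3.0], [True, False, True])
    print(probs)
    # The constraint is violated (5.0 > 2.0), so the multiplier increases.
    print(update_multiplier(0.0, avg_cost=5.0, cost_limit=2.0, lr=0.1))
```

In a training loop of this shape, the policy would be optimised on the penalised reward while the multiplier is updated between policy updates, and the mask would be recomputed at every decision step from the current state.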
Original language: English
Title of host publication: Proceedings of the International Joint Conference on Neural Networks
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Print): 9781665488679
Publication status: Published - 18 Jun 2023
Externally published: Yes

Bibliographical note

This work was supported by the National Natural Science Foundation of China (Grant Nos. 62250710682, 61906083), the Guangdong Provincial Key Laboratory (Grant No. 2020B121201001), the Program for Guangdong Introducing Innovative and Entrepreneurial Teams (Grant No. 2017ZT07X386), the Shenzhen Science and Technology Program (Grant No. KQTD2016112514355531), the Shenzhen Fundamental Research Program (Grant No. JCYJ20190809121403553), and the Research Institute of Trustworthy Autonomous Systems.


  • automated guided vehicle
  • benchmark
  • constrained reinforcement learning
  • dynamic material handling
  • manufacturing system

