Production Cell Operation Optimization by Reinforcement Learning

Authors

  • Bence Pálvölgyi, Research Laboratory on Engineering and Management Intelligence of the HUN-REN Institute for Computer Science and Control, Center of Excellence of the Hungarian Academy of Sciences (MTA), Budapest, 1111, HUNGARY
  • Zsolt János Viharos, Research Laboratory on Engineering and Management Intelligence of the HUN-REN Institute for Computer Science and Control, Center of Excellence of the Hungarian Academy of Sciences (MTA), Budapest, 1111, HUNGARY; Faculty of Economics and Business of the John von Neumann University, Kecskemét, 6000, HUNGARY
  • Jenő Csanaki, Opel Szentgotthárd Autóipari Kft., Szentgotthárd, 9970, HUNGARY
  • Krisztián Meskó, Opel Szentgotthárd Autóipari Kft., Szentgotthárd, 9970, HUNGARY
  • Zsolt Nagy, Opel Szentgotthárd Autóipari Kft., Szentgotthárd, 9970, HUNGARY

DOI:

https://doi.org/10.33927//hjic-2026-21

Keywords:

reinforcement learning, manufacturing operation optimization, action masking, production cell control

Abstract

Machine learning, particularly reinforcement learning, plays an increasing role in optimizing complex industrial processes. One such challenge arises in production systems, where processing products often involves nontrivial scheduling and routing problems. This paper presents a reinforcement learning (RL)-based method for optimizing a specific production cell, in which two material-moving units and several machining units must cooperate to manufacture items that require both processing and occasional cleaning. The proposed methodology models the environment as a Markov Decision Process and employs RL algorithms to maximize throughput. Several popular RL algorithms were compared, and Maskable Proximal Policy Optimization (Maskable PPO) was found to deliver the best performance, because action masking ensures valid, agent-specific, and differentiated behavior for both the material-handling and machining units. Among the various masking strategies tested, a distinct masking approach proved to be the most effective.
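
For readers unfamiliar with action masking, the following minimal Python sketch illustrates the general idea: invalid actions receive zero probability, and the remaining probabilities are renormalized over the valid actions only. It is not the authors' implementation. The function name, the four-action set (wait, load, unload, clean), and the example mask are hypothetical and chosen purely for illustration; the abstract does not specify the action space or the masks used in the paper.

```python
import math


def masked_action_probs(logits, valid_mask):
    """Softmax restricted to valid actions; invalid actions get probability 0.

    `logits` are unnormalized policy scores, `valid_mask` is a list of booleans
    of the same length. Both are illustrative assumptions, not the paper's API.
    """
    if len(logits) != len(valid_mask):
        raise ValueError("logits and valid_mask must have the same length")
    if not any(valid_mask):
        raise ValueError("at least one action must be valid")
    # Subtract the largest valid logit for numerical stability.
    top = max(l for l, ok in zip(logits, valid_mask) if ok)
    weights = [math.exp(l - top) if ok else 0.0
               for l, ok in zip(logits, valid_mask)]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical material-moving unit with actions [wait, load, unload, clean].
# Assume the target machine is occupied, so "load" is currently invalid.
logits = [0.2, 1.5, -0.3, 0.9]
valid = [True, False, True, True]
probs = masked_action_probs(logits, valid)
```

Although "load" has the highest raw score in this example, the mask removes it from consideration, so the policy can only sample among the actions that are feasible in the current state.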

Published

2026-06-01

How to Cite

Pálvölgyi, B., Viharos, Z. J., Csanaki, J., Meskó, K., & Nagy, Z. (2026). Production Cell Operation Optimization by Reinforcement Learning. Hungarian Journal of Industry and Chemistry, 54(SI), 87-94. https://doi.org/10.33927//hjic-2026-21