Production Cell Operation Optimization by Reinforcement Learning

Bence Pálvölgyi; Zsolt János Viharos; Jenő Csanaki; Krisztián Meskó; Zsolt Nagy

doi:10.33927//hjic-2026-21

Authors

Bence Pálvölgyi: Research Laboratory on Engineering and Management Intelligence of the HUN-REN Institute for Computer Science and Control, Center of Excellence of the Hungarian Academy of Sciences (MTA), Budapest, 1111, HUNGARY
Zsolt János Viharos: Research Laboratory on Engineering and Management Intelligence of the HUN-REN Institute for Computer Science and Control, Center of Excellence of the Hungarian Academy of Sciences (MTA), Budapest, 1111, HUNGARY; Faculty of Economics and Business of the John von Neumann University, Kecskemét, 6000, HUNGARY
Jenő Csanaki: Opel Szentgotthárd Autóipari Kft., Szentgotthárd, 9970, HUNGARY
Krisztián Meskó: Opel Szentgotthárd Autóipari Kft., Szentgotthárd, 9970, HUNGARY
Zsolt Nagy: Opel Szentgotthárd Autóipari Kft., Szentgotthárd, 9970, HUNGARY

DOI:

https://doi.org/10.33927//hjic-2026-21

Keywords:

reinforcement learning, manufacturing operation optimization, action masking, production cell control

Abstract

Machine learning, and reinforcement learning in particular, plays an increasingly important role in optimizing complex industrial processes. One such challenge arises in production systems, where processing products often involves nontrivial scheduling and routing problems. This paper presents a reinforcement learning (RL)-based method for optimizing a specific production cell in which two material-moving units and several machining units must cooperate to manufacture items that require both processing and occasional cleaning. The proposed methodology models the environment as a Markov Decision Process and employs RL algorithms to maximize throughput. Several popular RL algorithms were compared, and Maskable Proximal Policy Optimization (Maskable PPO) delivered the best performance, because action masking ensures valid, agent-specific, and differentiated behavior for both the material-handling and the machining units. Among the masking strategies tested, a distinct masking approach proved the most effective.
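
The core idea behind action masking can be illustrated with a minimal sketch. It is not the authors' implementation: the action names, logits, and mask values below are hypothetical, and the function only shows the general principle that invalid actions receive zero probability before an action is sampled from the policy.

```python
import math


def masked_softmax(logits, mask):
    """Turn policy scores into probabilities, forcing invalid actions to zero.

    logits: raw policy scores, one per action.
    mask:   booleans, True where the action is currently valid.
    """
    if len(logits) != len(mask):
        raise ValueError("logits and mask must have the same length")
    if not any(mask):
        raise ValueError("at least one action must be valid")
    # Subtract the largest valid logit for numerical stability.
    top = max(score for score, ok in zip(logits, mask) if ok)
    weights = [math.exp(score - top) if ok else 0.0
               for score, ok in zip(logits, mask)]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical action set for a material-moving unit (illustrative names only).
ACTIONS = ["wait", "load_machine", "unload_machine", "move_to_cleaning"]

# Assumed situation: only "wait" and "load_machine" are valid in this state.
logits = [0.2, 1.5, 3.0, -0.5]
mask = [True, True, False, False]
probs = masked_softmax(logits, mask)
```

Even though `unload_machine` has the highest raw score in this example, the mask removes it from consideration, so the agent can only choose among actions that are feasible in the current state.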
