Affiliation: | 1. School of Chemical and Biological Engineering, Institute of Chemical Processes, Seoul National University, Seoul, Republic of Korea; 2. School of Chemical and Biological Engineering, Institute of Chemical Processes, Seoul National University, Seoul, Republic of Korea. Contribution: Data curation (supporting), Software (equal), Visualization (equal), Writing - review & editing (supporting); 3. Bioprocess Engineering, Technische Universität Berlin, Berlin, Germany. Contribution: Conceptualization (equal), Methodology (equal), Resources (equal), Software (equal), Validation (supporting) |
Abstract: | As the digital transformation of bioprocesses progresses, several studies have proposed data-based methods to obtain a substrate feeding strategy that minimizes the operating cost of a semi-batch bioreactor. However, a naive application of model-free reinforcement learning (RL) is likely to fail to improve on the existing control policy because the amount of available data is limited. In this article, we propose an algorithm that integrates a double deep Q-network with model predictive control. The proposed method learns the action-value function in an off-policy fashion and solves a model-based optimal control problem in which the terminal cost is assigned by the action-value function. In a simulation study, the proposed method, a model-based method, and model-free methods are applied to an industrial-scale penicillin process. The results show that the proposed method outperforms the other methods and can learn with less data than model-free RL algorithms. |
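The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a cost-minimizing convention (so the greedy action is an argmin), a small discrete action set, and brute-force enumeration in place of a real optimal control solver. The names `double_dqn_target`, `mpc_first_action`, `model`, `stage_cost`, and `q_value` are hypothetical, and the Q-values are passed in as plain lists rather than produced by neural networks.

```python
import itertools


def double_dqn_target(cost, next_q_online, next_q_target, gamma=0.99, done=False):
    """Double-DQN style target under a cost-minimizing convention (assumption).

    The online estimate selects the next action (argmin of cost-to-go) and the
    target estimate evaluates it, which decouples selection from evaluation.
    """
    if done:
        return cost
    best = min(range(len(next_q_online)), key=lambda i: next_q_online[i])
    return cost + gamma * next_q_target[best]


def mpc_first_action(x0, actions, model, stage_cost, q_value, horizon, gamma=1.0):
    """Finite-horizon control problem whose terminal cost comes from Q.

    Enumerates every action sequence of length `horizon`, sums the discounted
    stage costs along the model rollout, and adds min_a Q(x_N, a) as the
    terminal cost. Returns the first action of the best sequence and its cost.
    """
    best_cost, best_seq = float("inf"), None
    for seq in itertools.product(actions, repeat=horizon):
        x, total = x0, 0.0
        for k, u in enumerate(seq):
            total += gamma**k * stage_cost(x, u)
            x = model(x, u)
        # Terminal cost assigned by the (hypothetical) action-value function.
        total += gamma**horizon * min(q_value(x, a) for a in actions)
        if total < best_cost:
            best_cost, best_seq = total, seq
    return best_seq[0], best_cost


# Toy usage: scalar integrator with a quadratic cost and a stand-in Q-function.
first_u, total_cost = mpc_first_action(
    x0=2.0,
    actions=(-1.0, 0.0, 1.0),
    model=lambda x, u: x + u,
    stage_cost=lambda x, u: x**2,
    q_value=lambda x, a: (x + a) ** 2,
    horizon=2,
)
```

In this sketch only the first action of the best sequence would be applied before re-solving at the next step, as in receding-horizon control. The learned action-value function stands in for the cost beyond the prediction horizon, which is how the abstract describes the link between the two methods.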