Using process data to generate an optimal control policy via apprenticeship and reinforcement learning
Authors:Max Mowbray  Robin Smith  Ehecatl A. Del Rio-Chanona  Dongda Zhang
Affiliation:1. Department of Chemical Engineering and Analytical Science, The University of Manchester, Manchester, UK;2. Department of Chemical Engineering, Imperial College London, London, UK
Abstract:Reinforcement learning (RL) is a data-driven approach to synthesizing an optimal control policy. Barriers to the wide implementation of RL-based controllers are their data-hungry nature during online training and their inability to extract useful information from human operator and historical process operation data. Here, we present a two-step framework to resolve this challenge. First, we employ apprenticeship learning via inverse RL to analyze historical process data offline, simultaneously identifying a reward function and a parameterization of the control policy. Second, the parameterization is efficiently improved online via RL, within only a few iterations, as the process operates. Significant advantages of this framework include the hot-start of RL algorithms for process optimal control and the robust abstraction of existing controllers and control knowledge from data. The framework is demonstrated on three case studies, showing its potential for chemical process control.
Keywords:apprenticeship learning  inverse reinforcement learning  machine learning  optimal control  reinforcement learning
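The abstract's two-step framework can be illustrated with a minimal sketch: Step 1 infers a linear reward from expert demonstrations offline (here, a crude visitation-frequency surrogate standing in for the paper's inverse-RL reward identification), and Step 2 improves the policy online with tabular Q-learning against the inferred reward. The setpoint-tracking MDP, the one-hot features, and all names below are illustrative assumptions, not the paper's actual case studies.

```python
import random

N_STATES, ACTIONS = 5, (-1, 0, +1)
GOAL = 2  # the expert's implicit setpoint (unknown to the learner)

def step(s, a):
    """Deterministic 1-D chain dynamics, clipped to the state bounds."""
    return min(N_STATES - 1, max(0, s + a))

# --- Step 1 (offline): apprenticeship learning from historical data ---
# Expert demonstrations: trajectories that drift toward the setpoint.
random.seed(0)
demos = []
for _ in range(50):
    s = random.randrange(N_STATES)
    traj = []
    for _ in range(10):
        a = 0 if s == GOAL else (1 if s < GOAL else -1)
        traj.append((s, a))
        s = step(s, a)
    demos.append(traj)

# Infer a linear reward w·phi(s) with one-hot features phi(s): weight
# each state by its visitation frequency in the demonstrations.  This is
# a simplified stand-in for inverse-RL reward identification.
counts = [0.0] * N_STATES
for traj in demos:
    for s, _ in traj:
        counts[s] += 1
total = sum(counts)
w = [c / total for c in counts]  # inferred reward vector

# --- Step 2 (online): improve the policy with tabular Q-learning ---
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, eps = 0.2, 0.9, 0.1
s = 0
for _ in range(5000):
    # epsilon-greedy action selection
    if random.random() < eps:
        a = random.choice(ACTIONS)
    else:
        a = max(ACTIONS, key=lambda a: Q[(s, a)])
    s2 = step(s, a)
    r = w[s2]  # reward comes from the inferred function, not the true one
    Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, b)] for b in ACTIONS)
                          - Q[(s, a)])
    s = s2

# Greedy policy: should steer every state toward the expert's setpoint.
policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES)}
print(policy)
```

In the paper's setting, Step 1 would additionally yield a policy parameterization for hot-starting; here the inferred reward alone warm-starts the online phase, which is why only a modest number of Q-learning iterations is needed.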