Similar Documents
20 similar documents found (search time: 10 ms)
1.
Reinforcement learning (RL) is a data-driven approach to synthesizing an optimal control policy. A barrier to wide implementation of RL-based controllers is their data-hungry nature during online training and their inability to extract useful information from human operators and historical process operation data. Here, we present a two-step framework to resolve this challenge. First, we employ apprenticeship learning via inverse RL to analyze historical process data offline, simultaneously identifying a reward function and a parameterization of the control policy. Second, the parameterization is efficiently improved online, during ongoing process operation, via RL within only a few iterations. Significant advantages of this framework include allowing a hot start of RL algorithms for optimal process control, and robust abstraction of existing controllers and control knowledge from data. The framework is demonstrated on three case studies, showing its potential for chemical process control.
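The two-step structure described above (offline inverse RL on historical data, then a short online RL refinement on the learned reward) can be illustrated with a deliberately small sketch. Everything below — the 1-D process, the reward features, the operator-like policy — is a hypothetical stand-in, not the paper's actual case studies:

```python
import numpy as np

# Toy 1-D process: state x is the deviation from setpoint; discrete actions.
ACTIONS = np.array([-1.0, 0.0, 1.0])
rng = np.random.default_rng(0)

def step(x, a):
    return 0.9 * x + 0.5 * a           # assumed linear dynamics

def features(x):
    return np.array([-x**2, -abs(x)])  # candidate reward features

# --- Step 1 (offline): inverse RL via feature-expectation matching ---
def avg_features(policy, n=30):
    x, phis = 2.0, []
    for _ in range(n):
        phis.append(features(x))
        x = step(x, policy(x))
    return np.mean(phis, axis=0)

mu_expert = avg_features(lambda x: -np.sign(x))          # operator-like data
mu_random = avg_features(lambda x: rng.choice(ACTIONS))  # baseline policy
w = mu_expert - mu_random      # max-margin direction separating the two
w /= np.linalg.norm(w)

def reward(x):
    return w @ features(x)     # learned reward, used for the hot start

# --- Step 2 (online): tabular Q-learning on the learned reward ---
xs = np.linspace(-3, 3, 31)
idx = lambda x: int(np.clip(np.searchsorted(xs, x), 0, len(xs) - 1))
Q = np.zeros((len(xs), len(ACTIONS)))

x = 2.0
for _ in range(5000):
    i = idx(x)
    a = rng.integers(len(ACTIONS)) if rng.random() < 0.1 else int(Q[i].argmax())
    x2 = step(x, ACTIONS[a])
    Q[i, a] += 0.1 * (reward(x2) + 0.95 * Q[idx(x2)].max() - Q[i, a])
    x = x2 if abs(x2) < 3 else rng.uniform(-2, 2)

greedy = lambda x: ACTIONS[int(Q[idx(x)].argmax())]
```

The learned reward weights should be positive (the expert keeps both features closer to zero than the random baseline), and the hot-started greedy policy should regulate the state toward the setpoint.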

2.
Advanced model-based controllers are well established in the process industries. However, such controllers require regular maintenance to sustain acceptable performance. It is common practice to monitor controller performance continuously and to initiate a remedial model re-identification procedure in the event of performance degradation. Such procedures are typically complicated and resource-intensive, and they often cause costly interruptions to normal operations. In this article, we exploit recent developments in reinforcement learning and deep learning to develop a novel adaptive, model-free controller for general discrete-time processes. The deep reinforcement learning (DRL) controller we propose is a data-based controller that learns the control policy in real time by merely interacting with the process. The effectiveness and benefits of the DRL controller are demonstrated through extensive simulations.

3.
As the digital transformation of bioprocessing progresses, several studies have proposed data-based methods to obtain a substrate feeding strategy that minimizes the operating cost of a semi-batch bioreactor. However, naive application of model-free reinforcement learning (RL) is likely to fail to improve the existing control policy because the available amount of data is limited. In this article, we propose an algorithm that integrates a double deep Q-network with model predictive control. The proposed method learns the action-value function in an off-policy fashion and solves a model-based optimal control problem in which the terminal cost is given by the action-value function. In a simulation study, the proposed method, a model-based method, and model-free methods are applied to an industrial-scale penicillin process. The results show that the proposed method outperforms the other methods and can learn from less data than model-free RL algorithms.
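The core idea — a short-horizon MPC whose terminal cost is supplied by a learned action-value function — can be sketched as follows. The one-step model, the stage cost, and the quadratic `q_value` stand-in for a trained double-DQN head are all assumptions for illustration, not the paper's penicillin model:

```python
import numpy as np
from itertools import product

# Toy semi-batch surrogate: state x (e.g., a concentration deviation),
# with a small set of discrete feed rates.
ACTIONS = np.array([0.0, 0.5, 1.0])

def model(x, u):                 # assumed one-step process model used by the MPC
    return 0.8 * x + u - 0.4

def stage_cost(x, u):
    return x**2 + 0.1 * u

def q_value(x, u):
    # Hand-set quadratic stand-in for the action-value function learned
    # off-policy by the double-DQN: Q(x, u) ~ -(cost-to-go).
    return -(2.5 * (0.8 * x + u - 0.4) ** 2)

def mpc_with_learned_terminal_cost(x0, horizon=3):
    """Enumerate short input sequences; terminal cost = -max_u Q(x_N, u)."""
    best_cost, best_seq = np.inf, None
    for seq in product(ACTIONS, repeat=horizon):
        x, cost = x0, 0.0
        for u in seq:
            cost += stage_cost(x, u)
            x = model(x, u)
        cost += -max(q_value(x, u) for u in ACTIONS)  # learned terminal cost
        if cost < best_cost:
            best_cost, best_seq = cost, seq
    return best_seq[0], best_cost
```

Only the first input of the optimal sequence is applied, receding-horizon style; the learned terminal cost lets the horizon stay short without losing the long-term objective.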

4.
This work develops a transfer learning (TL) framework for modeling and predictive control of nonlinear systems using recurrent neural networks (RNNs), with the knowledge obtained in modeling one process transferred to another. Specifically, transfer learning uses a pretrained model developed on a source domain as the starting point and adapts the model to a target process with similar configurations. The generalization error for the TL-based RNN (TL-RNN) is first derived to demonstrate its generalization capability on the target process. The theoretical error bound, which depends on model capacity and the discrepancy between source and target domains, is then used to guide the development of pretrained models for improved model transferability. Subsequently, the TL-RNN model is used as the prediction model in a model predictive controller (MPC) for the target process. Finally, a simulation study of chemical reactors in Aspen Plus Dynamics demonstrates the benefits of transfer learning.
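One way to picture source-to-target transfer, stripped down from RNNs to a scalar linear model, is to fit the target process with a regularizer that pulls the parameters toward the pretrained source model. All numbers (dynamics, noise level, regularization weight) are assumptions for this sketch, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(a, b, n):
    """Scalar process y+ = a*y + b*u + noise (a stand-in for an RNN task)."""
    y, X, Y = 0.0, [], []
    for _ in range(n):
        u = rng.uniform(-1, 1)
        y_next = a * y + b * u + 0.01 * rng.normal()
        X.append([y, u]); Y.append(y_next)
        y = y_next
    return np.array(X), np.array(Y)

def fit(X, Y, w0=None, l2=0.0):
    """Least squares; if w0 is given, regularize toward the pretrained weights."""
    A = X.T @ X + l2 * np.eye(2)
    b = X.T @ Y + (l2 * w0 if w0 is not None else 0.0)
    return np.linalg.solve(A, b)

# Source process: plentiful data. Target: similar dynamics, little data.
Xs, Ys = make_data(a=0.80, b=0.50, n=500)
Xt, Yt = make_data(a=0.85, b=0.45, n=15)

w_source = fit(Xs, Ys)                          # "pretrained" model
w_scratch = fit(Xt, Yt)                         # target-only model
w_transfer = fit(Xt, Yt, w0=w_source, l2=5.0)   # fine-tuned from the source

Xv, Yv = make_data(a=0.85, b=0.45, n=200)       # target validation set
mse = lambda w: np.mean((Xv @ w - Yv) ** 2)
```

The regularization weight trades off the bias from the source-target discrepancy against the variance from scarce target data — the same trade-off the paper's generalization bound formalizes for RNNs.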

5.
We propose a new reinforcement learning approach to nonlinear optimal control in which the value function update is restricted to the class of control Lyapunov functions (CLFs) and the policy is improved using a variation of Sontag's formula. Practical asymptotic stability of the closed-loop system is guaranteed both during and after training, without requiring an additional actor network and its update rule. For a single-layer neural network (NN) with exact basis functions, the approximate function converges to the optimal value function, yielding the optimal controller. When a deep NN is used, the level sets of the trained NN become similar in shape to those of the optimal value function. Because Sontag's formula with a CLF is equivalent to the optimal controller when the given CLF has the same level-set shapes as the optimal value function, Sontag's formula with the trained NN provides a nearly optimal controller.
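For reference, the variation of Sontag's formula mentioned here builds on the standard universal formula. For a control-affine system $\dot{x} = f(x) + g(x)u$ with CLF $V$, the standard form reads (the paper's variation may differ in detail):

```latex
u(x) =
\begin{cases}
-\dfrac{a(x) + \sqrt{a(x)^2 + \lVert b(x)\rVert^4}}{\lVert b(x)\rVert^2}\, b(x), & b(x) \neq 0,\\[2ex]
0, & b(x) = 0,
\end{cases}
\qquad
a(x) = \nabla V(x)^\top f(x), \quad
b(x) = g(x)^\top \nabla V(x).
```

This control law renders $\dot{V} < 0$ wherever the CLF condition holds, which is why restricting the value-function approximation to CLFs preserves stability throughout training.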

6.
An iterative learning control strategy for product quality in batch processes
贾立  施继平  邱铭森 《化工学报》2009,60(8):2017-2023
Convergence analysis is difficult for iterative-learning-control-based product quality optimization algorithms in batch processes. To address this, a novel iterative learning control method for batch-process product quality is proposed, built on a data-driven neuro-fuzzy model. By adding new constraints to the optimization algorithm, the search space of the optimal solution is modified so that product quality converges along the batch axis, and a rigorous mathematical proof of the convergence of the optimization problem is given. Building on this theoretical work, the proposed algorithm is applied to end-point quality control of a batch reactor; simulation results verify the effectiveness and practical value of the algorithm, providing a new route to the optimal control of batch processes.

7.
Safety is a critical factor in reinforcement learning (RL) for chemical processes. In our previous work, we proposed a stability-guaranteed RL method for unconstrained nonlinear control-affine systems. In that approximate policy iteration algorithm, a Lyapunov neural network (LNN) was updated while being restricted to a control Lyapunov function, and the policy was updated using a variation of Sontag's formula. In this study, we additionally consider state and input constraints by introducing a barrier function, and we extend the applicability to general nonlinear systems. We augment the constraints into the objective function and approximate the augmented value function with the LNN plus a Lyapunov barrier function. Sontag's formula applied to this approximate function drives the states into its lower level sets, thereby guaranteeing constraint satisfaction and stability. We prove practical asymptotic stability and forward invariance. The effectiveness is validated in simulations of a four-tank system.
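One common way to fold a state constraint $h(x) \le 0$ into the value function is a logarithmic barrier. The construction below is a generic illustration of the idea, not necessarily the exact Lyapunov barrier function used in the paper:

```latex
W(x) \;=\; V(x) \;+\; \mu\, B(x),
\qquad
B(x) \;=\; -\log\bigl(-h(x)\bigr), \quad \mu > 0 .
```

Since $W(x) \to \infty$ as $h(x) \to 0^{-}$, any controller that keeps $W$ decreasing (e.g., Sontag's formula applied to $W$) confines the state to a sublevel set of $W$, and hence strictly inside the constraint set — giving constraint satisfaction and stability together.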

8.
Automated flowsheet synthesis is an important field in computer-aided process engineering. The present work demonstrates how reinforcement learning can be used for automated flowsheet synthesis without any heuristics or prior knowledge of conceptual design. The environment consists of a steady-state flowsheet simulator that contains all physical knowledge. An agent is trained to take discrete actions and sequentially build up flowsheets that solve a given process problem. A novel method named SynGameZero is developed to ensure good exploration schemes in this complex problem. Therein, flowsheet synthesis is modelled as a game of two competing players. The agent plays this game against itself during training and consists of an artificial neural network and a tree search for forward planning. The method is applied successfully to a reaction-distillation process in a quaternary system.

9.
10.
For multi-phase batch processes with input delay and subject to actuator faults, an infinite-horizon optimal hybrid fault-tolerant controller design method is proposed. The method first transforms the given input-delay model into a new delay-free state-space model, then converts it into an extended state-space model containing the state error and the output tracking error, represented as a switched system. A finite-horizon quadratic objective function is then introduced and, using optimal control theory, a fault-tolerant controller is designed over the infinite horizon. To minimize the running time, a Lyapunov-function-dependent dwell-time method is designed for the different phases. The novelty lies in the simple control-law design and low computational load; the duration of each phase is obtained without reference to any other variables, making the method simple and practical. Finally, taking an injection molding process as an example, simulation results demonstrate the feasibility and effectiveness of the proposed method.

11.
Iterative learning fault-tolerant control of batch processes based on T-S fuzzy models
Batch processes are not only strongly nonlinear but also subject to faults such as actuator faults, so keeping a nonlinear batch process running stably under faults is of vital importance. For actuator gain faults and strong nonlinearity, a new composite iterative learning fault-tolerant control method based on a T-S fuzzy model of the batch process is proposed. First, a T-S fuzzy fault model is built from the nonlinear batch-process model using the sector nonlinearity approach. Then, exploiting the two-dimensional (2D) and repetitive characteristics of batch processes, a 2D composite ILC fault-tolerant controller is designed within the framework of 2D systems theory, and an equivalent 2D Roesser model of the T-S fuzzy model is constructed. Sufficient conditions for system stability are derived via the Lyapunov method, and the controller gains are solved for. Simulations of a strongly nonlinear continuous stirred tank reactor demonstrate the feasibility and effectiveness of the proposed method.

12.
Iterative learning control of batch processes based on input trajectory parameterization
For the iterative learning control problem of batch processes, an iterative learning control strategy based on input trajectory parameterization is proposed. According to the main shape features of the optimal input trajectory, the trajectory is parameterized into a small number of decision variables, reducing the complexity of conventional iterative learning control while maintaining good optimization performance. The parameterized strategy keeps the algorithm simple and easy to implement, and gradually improves product quality under uncertain disturbances. A simulation study of a batch reactor verifies the effectiveness of the method.
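A minimal numerical sketch of this idea — an input trajectory compressed into a handful of knot values that are refined batch-to-batch — assuming a toy saturating batch model and finite-difference sensitivities (neither taken from the paper):

```python
import numpy as np

def input_profile(theta, T=50):
    """Piecewise-linear input trajectory from a few knot values (decision vars)."""
    knots = np.linspace(0, T - 1, len(theta))
    return np.interp(np.arange(T), knots, theta)

def batch_quality(u, rng=None):
    """Toy batch model: end quality from integrated input, mild measurement noise."""
    x = 0.0
    for ut in u:
        x += 0.02 * ut * (1.0 - x)       # saturating growth
    noise = 0.0 if rng is None else 0.01 * rng.normal()
    return x + noise

target = 0.6
theta = np.zeros(3)                       # only 3 decision variables
rng = np.random.default_rng(0)

for batch in range(40):                   # batch-to-batch improvement
    q = batch_quality(input_profile(theta), rng)   # measured end quality
    err = target - q
    # Sensitivity of quality w.r.t. each knot, estimated by perturbation:
    grad = np.zeros_like(theta)
    q0 = batch_quality(input_profile(theta))
    for i in range(len(theta)):
        tp = theta.copy()
        tp[i] += 1e-3
        grad[i] = (batch_quality(input_profile(tp)) - q0) / 1e-3
    theta += 2.0 * err * grad             # ILC-style update along the sensitivity

final_q = batch_quality(input_profile(theta))
```

With three knots instead of fifty free input values, each batch-to-batch update is cheap, yet the end quality still converges to the target.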

13.
14.
Process synthesis is experiencing a disruptive transformation accelerated by artificial intelligence. We propose a reinforcement learning algorithm for chemical process design based on state-of-the-art actor-critic logic. Our algorithm represents chemical processes as graphs and uses graph convolutional neural networks to learn from process graphs. In particular, the graph neural networks are implemented within the agent architecture to process the states and make decisions. We implement a hierarchical, hybrid decision-making process to generate flowsheets, in which unit operations are placed iteratively as discrete decisions and the corresponding design variables are selected as continuous decisions. We demonstrate the potential of our method to design economically viable flowsheets in an illustrative case study comprising equilibrium reactions, azeotropic separation, and recycles. The results show quick learning in discrete, continuous, and hybrid action spaces. The method is well suited to incorporate large state-action spaces and an interface to process simulators in future research.
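The hierarchical hybrid action space — a discrete unit placement followed by a continuous design variable — can be sketched with a two-head policy. The linear heads below are a hypothetical stand-in for the paper's graph convolutional agent, and the unit list is invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

UNITS = ["reactor", "column", "recycle", "terminate"]  # hypothetical action set
STATE_DIM, N_UNITS = 8, len(UNITS)

# Two-head agent: one head scores discrete unit placements, the other
# proposes a continuous design variable for the chosen unit.
W_disc = rng.normal(scale=0.1, size=(STATE_DIM, N_UNITS))
W_cont = rng.normal(scale=0.1, size=(STATE_DIM, N_UNITS))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def hierarchical_action(state):
    probs = softmax(state @ W_disc)                  # discrete decision
    unit = rng.choice(N_UNITS, p=probs)
    mean = state @ W_cont[:, unit]                   # continuous decision,
    design_var = np.tanh(mean + 0.1 * rng.normal())  # squashed to (-1, 1)
    return UNITS[unit], float(design_var), float(probs[unit])

state = rng.normal(size=STATE_DIM)       # stand-in for the flowsheet-graph embedding
unit, dv, p = hierarchical_action(state)
```

Flowsheets are then built by calling this step repeatedly until the "terminate" action is drawn; in the actual method the state would be the graph embedding produced by the GNN and the reward would come from a flowsheet simulator.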

15.
Closed‐loop stability of nonlinear systems under real‐time Lyapunov‐based economic model predictive control (LEMPC) with potentially unknown and time‐varying computational delay is considered. To address guaranteed closed‐loop stability (in the sense of boundedness of the closed‐loop state in a compact state‐space set), an implementation strategy is proposed which features a triggered evaluation of the LEMPC optimization problem to compute an input trajectory over a finite‐time prediction horizon in advance. At each sampling period, stability conditions must be satisfied for the precomputed LEMPC control action to be applied to the closed‐loop system. If the stability conditions are not satisfied, a backup explicit stabilizing controller is applied over the sampling period. Closed‐loop stability under the real‐time LEMPC strategy is analyzed and specific stability conditions are derived. The real‐time LEMPC scheme is applied to a chemical process network example to demonstrate closed‐loop stability and closed‐loop economic performance improvement over that achieved for operation at the economically optimal steady state. © 2014 American Institute of Chemical Engineers AIChE J, 61: 555–571, 2015
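The implementation strategy described above — apply the precomputed LEMPC input only when stability conditions hold, otherwise fall back to an explicit controller — can be sketched on a scalar system. The plant, the planner, the fallback law, and the stability test below are all simplified assumptions, not the paper's conditions:

```python
def lempc_plan(x, horizon):
    """Stand-in for the LEMPC optimization: a precomputed input trajectory."""
    return [-0.5 * x * 0.8**k for k in range(horizon)]

def backup_controller(x):
    """Assumed explicit stabilizing fallback (simple proportional law)."""
    return -0.9 * x

def lyapunov(x):
    return x * x

def real_time_lempc(x0, steps=20, horizon=5, rho=4.0):
    """Apply the precomputed input only when the stability conditions hold."""
    x, plan, k, used_backup = x0, [], 0, 0
    for _ in range(steps):
        if k >= len(plan):                    # triggered LEMPC re-evaluation
            plan, k = lempc_plan(x, horizon), 0
        u = plan[k]
        k += 1
        x_pred = 0.9 * x + u                  # assumed plant model
        # Stability check: the precomputed input must keep V inside the level
        # set rho and decrease V outside a small neighborhood of steady state.
        if lyapunov(x_pred) > rho or (abs(x) > 0.1 and lyapunov(x_pred) >= lyapunov(x)):
            u = backup_controller(x)          # fall back for this sampling period
            used_backup += 1
        x = 0.9 * x + u
    return x, used_backup
```

Because the plan is computed in advance, a slow or delayed optimization never leaves the loop without an input: the stale plan entry is used when it passes the check, and the backup controller covers the rest.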

16.
To address a class of two-dimensional (2D) control problems for batch chemical processes, we propose a novel high-order iterative learning model predictive control (HILMPC) method. A set of local state-space models is first constructed to represent the batch chemical process by adopting the just-in-time learning (JITL) technique. Meanwhile, a pre-clustering strategy is used to lessen the computational burden of modelling and to improve modelling efficiency. Then, a two-stage 2D controller is designed to achieve integrated control by combining high-order iterative learning control (HILC) on the batch domain with model predictive control (MPC) on the time domain. The resulting HILMPC controller guarantees not only convergence of the system on the batch domain but also closed-loop stability on the time domain. Convergence of the HILMPC method is established by rigorous analysis. Finally, two examples are presented to demonstrate that the developed method provides better control performance than its previous counterpart.
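A high-order ILC update — the batch-domain half of the scheme above — blends the inputs and tracking errors of *several* previous batches rather than just the last one. The toy plant and the gains below are assumptions chosen to converge, not values from the paper:

```python
import numpy as np

def run_batch(u):
    """Toy batch plant: y[t] = 0.5*y[t-1] + u[t-1], zero initial condition."""
    y = np.zeros(len(u) + 1)
    for t in range(len(u)):
        y[t + 1] = 0.5 * y[t] + u[t]
    return y[1:]

T = 20
y_ref = np.ones(T)

# Second-order ILC: the new input blends the two previous batches' inputs
# and their tracking errors.  Gains are assumed, not from the paper.
lam = [0.7, 0.3]        # input-blending weights (sum to 1)
gam = [0.6, 0.2]        # learning gains on past errors

U = [np.zeros(T), np.zeros(T)]                 # inputs of batches k-1, k
E = [y_ref - run_batch(U[0]), y_ref - run_batch(U[1])]

for k in range(30):                            # batch-to-batch iteration
    u_new = lam[0] * U[-1] + lam[1] * U[-2] + gam[0] * E[-1] + gam[1] * E[-2]
    e_new = y_ref - run_batch(u_new)
    U.append(u_new)
    E.append(e_new)

final_err = np.max(np.abs(E[-1]))
```

Using more than one past batch gives extra design freedom to trade convergence speed against robustness; in the full HILMPC scheme, an MPC layer then handles within-batch (time-domain) disturbances that pure ILC cannot.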

17.
18.
Assessing the quality of industrial control loops is an important auditing task for the control engineer. However, there are complications when considering the ubiquitous nonlinearities present in many industrial control loops. If one simply ignores these nonlinearities, there is the danger of over‐estimating the performance of the control loop in rejecting disturbances and thereby possibly overlooking loops that need attention. To address this problem, several techniques have been recently developed to extend the control performance assessment (CPA) of single input/single output linear systems to nonlinear systems. This article surveys these nonlinear CPA techniques and compares their performances using three case studies. These results can be used to guide control engineers to select the most suitable CPA techniques for their particular applications. © 2012 Canadian Society for Chemical Engineering

19.
The development of process systems engineering and the challenges it faces
Process systems engineering is an important and vigorously developing discipline. This paper briefly reviews the evolution of the discipline, then discusses its contributions as well as its gaps and open problems, and finally points out the challenges and development opportunities facing process systems engineering in the 21st century.

20.
洪英东  熊智华  江永亨  叶昊 《化工学报》2017,68(7):2826-2832
For the point-to-point tracking control problem of batch processes, within the framework of iterative learning control with trajectory updating, three types of initial errors arising from non-ideal initial states are described by a 2D Roesser model and their convergence is analyzed. Conditions are given under which the system achieves zero-error tracking of the reference trajectory, or convergence to a specified neighborhood; where zero-error tracking cannot be achieved, the size of that neighborhood is given. Numerical simulations verify the derived convergence conditions and convergence bounds, and the influence of different factors on the convergence bound is analyzed.
