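Research on Control Strategies for Long-Delay Processes Based on Reinforcement Learning
Cite this article: DENG Hao-nan (邓颢楠), LIU Shu-bo (刘树波), LI Dan (李丹), CAO Hui (曹辉). Research on Control Strategies for Long-Delay Processes Based on Reinforcement Learning [J]. Control Engineering of China (控制工程), 2021, 28(1): 35-41.
Authors: DENG Hao-nan (邓颢楠), LIU Shu-bo (刘树波), LI Dan (李丹), CAO Hui (曹辉)
Affiliations: School of Computer Science, Wuhan University, Wuhan 430072, Hubei, China; Hubei Water Resources Research Institute, Wuhan 430072, Hubei, China
Funding: National Science and Technology Major Project (2017ZX07108-001).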
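Abstract: Control of processes with long time delays is widely regarded as one of the harder problems in process control. Model predictive control (MPC) is a new process control method suited to long-delay processes. Compared with traditional methods such as PID, MPC decides on the basis of model predictions of future states, so it can combine timely feedback with long-term planning. The number of steps MPC can predict, however, is still finite. Reinforcement learning, an important part of machine learning, can in principle predict the return of a policy over an infinitely long time. The authors use reinforcement learning to improve the control algorithm for a coagulant dosing process and train the model on a large amount of simulation data, which improves the control performance of that process. Simulations of the method, compared against traditional MPC, demonstrate that the control method improved with reinforcement learning performs better overall than traditional MPC in long-delay process control.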

Keywords: model predictive control; reinforcement learning; long time delay; process control; long-term return

Process Control Strategy with Long Delay Based on Reinforcement Learning
DENG Hao-nan, LIU Shu-bo, LI Dan, CAO Hui. Process Control Strategy with Long Delay Based on Reinforcement Learning [J]. Control Engineering of China, 2021, 28(1): 35-41.
Authors: DENG Hao-nan, LIU Shu-bo, LI Dan, CAO Hui
Affiliation: School of Computer Science, Wuhan University, Wuhan 430072, China; Hubei Water Resources Research Institute, Wuhan 430072, China
Abstract: Process control with long delay is known to be a difficult process control problem. Model predictive control (MPC) is a new process control method suitable for long-delay processes. Compared with traditional control methods such as proportional-integral-derivative (PID) control, MPC makes decisions based on its prediction of future states, which lets it balance timely feedback and long-term planning. However, the number of steps MPC can predict is limited. Reinforcement learning, an important part of machine learning, can in principle predict the reward of a strategy over an infinite period of time. We used reinforcement learning to improve the control method for the coagulant dosing process and trained the model on a large amount of simulation data, which improved the control performance of the process. Simulations comparing the approach with the traditional MPC method demonstrate that the control method improved by reinforcement learning performs better overall than traditional MPC in this long-delay control process.
Keywords: MPC; RL; long delay; process control; long-term reward
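
The contrast the abstract draws between MPC's limited prediction horizon and the long-run return considered in reinforcement learning can be illustrated with a toy calculation. The sketch below is not the authors' method and does not model their coagulant dosing process: the first-order process, its coefficients, the delay, the horizon, and the discount factor are all hypothetical values chosen for illustration, and the "infinite" return is approximated by a long truncated discounted sum.

```python
# Toy illustration (all numbers are assumptions, not taken from the paper).

def simulate_costs(u, delay, steps, setpoint=1.0, a=0.8, b=0.2):
    """Per-step squared tracking error for a constant input u applied to a
    first-order process whose input only takes effect after `delay` steps."""
    y = 0.0
    costs = []
    for t in range(steps):
        # Before the delay has elapsed, the process still sees zero input.
        u_eff = u if t >= delay else 0.0
        y = a * y + b * u_eff
        costs.append((setpoint - y) ** 2)
    return costs


def finite_horizon_cost(costs, horizon):
    """Cost summed over a limited prediction horizon (MPC-style objective)."""
    return sum(costs[:horizon])


def discounted_cost(costs, gamma):
    """Discounted cost over the whole rollout; a truncated stand-in for the
    infinite-horizon return that reinforcement learning estimates."""
    return sum(gamma ** t * c for t, c in enumerate(costs))


if __name__ == "__main__":
    DELAY, HORIZON, STEPS, GAMMA = 10, 5, 300, 0.99
    for u in (0.0, 1.0):
        costs = simulate_costs(u, DELAY, STEPS)
        print(
            f"u={u}: finite-horizon cost={finite_horizon_cost(costs, HORIZON):.3f}, "
            f"discounted cost={discounted_cost(costs, GAMMA):.3f}"
        )
```

Because the assumed horizon (5 steps) is shorter than the assumed delay (10 steps), both inputs yield the same short-horizon cost, so an objective limited to that horizon cannot tell them apart. The discounted sum over the long rollout is lower for `u=1.0`, the input that eventually brings the output to the setpoint. A practical reinforcement learning controller would learn a value function from data rather than roll out a known model as this sketch does.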
This article is indexed in databases including VIP (维普) and Wanfang Data (万方数据).