Online Reinforcement Learning Control Algorithm for Concentration of Thickener Underflow

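Cite this article:	Yuan Zhaolin, He Runzi, Yao Chao, Li Jia, Ban Xiaojuan. Online Reinforcement Learning Control Algorithm for Concentration of Thickener Underflow [J]. Acta Automatica Sinica (自动化学报), 2021, 47(7): 1558-1571.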

Authors:	Yuan Zhaolin (袁兆麟), He Runzi (何润姿), Yao Chao (姚超), Li Jia (李佳), Ban Xiaojuan (班晓娟)

Affiliation:	1. School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083

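Funding:	Supported by the Key Research and Development Program of Hainan Province (ZDYF2019009), the National Key Basic Research Development Program (2019YFC0605300, 2016YFB0700500), and the National Natural Science Foundation of China (61572075, 61702036, 61873299)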

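Abstract:	Control of complex process industries has long been a frontier problem in control applications. The thickener, a large and complex piece of industrial equipment, is widely used in metallurgy, mining, and related fields. Because its operation is multivariable, nonlinear, and subject to long time delays, control of the thickener underflow concentration has long been both a difficulty and a focus of research in academia and industry. This paper proposes an online control algorithm for thickeners based on reinforcement learning. Building on the traditional heuristic dynamic programming (HDP) algorithm, the method uses a dual-network structure that combines a critic network with a model network, and introduces a short-term experience replay method to improve the training accuracy of the critic network. The algorithm achieves stable control of the underflow concentration while keeping the control input within a set range. Finally, simulation experiments on a thickener verify the effectiveness of the algorithm; the results show that the proposed method outperforms other algorithms in time consumption and control accuracy.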
Keywords:	adaptive dynamic programming; reinforcement learning; optimal control; thickener control; neural network
Received:	2019-05-10
Online Reinforcement Learning Control Algorithm for Concentration of Thickener Underflow

Affiliation:	1. School of Computer and Communication Engineering, University of Science & Technology Beijing, Beijing 100083

Abstract:	The control of complex process industries is a widely studied problem in control applications. The thickener, a large and complex piece of industrial equipment, is widely used in metallurgy, mining, and other fields. Because its operation is multivariable, nonlinear, and subject to long delays, controlling the underflow concentration of a thickener has long been a difficult and actively studied problem in academia and industry. This paper proposes a novel online control algorithm for thickeners based on reinforcement learning. Building on the traditional heuristic dynamic programming (HDP) algorithm, the proposed method uses a dual-network framework composed of a critic network and a model network. To stabilize the underflow concentration, a method that replays short-term historical data is proposed for the training phase of the critic network. Simulation experiments verify the effectiveness of the proposed method. The results show that it holds the underflow concentration within a stable range and outperforms other algorithms in accuracy and time consumption.
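
The abstract names two ingredients: a critic trained with short-term experience replay, and a model network used alongside it to pick control inputs that stay within a set range. The Python sketch below illustrates one plausible reading of those ideas and is not the authors' implementation. The class name `ShortTermReplayCritic`, the linear critic (standing in for a neural network), the window size, the learning rate, the `model` and `utility` callables, and the search over a grid of candidate inputs are all assumptions made for illustration.

```python
from collections import deque


class ShortTermReplayCritic:
    """Linear critic J(x) = w . x, a stand-in for a critic network (assumption)."""

    def __init__(self, n_features, window=5, gamma=0.9, lr=0.05):
        self.w = [0.0] * n_features
        # Sliding window: only the most recent `window` transitions are kept.
        self.buffer = deque(maxlen=window)
        self.gamma = gamma
        self.lr = lr

    def value(self, x):
        """Estimated discounted cost-to-go from feature vector x."""
        return sum(wi * xi for wi, xi in zip(self.w, x))

    def observe(self, x, cost, x_next):
        """Store one transition; the oldest one is dropped when the window is full."""
        self.buffer.append((x, cost, x_next))

    def train(self, sweeps=1):
        """Replay the stored recent transitions with temporal-difference updates."""
        for _ in range(sweeps):
            for x, cost, x_next in self.buffer:
                target = cost + self.gamma * self.value(x_next)
                td_error = target - self.value(x)
                self.w = [wi + self.lr * td_error * xi
                          for wi, xi in zip(self.w, x)]


def choose_action(critic, model, x, candidates, utility):
    """Pick the candidate input with the lowest predicted cost.

    `model(x, u)` predicts the next state and `utility(x_next, u)` is the
    one-step cost; both are hypothetical callables. Restricting `candidates`
    to the permitted range keeps the chosen input inside that range.
    """
    def score(u):
        x_next = model(x, u)
        return utility(x_next, u) + critic.gamma * critic.value(x_next)

    return min(candidates, key=score)
```

In this sketch, `observe` would be called once per control step and `train` would then replay the window several times before `choose_action` selects the next input. A real controller would replace the linear critic and the grid search with whatever networks and optimizer the full paper specifies.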

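Keywords:	adaptive dynamic programming; reinforcement learning; optimal control; thickener control; neural network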
