Parameter self-tuning and optimization algorithm based on reinforcement learning (基于强化学习的参数自整定及优化算法)

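Cite this article:	YAN Jiazheng, ZHUAN Xiangtao. Parameter self-tuning and optimization algorithm based on reinforcement learning[J]. CAAI Transactions on Intelligent Systems (智能系统学报), 2022, 17(2): 341-347.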

Authors:	YAN Jiazheng (严家政), ZHUAN Xiangtao (专祥涛)

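Affiliation:	1. School of Electrical Engineering and Automation, Wuhan University, Wuhan, Hubei 430072, China; 2. Shenzhen Research Institute, Wuhan University, Shenzhen, Guangdong 518057, China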

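Abstract:	When traditional PID control algorithms are applied to nonlinear time-delay systems, parameter tuning and performance optimization are tedious and the control performance is unsatisfactory. To address this problem, a reinforcement-learning-based algorithm for self-tuning and optimizing controller parameters is proposed. The algorithm computes its reward function from dynamic performance indices of the system and learns from the empirical data of periodic step responses, so it can tune and optimize controller parameters online without identifying specific data of the controlled object's model. Taking a water tank level control system as the experimental object, comparative experiments on parameter tuning and optimization are carried out with this algorithm on different types of PID controllers. The experimental results show that, compared with traditional parameter tuning methods, the proposed algorithm removes the tedious manual tuning process, effectively optimizes the controller parameters, reduces the overshoot of the controlled variable, and improves the dynamic response of the controller.
Keywords:	reinforcement learning; tuning; optimization; learning algorithm; time delay; controller; level control; dynamic response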
Parameter self-tuning and optimization algorithm based on reinforcement learning

YAN Jiazheng, ZHUAN Xiangtao. Parameter self-tuning and optimization algorithm based on reinforcement learning[J]. CAAI Transactions on Intelligent Systems, 2022, 17(2): 341-347.

Authors:	YAN Jiazheng, ZHUAN Xiangtao

Affiliation:	1. School of Electrical Engineering and Automation, Wuhan University, Wuhan 430072, China;2. Shenzhen Research Institute, Wuhan University, Shenzhen 518057, China

Abstract:	When the traditional proportional-integral-derivative (PID) control algorithm is applied to nonlinear time-delay systems, parameter tuning and performance optimization are tedious, and the resulting control performance is often unsatisfactory. To address this problem, we propose a self-tuning and optimization algorithm for controller parameters based on reinforcement learning. The algorithm computes its reward function from dynamic performance indices of the system and learns from the empirical data of periodic step responses, so it can tune and optimize the controller parameters online without identifying a model of the controlled object. The algorithm is evaluated in comparative experiments on a water tank level control system with different types of PID controllers. The experimental results show that, compared with traditional parameter tuning methods, the proposed algorithm eliminates the tedious manual tuning process, effectively optimizes the controller parameters, reduces the overshoot of the controlled variable, and improves the dynamic response of the controller.

Keywords:	reinforcement learning; tuning; optimization; learning algorithm; time delay; controller; level control; dynamic response
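
The idea summarized in the abstract (score each periodic step response with a reward built from dynamic performance indices, then adjust the controller gains from that experience alone) can be illustrated with a minimal sketch. This is not the authors' algorithm, which this page does not describe in detail. Everything below is an assumption made for illustration: the first-order-plus-dead-time plant standing in for the water tank, the plant constants, the choice of overshoot and integral absolute error (IAE) as indices, the reward weights, and the learner itself, which is a stateless epsilon-greedy action-value method (a multi-armed bandit) that nudges Kp and Ki and keeps only improving moves.

```python
import random


def step_response(kp, ki, kd, setpoint=1.0, gain=1.0, tau=5.0,
                  delay_steps=10, dt=0.1, n_steps=600):
    """Closed-loop step response of a PID loop on an assumed first-order
    plant with dead time (hypothetical stand-in for the water tank)."""
    y, integral, prev_err = 0.0, 0.0, setpoint
    u_buffer = [0.0] * delay_steps          # control signals still in transit
    ys = []
    for _ in range(n_steps):
        err = setpoint - y
        integral += err * dt
        deriv = (err - prev_err) / dt
        prev_err = err
        u = kp * err + ki * integral + kd * deriv
        u = max(-10.0, min(10.0, u))        # assumed actuator saturation
        u_buffer.append(u)
        u_delayed = u_buffer.pop(0)         # input that reaches the plant now
        y += dt * (gain * u_delayed - y) / tau   # forward-Euler plant update
        ys.append(y)
    return ys


def performance(ys, setpoint=1.0, dt=0.1):
    """Two dynamic performance indices: fractional overshoot and IAE."""
    overshoot = max(0.0, (max(ys) - setpoint) / setpoint)
    iae = sum(abs(setpoint - y) for y in ys) * dt
    return overshoot, iae


def reward(ys, setpoint=1.0, dt=0.1, w_overshoot=10.0, w_iae=1.0):
    """Higher is better; the weights are illustrative assumptions."""
    overshoot, iae = performance(ys, setpoint, dt)
    return -(w_overshoot * overshoot + w_iae * iae)


# Each action multiplies (Kp, Ki) by a small factor.
ACTIONS = [(1.1, 1.0), (0.9, 1.0), (1.0, 1.1), (1.0, 0.9)]


def tune(kp, ki, episodes=200, epsilon=0.2, alpha=0.3, seed=0):
    """Epsilon-greedy bandit: one episode is one simulated step response.
    No plant model is identified; only the measured response is used."""
    rng = random.Random(seed)
    q = [0.0] * len(ACTIONS)                # estimated improvement per action
    current = reward(step_response(kp, ki, 0.0))
    for _ in range(episodes):
        if rng.random() < epsilon:
            a = rng.randrange(len(ACTIONS))                   # explore
        else:
            a = max(range(len(ACTIONS)), key=q.__getitem__)   # exploit
        new_kp, new_ki = kp * ACTIONS[a][0], ki * ACTIONS[a][1]
        r = reward(step_response(new_kp, new_ki, 0.0))
        q[a] += alpha * ((r - current) - q[a])   # running value estimate
        if r > current:                     # keep only improving gains
            kp, ki, current = new_kp, new_ki, r
    return kp, ki, current


if __name__ == "__main__":
    kp, ki, r = tune(kp=1.0, ki=0.1)
    print(f"tuned Kp={kp:.3f}, Ki={ki:.3f}, reward={r:.3f}")
```

Because a candidate gain set is accepted only when its reward improves, the returned reward is never worse than that of the starting gains. On a physical rig, `step_response` would be replaced by a measured response to each periodic setpoint step.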
