
Trajectory control of quadrotor based on reinforcement learning-iterative learning

Cite this article:	Xuguang LIU, Changping DU, Yao ZHENG. Trajectory control of quadrotor based on reinforcement learning-iterative learning[J]. Journal of Computer Applications (计算机应用), 2022, 42(12): 3950-3956.

Authors:	Xuguang LIU (刘旭光), Changping DU (杜昌平), Yao ZHENG (郑耀)

Affiliation:	School of Aeronautics and Astronautics, Zhejiang University, Hangzhou, Zhejiang 310027, China

Abstract:	To further improve the trajectory tracking accuracy of a quadrotor Unmanned Aerial Vehicle (UAV) in an unknown environment, a control method that adds an iterative learning feedforward controller to the traditional feedback control architecture is proposed. To address the difficulty of tuning the learning parameters in Iterative Learning Control (ILC), Reinforcement Learning (RL) is used to tune and optimize the learning parameters of the iterative learning controller. First, RL optimizes the learning parameters and selects those that are optimal for the current environment and task, so that the iterative learning controller achieves its best control performance. Second, the learning ability of the iterative learning controller is used to refine the feedforward input from one iteration to the next until perfect tracking is achieved. Finally, in a simulation environment with random noise, the proposed Reinforcement Learning-Iterative Learning Control (RL-ILC) algorithm is compared with an ILC method without parameter optimization, a Sliding Mode Control (SMC) method, and a Proportional-Integral-Derivative (PID) control method. Experimental results show that after two iterations the total error of the proposed algorithm falls to 0.2% of the initial error, achieving rapid convergence. Moreover, unlike the SMC and PID methods, the RL-ILC algorithm, once converged, does not produce noise-induced trajectory fluctuations. These results indicate that the proposed algorithm can effectively improve the accuracy and robustness of UAV trajectory tracking.

Keywords:	Iterative Learning Control (ILC); reinforcement learning; quadrotor; parameter tuning; trajectory tracking

Received:	2021-10-26
Revised:	2021-12-15
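
The abstract describes an ILC feedforward term layered on a feedback loop, with the learning parameter chosen by a separate tuning step. The following minimal sketch illustrates that structure only; it is not the authors' implementation. It assumes a textbook P-type update law, a hypothetical scalar first-order plant in place of quadrotor dynamics, and made-up values for `a`, `b`, `kp`, and the candidate gains. The paper tunes the learning parameter with RL; the brute-force sweep here is only a placeholder for that step.

```python
import numpy as np


def run_trial(u_ff, ref, a=0.9, b=0.5, kp=0.8):
    """Simulate one trial of a toy plant x[t+1] = a*x[t] + b*u[t].

    The input is proportional feedback plus the ILC feedforward term.
    Returns the tracking error e[t] = ref[t] - x[t] for t = 0..N.
    All plant and controller parameters are assumptions for illustration.
    """
    n = len(u_ff)
    x = np.zeros(n + 1)
    for t in range(n):
        u = kp * (ref[t] - x[t]) + u_ff[t]
        x[t + 1] = a * x[t] + b * u
    return ref - x


def ilc(ref, gain, iterations):
    """P-type ILC (assumed form): u_ff_{k+1}[t] = u_ff_k[t] + gain * e_k[t+1].

    Returns the RMS tracking error recorded at every trial.
    """
    u_ff = np.zeros(len(ref) - 1)
    history = []
    for _ in range(iterations):
        err = run_trial(u_ff, ref)
        history.append(float(np.sqrt(np.mean(err ** 2))))
        # The error one step ahead corrects the input that caused it.
        u_ff = u_ff + gain * err[1:]
    return history


# Hypothetical reference: one period of a sine wave over 101 samples.
ref = np.sin(np.linspace(0.0, 2.0 * np.pi, 101))

# Brute-force placeholder for the RL tuning step: try a few gains and keep
# the one with the lowest RMS error after a fixed number of trials.
candidates = [0.2, 0.5, 1.0, 1.5]
final_rms = {g: ilc(ref, g, iterations=10)[-1] for g in candidates}
best_gain = min(final_rms, key=final_rms.get)

history = ilc(ref, best_gain, iterations=20)
print(f"best gain: {best_gain}, RMS error: {history[0]:.4f} -> {history[-1]:.2e}")
```

The first entry of `history` is the feedback-only error, since the feedforward input starts at zero; later entries show how much the learned feedforward term removes. The sketch is noise-free, so it says nothing about the noise robustness reported in the abstract.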
