Air Target Autonomous Tracking of Multi-rotor UAV Based on Deep Reinforcement Learning
Cite this article: Yang Xinghao, Song Jianmei, She Haoping, Wu Chengjie, Yang Qinning, Fu Weida. Air Target Autonomous Tracking of Multi-rotor UAV Based on Deep Reinforcement Learning[J]. Computer Measurement & Control (计算机测量与控制), 2022, 30(10): 88-94
Authors: Yang Xinghao (杨兴昊), Song Jianmei (宋建梅), She Haoping (佘浩平), Wu Chengjie (吴程杰), Yang Qinning (杨钦宁), Fu Weida (付伟达)
Affiliation (as listed): School of Aerospace Engineering, Beijing Institute of Technology (北京理工大学 宇航学院)
Abstract: To address autonomous target tracking in aerial docking tasks, an end-to-end target tracking method based on deep reinforcement learning is proposed. The method uses the Proximal Policy Optimization (PPO) algorithm, with the Actor network and the Critic network sharing the parameters of their first two layers. Images captured by the UAV are fed to a convolutional neural network, and the policy network controls the motor speeds of the multi-rotor UAV, which makes the tracking pipeline end-to-end. A shaping method is also used to accelerate agent training. The simulation environment is built with the PyBullet physics engine, in which the method is trained and validated. Simulation results show that the method meets the specified target tracking requirements and has good robustness.

Keywords: Deep reinforcement learning; Proximal Policy Optimization; UAV; Target tracking; End-to-end
Received: 2022-03-08
Revised: 2022-04-17
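
The abstract names Proximal Policy Optimization as the training algorithm but gives no formulas. As a rough illustration only, the sketch below computes the standard clipped surrogate objective from the PPO literature in plain Python. The function names, the clip range of 0.2, and the sample numbers are assumptions chosen for illustration; none of them come from the paper, and the shared Actor/Critic layers, the image input, and the shaping term are not modeled here.

```python
import math


def probability_ratio(logp_new, logp_old):
    """Ratio pi_new(a|s) / pi_old(a|s), computed from log-probabilities."""
    return math.exp(logp_new - logp_old)


def ppo_clip_objective(ratios, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to maximize).

    ratios     : per-sample probability ratios between new and old policy
    advantages : per-sample advantage estimates
    clip_eps   : clip range; 0.2 is a common default, assumed here
    """
    if len(ratios) != len(advantages):
        raise ValueError("ratios and advantages must have the same length")
    if not ratios:
        raise ValueError("at least one sample is required")

    total = 0.0
    for ratio, advantage in zip(ratios, advantages):
        # Restrict the ratio to [1 - clip_eps, 1 + clip_eps].
        clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * advantage, clipped_ratio * advantage)
    return total / len(ratios)


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    sample_ratios = [1.5, 0.5, 1.0]
    sample_advantages = [1.0, -1.0, 2.0]
    print(ppo_clip_objective(sample_ratios, sample_advantages))
```

Taking the minimum of the clipped and unclipped terms removes the incentive to move the policy far from the one that collected the data, which is the property usually cited for PPO's training stability. In a setup like the one the abstract describes, the ratios would come from the policy network's action probabilities over motor-speed commands, but that wiring is not specified in the abstract.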