Local path planning for unmanned surface vehicle based on spatial and temporal sensing-enhanced deep Q-network
Cite this article: Zhang Mu, Tang Jun, Yang Youbo, Chen Yu, Lei Yinjie. Local path planning for unmanned surface vehicle based on spatial and temporal sensing-enhanced deep Q-network[J]. Application Research of Computers, 2023, 40(5): 1330-1334
Authors: Zhang Mu, Tang Jun, Yang Youbo, Chen Yu, Lei Yinjie
Affiliation: College of Electronics and Information Engineering, Sichuan University, Chengdu 610065
Funding: National Key Research and Development Program of China (2021YFC3300305)
Abstract: Local path planning for unmanned surface vehicles (USVs) plays an important role in maritime rescue and marine transportation. Existing local path planning algorithms perform well in simple scenarios but poorly when the environment contains complex obstacles and ocean current disturbances. To address this, the paper proposes a reinforcement learning algorithm based on a deep Q-network enhanced with spatial and temporal sensing. First, a multi-scale spatial attention module captures multi-scale spatial information from the distance sensors, improving perception of environments with complex obstacles. Second, a current sensing module based on a long short-term memory (LSTM) network extracts time-series features of the current-disturbed environment, improving perception of current disturbances. In addition, the USV's sensors and motion model are simulated, and the reinforcement learning state space, action space, and a direction-guided reward function are designed, which improves the algorithm's navigation performance and convergence speed. Experiments in complex simulated scenarios show that the proposed algorithm improves on the original algorithm in both navigation success rate and average arrival time, and adapts well to complex environments.

Keywords: local path planning; complex obstacles; ocean current disturbance; deep Q-network; multi-scale spatial attention; reward function
Received: 2022-09-16
Revised: 2023-04-13
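
The abstract names a direction-guided reward function and a deep Q-network but gives no formulas. As a rough illustration only, the Python sketch below assumes one plausible shape for such a reward (a terminal bonus or penalty plus a term that grows as the heading aligns with the bearing to the goal), followed by the standard one-step deep Q-network target. The function names, weights, and thresholds are all hypothetical and are not taken from the paper.

```python
import math


def direction_guided_reward(pos, heading, goal, min_obstacle_dist,
                            goal_radius=1.0, collision_dist=0.5,
                            w_dir=0.1, r_goal=10.0, r_collision=-10.0):
    """Hypothetical reward; every constant here is an assumed placeholder.

    pos, goal: (x, y) positions; heading: vessel heading in radians;
    min_obstacle_dist: smallest reading from the distance sensors.
    """
    dx, dy = goal[0] - pos[0], goal[1] - pos[1]
    # Terminal bonus when the vessel is within goal_radius of the goal.
    if math.hypot(dx, dy) <= goal_radius:
        return r_goal
    # Terminal penalty when an obstacle is closer than collision_dist.
    if min_obstacle_dist <= collision_dist:
        return r_collision
    # Direction guidance: +w_dir when pointing at the goal, -w_dir when
    # pointing directly away; cosine handles angle wrap-around.
    bearing = math.atan2(dy, dx)
    return w_dir * math.cos(bearing - heading)


def td_target(reward, next_q_values, done, gamma=0.99):
    """Standard one-step DQN target: r, or r + gamma * max_a' Q(s', a')."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)
```

In this sketch the direction term is dense, so the agent receives a learning signal at every step rather than only at the goal or on collision, which is one common way a shaped reward can speed up convergence.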
This article is indexed in Wanfang Data and other databases.