Research on T-DQN Intelligent Obstacle Avoidance Algorithm of Unmanned Surface Vehicle
Cite this article: ZHOU Zhi-Guo, YU Si-Yu, YU Jia-Bao, DUAN Jun-Wei, CHEN Long, CHEN Jun-Long. Research on T-DQN Intelligent Obstacle Avoidance Algorithm of Unmanned Surface Vehicle[J]. Acta Automatica Sinica, 2023, 49(8): 1645-1655.
Affiliations: 1. School of Information and Electronics, Beijing Institute of Technology, Beijing 100081; 2. College of Information Science and Technology, Jinan University, Guangzhou 510532; 3. Faculty of Science and Technology, University of Macau, Macau 999078; 4. School of Computer Science and Engineering, South China University of Technology, Guangzhou 510006
Funding: Supported by the "13th Five-Year Plan" Equipment Pre-research Field Fund (61403120109) and the Fundamental Research Funds for the Central Universities of Jinan University (21619412)
Abstract: The unmanned surface vehicle (USV) is an unmanned system with broad application prospects, and its autonomous decision-making capability is particularly critical. Because the water-surface operating environment is open, traditional obstacle avoidance algorithms struggle to plan an optimal route autonomously under quantitative rules, and general reinforcement learning methods converge slowly in large, complex environments. To address these problems, a threshold-based deep Q network obstacle avoidance algorithm (threshold deep Q network, T-DQN) is proposed: a long short-term memory (LSTM) network is added to the deep Q network (DQN) to retain training information, and a threshold is set on the experience replay pool to accelerate convergence. Simulation experiments in grid environments of different scales show that T-DQN converges to the optimal path quickly, reducing the overall number of convergence steps by 69.1% and 24.8% compared with Q-learning and DQN, respectively; the introduced threshold screening mechanism alone reduces the overall convergence steps by 41.1%. Obstacle avoidance in complex map scenarios was also verified on the Unity 3D reinforcement learning simulation platform, and the results show that the algorithm enables fine-grained obstacle avoidance and intelligent, safe navigation of the USV.

Keywords: unmanned surface vehicle    reinforcement learning    intelligent obstacle avoidance    deep Q network
Received: 2021-01-25

Research on T-DQN Intelligent Obstacle Avoidance Algorithm of Unmanned Surface Vehicle
ZHOU Zhi-Guo, YU Si-Yu, YU Jia-Bao, DUAN Jun-Wei, CHEN Long, CHEN Jun-Long. Research on T-DQN Intelligent Obstacle Avoidance Algorithm of Unmanned Surface Vehicle[J]. Acta Automatica Sinica, 2023, 49(8): 1645-1655.
Authors:ZHOU Zhi-Guo  YU Si-Yu  YU Jia-Bao  DUAN Jun-Wei  CHEN Long  CHEN Jun-Long
Affiliation: 1. School of Information and Electronics, Beijing Institute of Technology, Beijing 100081; 2. College of Information Science and Technology, Jinan University, Guangzhou 510532; 3. Faculty of Science and Technology, University of Macau, Macau 999078; 4. School of Computer Science and Engineering, South China University of Technology, Guangzhou 510006
Abstract: The unmanned surface vehicle (USV) is a kind of unmanned system with wide application prospects, and its autonomous decision-making ability is particularly important. Because the water-surface motion environment is open, traditional obstacle avoidance algorithms find it difficult to plan an optimal route autonomously under quantitative rules, while general reinforcement learning methods converge slowly in large, complex environments. To solve these problems, we propose a threshold deep Q network (T-DQN) algorithm, which adds a long short-term memory (LSTM) network on the basis of the deep Q network (DQN) to retain training information and sets a threshold on the experience replay pool to accelerate convergence. We conducted simulation experiments in grids of different sizes, and the results show that T-DQN converges to the optimal path quickly: compared with Q-learning and DQN, the number of convergence steps is reduced by 69.1% and 24.8%, respectively, and the threshold mechanism alone reduces the overall convergence steps by 41.1%. We also verified the algorithm on the Unity 3D reinforcement learning simulation platform to investigate the completion of obstacle avoidance tasks under complex maps; the experimental results show that the algorithm achieves fine-grained obstacle avoidance and intelligent, safe navigation.
Keywords:Unmanned surface vehicle (USV)  reinforcement learning  intelligent obstacle avoidance  deep Q network (DQN)
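The threshold-gated experience replay described in the abstract can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the class name, parameter names, and the exact gating rule (train only once the replay pool holds at least `threshold` transitions) are assumptions about how the paper's threshold mechanism works.

```python
import random
from collections import deque


class ThresholdReplayBuffer:
    """Hypothetical sketch of T-DQN's threshold on the experience
    replay pool: training batches are drawn only after the buffer
    holds at least `threshold` transitions, so sparse early
    experience does not drive gradient updates prematurely."""

    def __init__(self, capacity=10000, threshold=500):
        self.buffer = deque(maxlen=capacity)  # oldest transitions evicted first
        self.threshold = threshold

    def push(self, state, action, reward, next_state, done):
        """Store one (s, a, r, s', done) transition."""
        self.buffer.append((state, action, reward, next_state, done))

    def ready(self):
        """Gate: sampling is allowed only past the threshold."""
        return len(self.buffer) >= self.threshold

    def sample(self, batch_size):
        """Return a random minibatch, or None while below the threshold."""
        if not self.ready():
            return None
        return random.sample(self.buffer, batch_size)
```

In a full T-DQN agent, the Q-network (DQN plus an LSTM layer over recent observations, per the abstract) would call `sample()` each step and simply skip the update whenever it returns `None`.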