3D Path Planning Algorithm Based on Deep Reinforcement Learning
Cite this article: HUANG Dongjin, JIANG Chenfeng, HAN Kaili. 3D Path Planning Algorithm Based on Deep Reinforcement Learning[J]. Computer Engineering and Applications, 2020, 56(15): 30-36. DOI: 10.3778/j.issn.1002-8331.2001-0347
Authors: HUANG Dongjin, JIANG Chenfeng, HAN Kaili
Affiliation: 1. Shanghai Film Academy, Shanghai University, Shanghai 200072, China; 2. Shanghai Engineering Research Center of Motion Picture Special Effects, Shanghai 200072, China
Funding: Natural Science Foundation of Shanghai; National Natural Science Foundation of China; Shanghai University Peak Discipline Project in Film Studies
Abstract: Choosing a reasonable route is a difficult problem in 3D path planning for agents. Existing path planning methods adapt poorly to unknown terrain and support only a single form of obstacle avoidance. To address these problems, a 3D path planning algorithm for agents based on LSTM-PPO is proposed. Virtual rays are used to probe the simulation environment, and the collected state space and action states are fed into a Long Short-Term Memory (LSTM) network. With an additional reward-and-penalty function and a curiosity-driven mechanism (an intrinsic curiosity module), the agent learns to jump over low obstacles and to steer around large ones. The clipped surrogate objective of the Proximal Policy Optimization (PPO) algorithm is used to keep the magnitude of each policy update within a suitable range. Experimental results show that the algorithm is feasible, selects routes more intelligently and reasonably, and adapts well to unknown environments containing diverse obstacles.
Keywords: deep reinforcement learning; Proximal Policy Optimization (PPO) algorithm; path planning; complex unknown environment
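
The abstract refers to PPO's clipped surrogate objective without stating its form. As a rough illustration only, the sketch below computes the standard clipped objective for a batch of probability ratios and advantage estimates. The function name `clipped_surrogate`, the plain-list inputs, and the default `epsilon = 0.2` are assumptions made for this example. They are not taken from the paper and do not represent the authors' implementation.

```python
from typing import Sequence


def clipped_surrogate(
    ratios: Sequence[float],
    advantages: Sequence[float],
    epsilon: float = 0.2,  # assumed clip range; the abstract does not state one
) -> float:
    """Batch mean of min(r * A, clip(r, 1 - epsilon, 1 + epsilon) * A).

    ratios:     new-policy / old-policy probability of each sampled action
    advantages: advantage estimate for each sampled action
    """
    if len(ratios) == 0 or len(ratios) != len(advantages):
        raise ValueError("ratios and advantages must be non-empty and equal in length")

    total = 0.0
    for r, a in zip(ratios, advantages):
        # Restrict the ratio to [1 - epsilon, 1 + epsilon].
        clipped_r = max(1.0 - epsilon, min(r, 1.0 + epsilon))
        # Taking the smaller term removes any gain from moving the ratio
        # beyond the clip range, which bounds the size of the policy update.
        total += min(r * a, clipped_r * a)
    return total / len(ratios)
```

With a positive advantage, a ratio above `1 + epsilon` earns no extra objective value, and with a negative advantage the same holds for a ratio below `1 - epsilon`. This is the sense in which the clipping limits how far a single update can move the policy.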
This article is indexed in Wanfang Data and other databases.