Robot path planning based on soft AC algorithm for multilayer attention mechanisms

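Cite this article:	Han Jinliang, Ren Haijing, Wu Songwei, Jiang Xinxin, Liu Fengkai. Robot path planning based on soft AC algorithm for multilayer attention mechanisms[J]. Application Research of Computers, 2020, 37(12): 3650-3655.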

Authors:	Han Jinliang, Ren Haijing, Wu Songwei, Jiang Xinxin, Liu Fengkai

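Affiliations:	School of Mathematics, China University of Mining and Technology, Xuzhou, Jiangsu 221116; School of Environment Science and Spatial Informatics, China University of Mining and Technology, Xuzhou, Jiangsu 221116; School of Safety Engineering, China University of Mining and Technology, Xuzhou, Jiangsu 221116; School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, Jiangsu 221116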

Funding:	National Natural Science Foundation of China; Innovation Training Program

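Abstract:	The actor-critic (AC) algorithm suffers from high-dimensional experience samples and low robustness of the policy gradient model. To address these problems, this paper draws on the information-cooperation advantages of multi-agent systems, constructs attention mechanism networks that serve as agents, and improves the AC algorithm with a multi-layer parallel attention mechanism network model, yielding a soft AC algorithm based on a multi-layer parallel attention mechanism. Applied to robot path planning in dynamic, unknown environments, the algorithm strengthens the robustness of the actor's policy gradient, reduces the critic's regression error, and achieves fast convergence to the optimal path planning solution. Experimental results show that the algorithm effectively overcomes local optima in robot path planning and offers fast computation and stable convergence.
Keywords:	actor-critic algorithm; attention mechanism; deep reinforcement learning; robot path planning
Received:	2019/9/3
Revised:	2020/11/3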
Robot path planning based on soft AC algorithm for multilayer attention mechanisms

Han Jinliang, Ren Haijing, Wu Songwei, Jiang Xinxin, Liu Fengkai. Robot path planning based on soft AC algorithm for multilayer attention mechanisms[J]. Application Research of Computers, 2020, 37(12): 3650-3655.

Authors:	Han Jinliang, Ren Haijing, Wu Songwei, Jiang Xinxin, Liu Fengkai

Affiliation:	School of Mathematics, China University of Mining and Technology

Abstract:	To address the high dimensionality of experience samples and the low robustness of the policy gradient model in the actor-critic (AC) algorithm, this paper draws on the information-cooperation advantages of multi-agent systems: it constructs an attention mechanism network that acts as an agent and introduces a multi-layer parallel attention mechanism. By adding this network model and the soft function to the actor-critic algorithm, the paper proposes a soft actor-critic algorithm based on a multi-layer parallel attention mechanism for robot path planning. The algorithm enhances the robustness of the actor's policy gradient, reduces the critic's regression error, and achieves fast convergence in robot path planning. Experimental results show that the method effectively overcomes the local-optimum problem in robot path planning and offers fast computation and stable convergence.

Keywords:	actor-critic algorithm; attention mechanisms network; deep reinforcement learning; robot path planning
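
The abstract names two ingredients, parallel attention networks and a "soft" actor-critic, without giving their details. As a rough illustration only, the Python sketch below shows generic scaled dot-product attention run over several queries in parallel, and the entropy-regularized state value commonly used in soft actor-critic methods. Every function name, the averaging of head outputs, the Boltzmann policy, and the temperature `alpha` are assumptions made for this sketch; none of them is taken from the paper.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]

def parallel_attention(queries, keys, values):
    """One attention head per query; outputs are averaged (an assumption)."""
    outputs = [attention(q, keys, values) for q in queries]
    n = len(outputs)
    return [sum(o[i] for o in outputs) / n for i in range(len(outputs[0]))]

def soft_value(q_values, alpha):
    """Entropy-regularized value under the assumed policy pi(a) ~ exp(Q(a)/alpha)."""
    policy = softmax([q / alpha for q in q_values])
    return sum(p * (q - alpha * math.log(p)) for p, q in zip(policy, q_values))

# Hypothetical usage: two heads attend over three observed feature vectors,
# then a soft value is computed from made-up Q-values for four grid moves.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
heads = [[1.0, 0.0], [0.0, 1.0]]
context = parallel_attention(heads, features, features)
value = soft_value([0.2, 0.5, -0.1, 0.0], alpha=0.5)
```

Under the Boltzmann policy assumed here, `soft_value` equals `alpha * log(sum(exp(Q / alpha)))`, so it is never below the largest Q-value. The entropy bonus is what distinguishes the "soft" value from the plain maximum.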
This article is indexed in Wanfang Data and other databases.