Path planning of manipulator based on deep reinforcement learning and screw method [indexed in: EI; PKU Core (北大核心); CSCD]

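Cite this article:	WANG Yin, WANG Yong-hua, YIN Ze-zhong, WAN Pin. Path planning of manipulator based on deep reinforcement learning and screw method [J]. Control Theory & Applications, 2023, 40(3): 516-524.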

Authors:	WANG Yin, WANG Yong-hua, YIN Ze-zhong, WAN Pin

Affiliation:	Faculty of Automation, Guangdong University of Technology (all four authors)

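Funding:	Supported by the National Natural Science Foundation of China (61971147) and the Guangdong Province Graduate Education Innovation Program (2020JGXM040).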

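Abstract:	Applying deep reinforcement learning to manipulator path planning still suffers from a large sample requirement and a high cost of acquiring samples. To address these problems, this paper follows the idea of data augmentation and proposes an algorithm that fuses deep reinforcement learning with the screw method. The algorithm uses the screw method to copy the natural trajectories obtained from interaction with the environment, which raises both the sample utilization of deep reinforcement learning and the training efficiency of the algorithm. While the trajectories are copied, environmental elements such as the controlled object and the obstacles are copied synchronously, which improves the generalization performance of the manipulator in unstructured environments. Finally, comparative experiments are carried out with the Fetch and UR5 manipulators in unstructured environments on the MuJoCo simulation platform, which provides a physics engine. The results indicate that the proposed algorithm is feasible and effective for improving the sample utilization of deep reinforcement learning and the generalization performance of the manipulator model.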
Keywords:	reinforcement learning; manipulator; screw method; data augmentation
Received:	2021/9/14
Revised:	2023/2/22
Path planning of manipulator based on deep reinforcement learning and screw method

WANG Yin, WANG Yong-hua, YIN Ze-zhong, WAN Pin. Path planning of manipulator based on deep reinforcement learning and screw method [J]. Control Theory & Applications, 2023, 40(3): 516-524.

Authors:	WANG Yin, WANG Yong-hua, YIN Ze-zhong, WAN Pin

Affiliation:	Faculty of Automation, Guangdong University of Technology (all four authors)

Abstract:	The application of deep reinforcement learning to manipulator path planning still suffers from high sample demand and high sample-acquisition cost. To address these problems, this paper proposes an algorithm that fuses deep reinforcement learning with the screw method, based on the idea of data augmentation. The algorithm uses the screw method to copy the natural trajectories obtained from interaction with the environment, which improves the sample utilization of deep reinforcement learning and the training efficiency of the algorithm. Environmental elements such as the controlled object and the obstacles are copied synchronously with the trajectories, which improves the generalization performance of the manipulator in unstructured environments. Finally, comparative experiments with the Fetch and UR5 manipulators are carried out in unstructured environments on the MuJoCo simulation platform, which provides a physics engine. The results show that the proposed algorithm is feasible and effective in improving both the sample utilization of deep reinforcement learning and the generalization performance of the manipulator model.

Keywords:	reinforcement learning; manipulator; screw method; data augmentation
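
The trajectory-copying idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that "copying" means applying one rigid screw motion (a rotation about an axis plus an optional translation along it) to every recorded position in an episode, and that the axis is the vertical axis through the manipulator base. The function names `screw_transform` and `augment_episode` and the episode layout (a dictionary of named point lists) are hypothetical. Only Cartesian positions are handled; a real replay buffer would also need actions, velocities, and rewards to be transformed consistently.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Dot product of two 3-vectors given as tuples."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def screw_transform(point, axis, axis_point, theta, pitch=0.0):
    """Apply a screw motion to `point`.

    The point is rotated by `theta` about the line through `axis_point`
    with unit direction `axis` (Rodrigues' formula), then shifted by
    pitch * theta along the axis.
    """
    rel = tuple(p - q for p, q in zip(point, axis_point))
    c, s = math.cos(theta), math.sin(theta)
    w_x_rel = cross(axis, rel)
    w_dot_rel = dot(axis, rel)
    rotated = tuple(
        rel[i] * c + w_x_rel[i] * s + axis[i] * w_dot_rel * (1.0 - c)
        for i in range(3)
    )
    return tuple(
        rotated[i] + axis_point[i] + pitch * theta * axis[i]
        for i in range(3)
    )


def augment_episode(episode, n_copies,
                    axis=(0.0, 0.0, 1.0), axis_point=(0.0, 0.0, 0.0)):
    """Return `n_copies` rotated duplicates of an episode.

    `episode` maps a name (hypothetical keys such as "trajectory",
    "object", "obstacle") to a list of 3D points. Every element is
    moved by the same screw motion, so the relative geometry between
    the end effector, the object, and the obstacles is preserved.
    """
    copies = []
    for k in range(1, n_copies + 1):
        # Spread the copies evenly around the axis (an assumed choice).
        theta = 2.0 * math.pi * k / (n_copies + 1)
        copies.append({
            name: [screw_transform(p, axis, axis_point, theta) for p in pts]
            for name, pts in episode.items()
        })
    return copies


if __name__ == "__main__":
    episode = {
        "trajectory": [(0.8, 0.0, 0.5), (0.9, 0.1, 0.5)],
        "object": [(1.0, 0.2, 0.4)],
        "obstacle": [(0.9, 0.0, 0.6)],
    }
    for copy in augment_episode(episode, n_copies=3):
        print(copy["trajectory"][0], copy["obstacle"][0])
```

Because the same rigid motion is applied to the trajectory and to the environment elements, distances such as end effector to obstacle are unchanged in each copy, which is what makes a copied episode a plausible extra sample in this sketch.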
This article is indexed in VIP (维普) and other databases.