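Microassembly Task Planning Method Based on Multi-Agent Reinforcement Learning
Cite this article: Xu Xinghui, Tang Dalin, Gu Shuhao, Zuo Jiaqi, Wang Xiaodong, Ren Tongqun. Microassembly task planning method based on multi-agent reinforcement learning [J]. Computer Measurement & Control, 2023, 31(8): 217-223.
Authors: Xu Xinghui, Tang Dalin, Gu Shuhao, Zuo Jiaqi, Wang Xiaodong, Ren Tongqun
Affiliations: Dalian University of Technology; School of Mechanical Engineering, Dalian University of Technology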
Funding: National Key R&D Program of China (No. 2019YFB1310901); Liaoning Province "Xingliao Talents Program" (No. XLYC2002020); Liaoning Provincial Natural Science Foundation (No. 2020-MS-104)
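Abstract: Existing assembly task planning is mostly done manually, which is inefficient, costly, and prone to operator error. To address this, the task characteristics of micro-assembly operations are analyzed, together with a detailed analysis of the cooperative and competitive relationships among multiple manipulator arms in micro-assembly, and methods are proposed for constructing an action space, a state space, and a reward function for multi-agent reinforcement learning that fit these characteristics. A simulation model is built in the CoppeliaSim simulation software, with physical modeling of the existing equipment. A learning model based on the multi-agent deep deterministic policy gradient algorithm is constructed and trained, and the designed state space, action space, and reward function are each verified experimentally in the simulation environment, ultimately yielding a stable path and a complete task execution scheme. The simulation results show that the proposed environment construction method is better suited to micro-assembly tasks built mainly around Cartesian-coordinate motion, overcomes the shortcomings of existing planning methods, achieves multi-arm cooperative operation that can be engineered in practice, and improves task efficiency and the degree of planning automation.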

Keywords: multi-agent reinforcement learning; reward function; micro-assembly; task planning; simulation environment construction
Received: 2023-03-09
Revised: 2023-03-14

Microassembly Task Planning Method Based on Multi-Agent Reinforcement Learning
Abstract: Existing task planning is mostly manual, which is inefficient, costly, and prone to operator error. To address this, the characteristics of micro-assembly operation tasks and the cooperative and competitive relationships among the manipulator arms are analyzed in detail, and a method is proposed for constructing the action space, state space, and reward function in multi-agent reinforcement learning so that they fit the characteristics of micro-assembly tasks. The existing equipment is physically modeled in the CoppeliaSim simulation software, and a learning model based on the multi-agent deep deterministic policy gradient algorithm is built and trained. The designed action space, state space, and reward function are then verified experimentally in the simulation environment, and a stable path and a complete task implementation scheme are obtained. The simulation results show that the proposed method is better suited to micro-assembly tasks that use Cartesian-coordinate motion as their main framework and that it overcomes the shortcomings of existing planning methods. The method also achieves multi-arm collaborative operation that can be engineered in practice, improving task efficiency and the degree of planning automation.
Keywords: multi-agent reinforcement learning; reward function; micro-assembly; task planning; simulation environment construction