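Cooperative Navigation Application Based on Multi-agent Deep Reinforcement Learning
Cite this article: MA Pei-Xin, CHENG Yu, HOU Jian, FAN Qing-Lai. Cooperative Navigation Application Based on Multi-agent Deep Reinforcement Learning[J]. Computer Systems & Applications, 2023, 32(8): 95-104.
Authors: MA Pei-Xin  CHENG Yu  HOU Jian  FAN Qing-Lai
Affiliation: School of Computer Science and Technology, Zhejiang Sci-Tech University, Hangzhou 310018, China; Zhejiang Petroleum Comprehensive Energy Sales Co. Ltd., Hangzhou 310012, China
Funding: 2022 Stable-Support Research Project of the State Administration of Science, Technology and Industry for National Defense, Science and Technology on Space Intelligent Control Laboratory (HTKJ2022KL502016)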
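Abstract: Multi-robot cooperative navigation is widely used in search and rescue, logistics, and other fields; cooperation strategy and target navigation are its main challenges. To improve the cooperative navigation ability of multiple mobile robots in unknown environments, this paper proposes a new hierarchical control cooperative navigation (HCCN) strategy. A high-level target decision layer and a low-level target navigation layer assign a target point to each robot, and global and local path planning algorithms guide each agent to its assigned target point without collision. Experiments on the Gazebo platform show that the proposed method effectively addresses the sparse reward problem in cooperative navigation, improves training speed by at least 16.6%, and is more robust across different environment scenarios. The work is intended to provide theoretical guidance for further research on multi-robot cooperative navigation and to support application in more real-world scenarios.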

Keywords: multi-robot systems | cooperative navigation | unknown environment | multi-agent deep reinforcement learning | curriculum learning
Received: 2023/1/18
Revised: 2023/2/23

Cooperative Navigation Application Based on Multi-agent Deep Reinforcement Learning
MA Pei-Xin, CHENG Yu, HOU Jian, FAN Qing-Lai. Cooperative Navigation Application Based on Multi-agent Deep Reinforcement Learning[J]. Computer Systems & Applications, 2023, 32(8): 95-104.
Authors:MA Pei-Xin  CHENG Yu  HOU Jian  FAN Qing-Lai
Affiliation:School of Computer Science and Technology, Zhejiang Sci-Tech University, Hangzhou 310018, China; Zhejiang Petroleum Comprehensive Energy Sales Co. Ltd., Hangzhou 310012, China
Abstract: Multi-robot cooperative navigation is widely used in search and rescue, logistics, and other fields. Cooperation strategy and target navigation are the main challenges it faces. To improve the cooperative navigation ability of multiple mobile robots in an unknown environment, this study proposes a new hierarchical control cooperative navigation (HCCN) strategy. A high-level target decision layer and a low-level target navigation layer assign a target point to each robot, and global and local path planning algorithms guide each agent to its assigned target point without collision. Experimental verification is carried out on the Gazebo platform. The results show that the proposed method effectively addresses the sparse reward problem in cooperative navigation and improves the training speed by at least 16.6%. The method is also more robust across different scenarios. The study is expected to provide theoretical guidance for further research on multi-robot cooperative navigation and to be applicable to more real-world scenarios.
Keywords: multi-robot systems | cooperative navigation | unknown environment | multi-agent deep reinforcement learning | curriculum learning