
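Event-triggered reinforcement learning for multi-agent formation control
Cite this article: XU Peng (1), XIE Guangming (1,2,3), WEN Jiayan (1,2), GAO Yuan (1). Event-triggered reinforcement learning for multi-agent formation control[J]. CAAI Transactions on Intelligent Systems, 2019, 14(1): 93-98. DOI: 10.11992/tis.201807010
Authors: XU Peng (1), XIE Guangming (1,2,3), WEN Jiayan (1,2), GAO Yuan (1)
Affiliations: 1. School of Electric and Information Engineering, Guangxi University of Science and Technology, Liuzhou, Guangxi 545006; 2. College of Engineering, Peking University, Beijing 100871; 3. Institute of Ocean Research, Peking University, Beijing 100871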
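Abstract: To address the heavy consumption of communication and computing resources in multi-agent formation based on classical reinforcement learning, this paper introduces an event-triggered control mechanism: an agent's action decisions need not be made at a fixed period, and its action is instead updated according to an event-triggered condition. The design of this condition considers not only the agent's cumulative reward but also the deviation between the agent's reward and those of its neighbors, and the agents interact to seek an optimal joint policy that achieves the formation. Numerical simulations show that the event-triggered reinforcement learning algorithm for multi-agent formation control effectively reduces the agents' action-decision frequency and resource consumption while preserving system performance.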

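Keywords: reinforcement learning; multi-agent; event-triggered; formation control; Markov process; swarm intelligence; action decision; particle swarm algorithm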

Event-triggered reinforcement learning formation control for multi-agent
XU Peng (1), XIE Guangming (1,2,3), WEN Jiayan (1,2), GAO Yuan (1). Event-triggered reinforcement learning formation control for multi-agent[J]. CAAI Transactions on Intelligent Systems, 2019, 14(1): 93-98. DOI: 10.11992/tis.201807010
Authors: XU Peng (1), XIE Guangming (1,2,3), WEN Jiayan (1,2), GAO Yuan (1)
Affiliation: 1. School of Electric and Information Engineering, Guangxi University of Science and Technology, Liuzhou 545006, China; 2. College of Engineering, Peking University, Beijing 100871, China; 3. Institute of Ocean Research, Peking University, Beijing 100871, China
Abstract: Classical reinforcement learning for multi-agent formation consumes substantial communication and computing resources. This paper introduces an event-triggered control mechanism so that agents need not make action decisions at a fixed period; instead, each agent updates its action only when an event-triggered condition is met. The condition is designed around both an agent's cumulative reward and the deviation between its reward and those of its neighbors, and the agents interact to seek an optimal joint policy that achieves the formation. Numerical simulation results show that the event-triggered reinforcement learning formation control algorithm effectively reduces the frequency of action decisions and the consumption of resources while maintaining system performance.
Keywords: reinforcement learning; multi-agent; event-triggered; formation control; Markov decision processes; swarm intelligence; action-decisions; particle swarm optimization
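
The event-triggered idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not give the exact trigger condition, so the class name `EventTriggeredAgent`, the two thresholds, the use of the neighbors' mean reward, and the `policy` callable are all hypothetical choices made for illustration. The sketch only shows the general pattern: an agent keeps its last action and queries its policy again when its cumulative reward has drifted since the last decision or deviates too far from its neighbors' rewards.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class EventTriggeredAgent:
    """Keeps its last action until a trigger condition fires (illustrative only)."""
    policy: Callable[[int], int]   # hypothetical: maps a state index to an action index
    reward_threshold: float        # hypothetical bound on drift of the agent's own reward
    neighbor_threshold: float      # hypothetical bound on deviation from neighbors
    last_action: Optional[int] = None
    reward_at_last_decision: float = 0.0
    cumulative_reward: float = 0.0
    decisions: int = 0             # number of times the policy was actually queried

    def triggered(self, neighbor_rewards: Sequence[float]) -> bool:
        # Always decide once at the start, when no action has been chosen yet.
        if self.last_action is None:
            return True
        # Assumed condition 1: own cumulative reward drifted since the last decision.
        if abs(self.cumulative_reward - self.reward_at_last_decision) > self.reward_threshold:
            return True
        # Assumed condition 2: own reward deviates from the neighbors' mean reward.
        if neighbor_rewards:
            mean = sum(neighbor_rewards) / len(neighbor_rewards)
            if abs(self.cumulative_reward - mean) > self.neighbor_threshold:
                return True
        return False

    def step(self, state: int, reward: float, neighbor_rewards: Sequence[float]) -> int:
        """Accumulate the reward, then re-decide only if the trigger fires."""
        self.cumulative_reward += reward
        if self.triggered(neighbor_rewards):
            self.last_action = self.policy(state)
            self.reward_at_last_decision = self.cumulative_reward
            self.decisions += 1
        return self.last_action
```

Comparing `decisions` with the number of calls to `step` gives a rough measure of how many action decisions the trigger saves relative to deciding at every step.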