
A survey of multi-agent reinforcement learning methods (多智能体强化学习方法综述)
Cite this article: CHEN Renlong, CHEN Jiali, LI Shanqi, TAN Ying. A survey of multi-agent reinforcement learning methods[J]. Information Countermeasure Technology (信息对抗技术), 2024, 0(1): 18-32
Authors: CHEN Renlong (陈人龙), CHEN Jiali (陈嘉礼), LI Shanqi (李善琦), TAN Ying (谭营)
Affiliation: Key Laboratory of Machine Perception (MOE), Peking University, Beijing 100871, China; School of Intelligence Science and Technology, Peking University, Beijing 100871, China; Key Laboratory of Machine Perception (MOE), Peking University, Beijing 100871, China; School of Intelligence Science and Technology, Peking University, Beijing 100871, China; Institute for Artificial Intelligence, Peking University, Beijing 100871, China; National Key Laboratory of General Artificial Intelligence, Peking University, Beijing 100871, China
Funding: National Key Research and Development Program of China (2018AAA0102301); National Natural Science Foundation of China (62250037, 62276008, 62076010)
Abstract: Multi-agent reinforcement learning has shown strong potential for sequential decision-making problems in real-world scenarios such as autonomous driving and team-based cooperative games. However, it faces challenges including the curse of dimensionality, instability, multi-objectivity, and partial observability. This article gives an overview of the concepts and methods of multi-agent reinforcement learning and summarizes the main trends and research directions in current work. The research trends comprise the CTDE paradigm, agents equipped with recurrent neural units, and training techniques. The main research directions cover hybrid learning methods, cooperative and competitive learning, communication and knowledge sharing, adaptability and robustness, hierarchical and modular learning, game-theoretic approaches, and interpretability. Future research directions include addressing the curse of dimensionality, solving large-scale combinatorial optimization problems, and analyzing the global convergence of multi-agent reinforcement learning algorithms. Progress in these directions should drive further breakthroughs in the practical application of multi-agent reinforcement learning.

Keywords: multi-agent reinforcement learning; reinforcement learning; multi-agent system; swarm collaboration; curse of dimensionality
Received: 2023-02-22
Revised: 2023-05-04