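Important Scientific Problems of Multi-Agent Deep Reinforcement Learning
Cite this article: Sun Changyin, Mu Chaoxu. Important scientific problems of multi-agent deep reinforcement learning[J]. Acta Automatica Sinica, 2020, 46(7): 1301-1312.
Authors: Sun Changyin, Mu Chaoxu
Affiliation: 1. School of Automation, Southeast University, Nanjing 210096
Funding: Supported by the Ministry of Science and Technology Major Project on Artificial Intelligence (2018AAA0101400), the Innovative Research Groups program of the National Natural Science Foundation of China (61921004), and the National Natural Science Foundation of China (61942301)
Abstract: Reinforcement learning has been used for decades to solve model-free sequential decision problems, but it often faces great challenges when dealing with high-dimensional variables. In recent years, the rapid development of deep learning has made it possible for reinforcement learning to provide optimized decision policies for complex, high-dimensional multi-agent systems and to carry out target tasks efficiently in challenging environments. This paper reviews the principles of reinforcement learning and deep reinforcement learning, proposes a closed-loop control framework for learning systems, and analyzes several important problems in multi-agent deep reinforcement learning together with their solutions, including the algorithmic structure of multi-agent reinforcement learning, environment non-stationarity, and partial observability. The strengths and weaknesses of the surveyed methods and their related applications are analyzed and discussed. Finally, the paper outlines future research directions for multi-agent deep reinforcement learning, offering some ideas for developing more powerful and more easily applied multi-agent reinforcement learning control systems.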

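Keywords: reinforcement learning; deep reinforcement learning; multi-agent; learning systems; intelligent control; decision optimization
Received: 2020-03-25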

Important Scientific Problems of Multi-Agent Deep Reinforcement Learning
Affiliation: 1. School of Automation, Southeast University, Nanjing 210096; 2. School of Electrical and Information Engineering, Tianjin University, Tianjin 300072
Abstract: Reinforcement learning has been used for decades to solve model-free sequential decision problems. However, it often faces great challenges when dealing with high-dimensional problems. In recent years, the rapid development of deep learning has enabled reinforcement learning to provide optimized strategies for complex, high-dimensional multi-agent systems, so that target tasks can be performed efficiently in challenging environments. This paper reviews the principles of reinforcement learning and deep reinforcement learning, puts forward a closed-loop control framework for learning systems, and investigates important open problems and corresponding methods in deep reinforcement learning for multi-agent systems, including the algorithmic framework of multi-agent reinforcement learning, non-stationary environments, and partial observability. The merits and drawbacks of the investigated methods are analyzed, and some related applications are summarized. The paper also offers insights into future research directions for multi-agent reinforcement learning and related ideas for developing more capable and more easily applied systems.