Similar Documents
 10 similar documents found (search time: 140 ms)
1.
Cooperative Multi-Agent Learning: The State of the Art  (Cited by 5 in total: 4 self-citations, 1 by others)
Cooperative multi-agent systems (MAS) are ones in which several agents attempt, through their interaction, to jointly solve tasks or to maximize utility. Due to the interactions among the agents, multi-agent problem complexity can rise rapidly with the number of agents or their behavioral sophistication. The challenge this presents to the task of programming solutions to MAS problems has spawned increasing interest in machine learning techniques to automate the search and optimization process. We provide a broad survey of the cooperative multi-agent learning literature. Previous surveys of this area have largely focused on issues common to specific subareas (for example, reinforcement learning (RL) or robotics). In this survey we attempt to draw from multi-agent learning work in a spectrum of areas, including RL, evolutionary computation, game theory, complex systems, agent modeling, and robotics. We find that this broad view leads to a division of the work into two categories, each with its own special issues: applying a single learner to discover joint solutions to multi-agent problems (team learning), or using multiple simultaneous learners, often one per agent (concurrent learning). Additionally, we discuss direct and indirect communication in connection with learning, plus open issues in task decomposition, scalability, and adaptive dynamics. We conclude with a presentation of multi-agent learning problem domains and a list of multi-agent learning resources.
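The survey's central division can be made concrete with a minimal Python sketch, assuming a toy team-reward task; every name below is illustrative and stands in for no particular algorithm from the survey:

```python
import random

N_AGENTS, N_ACTIONS = 3, 4

def team_learning(team_reward, iters=1000):
    """A single learner searches the joint behavior space directly."""
    best, best_score = None, float("-inf")
    for _ in range(iters):
        joint = tuple(random.randrange(N_ACTIONS) for _ in range(N_AGENTS))
        score = team_reward(joint)
        if score > best_score:
            best, best_score = joint, score
    return best

def concurrent_learning(team_reward, iters=1000):
    """One learner per agent; each adapts while the others are adapting too."""
    joint = [random.randrange(N_ACTIONS) for _ in range(N_AGENTS)]
    for _ in range(iters):
        i = random.randrange(N_AGENTS)          # some agent tries a new behavior
        candidate = joint[:]
        candidate[i] = random.randrange(N_ACTIONS)
        if team_reward(tuple(candidate)) >= team_reward(tuple(joint)):
            joint = candidate                   # keep it if the team does no worse
    return tuple(joint)

# toy team reward: the team scores only when all agents pick the same action
reward = lambda joint: 1.0 if len(set(joint)) == 1 else 0.0
print(team_learning(reward), concurrent_learning(reward))
```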

2.
Abstract

Multi-agent systems need to communicate to coordinate a shared task. We show that a recurrent neural network (RNN) can learn a communication protocol for coordination, even if the actions to coordinate are performed several steps after the communication phase. We show that a separation of tasks with different temporal scales is necessary for successful learning. We contribute a hierarchical deep reinforcement learning model for multi-agent systems that separates the communication and coordination task from action selection through a hierarchical policy. We further show that a separation of concerns in communication is beneficial but not necessary. As a testbed, we propose the Dungeon Lever Game, and we extend the Differentiable Inter-Agent Learning (DIAL) framework. We present and compare results from different model variations on the Dungeon Lever Game.
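A minimal sketch of the kind of hierarchical separation the abstract describes, assuming a two-phase episode (communicate first, act several steps later); the module names, sizes, and the Dungeon Lever Game interface are assumptions, not the paper's architecture:

```python
import torch
import torch.nn as nn

class HierarchicalCommAgent(nn.Module):
    """Sketch: a high-level RNN handles the communication phase; a separate
    low-level head picks actions later, conditioned on the RNN's summary."""
    def __init__(self, obs_dim, msg_dim, n_actions, hidden=64):
        super().__init__()
        self.comm_rnn = nn.GRUCell(obs_dim + msg_dim, hidden)   # communication/coordination level
        self.msg_head = nn.Linear(hidden, msg_dim)              # message to the other agents
        self.act_head = nn.Linear(hidden + obs_dim, n_actions)  # action-selection level

    def communicate(self, obs, incoming_msg, h):
        h = self.comm_rnn(torch.cat([obs, incoming_msg], -1), h)
        return self.msg_head(h), h   # message sent now; h carries it forward

    def act(self, obs, h):
        # actions may be taken many steps after the communication phase:
        # the hidden state h bridges the two temporal scales
        return self.act_head(torch.cat([h, obs], -1))
```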

3.
To speed up the automatic generation of task hierarchies in hierarchical reinforcement learning, a parallel automatic hierarchy-generation method based on multi-agent systems is proposed. Taking the Option framework for hierarchical reinforcement learning proposed by Sutton as its theoretical basis, the method first has multiple agents cooperatively probe the state space in parallel and clusters the pooled observations centrally to produce state subspaces; the agents then learn the internal policies over these subspaces in parallel, finally producing Options. The algorithm is presented, with simulation experiments and analysis, on the task of shortest-path planning between two points in a two-dimensional grid world with obstacles. The results show that the parallel method generates task hierarchies markedly faster than earlier serial automatic hierarchy-generation methods. The approach suits problem domains such as spatial exploration, path planning, and pursuit-evasion.
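A rough sketch of the described pipeline, with scikit-learn's k-means standing in for the paper's centralized clustering step and the internal-policy learning omitted; all names are illustrative:

```python
import random
from sklearn.cluster import KMeans  # stand-in for the paper's clustering step

GRID = 20  # 20x20 grid; obstacles omitted for brevity

def explore(n_steps, start):
    """One agent's random-walk trace through the grid (the paper runs
    several such agents in parallel; here they run sequentially)."""
    x, y = start
    visited = []
    for _ in range(n_steps):
        dx, dy = random.choice([(1, 0), (-1, 0), (0, 1), (0, -1)])
        x = min(max(x + dx, 0), GRID - 1)
        y = min(max(y + dy, 0), GRID - 1)
        visited.append((x, y))
    return visited

# 1) several agents probe the state space cooperatively
traces = [explore(500, (random.randrange(GRID), random.randrange(GRID)))
          for _ in range(4)]
# 2) pool the observations and cluster them centrally into state subspaces
states = [s for t in traces for s in t]
labels = KMeans(n_clusters=4, n_init=10).fit_predict(states)
subspaces = {k: {s for s, l in zip(states, labels) if l == k} for k in range(4)}
# 3) each subspace then gets its own internal policy learned in parallel,
#    yielding one Option per subspace (policy learning omitted here)
```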

4.
Reinforcement learning is one of the more prominent machine-learning technologies due to its unsupervised learning structure and ability to continually learn, even in a dynamic operating environment. Applying this learning to cooperative multi-agent systems not only allows each individual agent to learn from its own experience, but also offers the opportunity for the individual agents to learn from the other agents in the system, so the speed of learning can be accelerated. In the proposed learning algorithm, an agent adapts to comply with its peers by learning carefully when it obtains a positive reinforcement feedback signal, but learns more aggressively if a negative reward follows the action just taken. These two properties are used to develop the proposed cooperative learning method. This research presents a novel use of the Win or Learn Fast (WoLF) policy hill-climbing method with policy-sharing. Results from a multi-agent cooperative domain illustrate that the proposed algorithms perform better than Q-learning alone in a piano-mover environment. It also demonstrates that agents can learn to accomplish a task together efficiently through repetitive trials.
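The abstract does not spell out its WoLF-with-policy-sharing variant, so the following is a generic sketch of WoLF policy hill-climbing, which captures the learn-cautiously-when-winning, learn-aggressively-when-losing property; the policy-sharing step between agents is omitted and all hyperparameters are illustrative:

```python
import random
from collections import defaultdict

class WoLFPHCAgent:
    """Sketch of WoLF policy hill-climbing: learn cautiously when winning
    (delta_win), aggressively when losing (delta_lose > delta_win)."""
    def __init__(self, n_actions, alpha=0.1, gamma=0.9,
                 delta_win=0.01, delta_lose=0.04):
        self.nA, self.alpha, self.gamma = n_actions, alpha, gamma
        self.dw, self.dl = delta_win, delta_lose
        self.Q = defaultdict(lambda: [0.0] * n_actions)
        self.pi = defaultdict(lambda: [1.0 / n_actions] * n_actions)
        self.avg_pi = defaultdict(lambda: [1.0 / n_actions] * n_actions)
        self.counts = defaultdict(int)

    def act(self, s):
        return random.choices(range(self.nA), weights=self.pi[s])[0]

    def update(self, s, a, r, s2):
        # standard Q-learning step
        self.Q[s][a] += self.alpha * (r + self.gamma * max(self.Q[s2]) - self.Q[s][a])
        # maintain the average policy
        self.counts[s] += 1
        for i in range(self.nA):
            self.avg_pi[s][i] += (self.pi[s][i] - self.avg_pi[s][i]) / self.counts[s]
        # "winning" if the current policy beats the average policy
        winning = (sum(p * q for p, q in zip(self.pi[s], self.Q[s]))
                   > sum(p * q for p, q in zip(self.avg_pi[s], self.Q[s])))
        delta = self.dw if winning else self.dl
        best = max(range(self.nA), key=self.Q[s].__getitem__)
        for i in range(self.nA):
            step = delta if i == best else -delta / (self.nA - 1)
            self.pi[s][i] = min(1.0, max(0.0, self.pi[s][i] + step))
        total = sum(self.pi[s])   # renormalize after clipping
        self.pi[s] = [p / total for p in self.pi[s]]
```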

5.
A Multi-Agent-Based Automatic Option Generation Algorithm  (Cited by 2 in total: 0 self-citations, 2 by others)
Automatic task hierarchy generation in hierarchical reinforcement learning has so far relied on serial learning algorithms based on a single agent. To address the slow learning speed of serial algorithms, a multi-agent-based automatic Option generation algorithm is proposed, built on Sutton's Option framework for hierarchical reinforcement learning. In this algorithm, multiple agents cooperatively probe the state space in parallel, and aiNet-based immune clustering is applied centrally to the pooled observations to produce state subspaces; internal policies over the subspaces are then learned in parallel, finally yielding Options. The algorithm is presented, with simulation experiments and analysis, on the task of shortest-path planning between two points in a two-dimensional grid world with obstacles. The results show that the multi-agent Option generation algorithm is markedly faster than single-agent algorithms.
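Since the exploration-and-clustering pipeline mirrors item 3 above, the sketch here shows only the end product: packaging each clustered subspace as an Option in the sense of Sutton's framework. The termination rule and names are assumptions:

```python
from dataclasses import dataclass
from typing import Callable, Dict, Set, Tuple

State = Tuple[int, int]  # a grid cell

@dataclass
class Option:
    """Sutton's option triple: where it can start, how it behaves, when it ends."""
    initiation_set: Set[State]             # the clustered state subspace
    policy: Dict[State, int]               # internal policy learned on the subspace
    termination: Callable[[State], float]  # probability of terminating in a state

def make_option(subspace: Set[State], policy: Dict[State, int]) -> Option:
    # Sketch: terminate with probability 1 as soon as the agent leaves
    # the subspace that the clustering step produced.
    return Option(
        initiation_set=subspace,
        policy=policy,
        termination=lambda s: 0.0 if s in subspace else 1.0,
    )
```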

6.
Cooperative management is concerned with the management of networks and services involving the cooperation of a number of human/organizational entities. One of the prerequisites for efficient management of these complex systems is understanding the roles of humans and the ways they interact with each other. The Cooperative Management Methodology for Enterprise Networks (CoMEN) achieves these objectives by defining an abstract measure of cooperation called the awareness level, based on Computer Supported Cooperative Work (CSCW) concepts and techniques. In view of the abstract nature of the awareness-level definitions, it is not clear how abstract awareness levels can be accurately translated into equivalent cooperative-management design parameters. This paper explores the notion of fuzzy sets, which enables the use of linguistic values for awareness levels. This is aimed at unveiling the deficiencies in existing collaborative support tools with a view to developing more effective cooperative applications. We also model CSCW tools in terms of repositories and communication mechanisms using fuzzy notions, with a view to arriving at a formal design methodology for cooperative management systems. The idea is illustrated with a case study in a large telecom organization.
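The paper's actual membership functions are not given in the abstract; the sketch below only illustrates the general mechanism of mapping a numeric awareness score to linguistic values with triangular fuzzy sets (the scale and breakpoints are invented for illustration):

```python
def triangular(a, b, c):
    """Triangular fuzzy membership function over [a, c], peaking at b."""
    def mu(x):
        if x <= a or x >= c:
            return 0.0
        return (x - a) / (b - a) if x <= b else (c - x) / (c - b)
    return mu

# hypothetical linguistic values for the awareness level, on a 0..10 scale
awareness = {
    "low":    triangular(-1, 0, 5),
    "medium": triangular(2, 5, 8),
    "high":   triangular(5, 10, 11),
}

x = 6.0  # a measured awareness score for some CSCW tool
memberships = {label: mu(x) for label, mu in awareness.items()}
print(memberships)  # e.g. {'low': 0.0, 'medium': 0.67, 'high': 0.2}
```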

7.
An Online Hierarchical Reinforcement Learning Method Based on Path Matching  (Cited by 1 in total: 0 self-citations, 1 by others)
Finding correct subgoals online is a key problem in option-based hierarchical reinforcement learning. By analyzing the learner's actions at subgoals, we find that the effective actions at a subgoal are restricted, which turns the problem of finding subgoals into one of finding the most closely matching action-restricted states along a path. For grid learning environments, we propose a unidirectional-value method to represent this action-restriction property of subgoals, together with an automatic option discovery algorithm based on it. Experiments show that options produced by the unidirectional-value method significantly accelerate Q-learning; we further analyze how the timing and size of option generation affect Q-learning performance.
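The unidirectional-value representation is not detailed in the abstract, so the sketch below illustrates only the underlying observation: subgoal candidates are states whose effective actions are restricted, such as doorway cells in a grid (the encoding and threshold are assumptions):

```python
FREE, WALL = 0, 1
MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]

def effective_actions(grid, x, y):
    """Moves from (x, y) that land on a free, in-bounds cell."""
    acts = []
    for dx, dy in MOVES:
        nx, ny = x + dx, y + dy
        if 0 <= nx < len(grid) and 0 <= ny < len(grid[0]) and grid[nx][ny] == FREE:
            acts.append((dx, dy))
    return acts

def subgoal_candidates(grid, max_actions=2):
    """States whose effective actions are restricted (few legal moves),
    e.g. doorway cells between two rooms."""
    return [(x, y)
            for x in range(len(grid)) for y in range(len(grid[0]))
            if grid[x][y] == FREE
            and len(effective_actions(grid, x, y)) <= max_actions]
```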

8.
A main issue in cooperation in multi-agent systems is how an agent decides in which situations it is better to cooperate with other agents, and with which agents to cooperate. Specifically, in this paper we focus on multi-agent systems composed of learning agents, whose goal is to achieve high accuracy in predicting the correct solution of the problems they encounter. For that purpose, when encountering a new problem, each agent has to decide whether to solve it individually or to ask other agents for collaboration. We will see that learning agents can collaborate by forming committees in order to improve performance. Moreover, we present a proactive learning approach that allows the agents to learn when to convene a committee and which agents to invite to join it. Our experiments show that learning results in smaller committees that maintain (and sometimes improve) problem-solving accuracy compared with committees composed of all agents.
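A minimal sketch of the decision the abstract describes, assuming a confidence threshold for convening a committee and a learned per-peer trust score for choosing whom to invite; all names and thresholds are illustrative, not the paper's method:

```python
import random

class CommitteeAgent:
    """Sketch: convene a committee only when the agent's own confidence is
    low, and invite only peers judged useful so far."""
    def __init__(self, name, peers):
        self.name, self.peers = name, peers
        self.trust = {p: 0.5 for p in peers}  # learned usefulness of each peer

    def solve_alone(self, problem):
        # stand-in for the agent's own predictor: (prediction, confidence)
        return random.choice(["A", "B"]), random.random()

    def solve(self, problem, conf_threshold=0.8, trust_threshold=0.5):
        pred, conf = self.solve_alone(problem)
        if conf >= conf_threshold:
            return pred                        # confident enough: no committee
        invited = [p for p in self.peers if self.trust[p] >= trust_threshold]
        votes = [pred] + [p.solve_alone(problem)[0] for p in invited]
        return max(set(votes), key=votes.count)  # majority vote of the committee
```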

9.
The complexity and dynamism of grid environments urgently require the support of autonomic computing techniques. Building on the autonomic grid architecture presented in our earlier work, and to solve the problem of coordination among multiple agents when resources or services fail during task execution, we propose the concept of a multi-agent dynamic collaboration graph and an algorithm, driven by the task partial order, for constructing it. Each vertex of the graph is an ordered pair consisting of an agent and an autonomic grid service; the construction algorithm maps the task partial order onto the set of services and builds the vertices layer by layer. Through each agent's awareness of its local services and inter-agent communication, the graph achieves autonomic coordination among services during task execution. Simulation results verify the correctness of the algorithm, show that its time complexity is determined mainly by the number of layers of the task Hasse diagram, and show that the agents' awareness time is robust.
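A small sketch of the layer-by-layer construction the abstract describes, assuming tasks given as a partial order and a mapping from tasks to (agent, service) pairs; the data model is invented for illustration:

```python
def hasse_layers(tasks, precedes):
    """Group tasks into layers of the partial order (edges t1 -> t2 mean
    t1 precedes t2); the number of layers drives the time complexity."""
    indeg = {t: 0 for t in tasks}
    for t1, t2 in precedes:
        indeg[t2] += 1
    layers, current = [], [t for t in tasks if indeg[t] == 0]
    while current:
        layers.append(current)
        nxt = []
        for t1, t2 in precedes:
            if t1 in current:
                indeg[t2] -= 1
                if indeg[t2] == 0:
                    nxt.append(t2)
        current = nxt
    return layers

def build_vertices(layers, assign):
    """One graph layer per partial-order layer; each vertex is an
    (agent, service) ordered pair."""
    return [[assign(t) for t in layer] for layer in layers]

tasks = ["t1", "t2", "t3", "t4"]
precedes = [("t1", "t2"), ("t1", "t3"), ("t2", "t4"), ("t3", "t4")]
print(build_vertices(hasse_layers(tasks, precedes),
                     assign=lambda t: (f"agent_{t}", f"service_{t}")))
```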

10.
Multiagent learning involves the acquisition of cooperative behavior among intelligent agents in order to satisfy joint goals. Reinforcement Learning (RL) is a promising unsupervised machine learning technique inspired by earlier studies of animal learning. In this paper, we propose a new RL technique called the Two-Level Reinforcement Learning with Communication (2LRL) method to provide cooperative action selection in a multiagent environment. In 2LRL, learning takes place at two hierarchical levels: in the first level, agents learn to select their target, and in the second level they select the action directed toward that target. The agents communicate their perception to their neighbors and use the communicated information in their decision-making. We applied the 2LRL method in a hunter-prey environment and observed satisfactory cooperative behavior.

Guray Erus received the B.S. degree in computer engineering in 1999, and the M.S. degree in cognitive sciences in 2002, from Middle East Technical University (METU), Ankara, Turkey. He is currently a teaching and research assistant at René Descartes University, Paris, France, where he is preparing a doctoral dissertation on object detection in satellite images as a member of the intelligent perception systems group (SIP-CRIP5). His research interests include multi-agent systems and image understanding.

Faruk Polat is a professor in the Department of Computer Engineering of Middle East Technical University, Ankara, Turkey. He received his B.Sc. in computer engineering from the Middle East Technical University, Ankara, in 1987, and his M.S. and Ph.D. degrees in computer engineering from Bilkent University, Ankara, in 1989 and 1993, respectively. He conducted research as a visiting NATO science scholar at the Computer Science Department of the University of Minnesota, Minneapolis, in 1992-93. His research interests include artificial intelligence, multi-agent systems, and object-oriented data models.
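The abstract gives only the two-level structure, so the following is a generic sketch of that structure: a first-level Q-table selects a target, a second-level one selects the action toward it, with neighbors' communicated perceptions folded into the state. The state encoding, hyperparameters, and reward flow between levels are assumptions, not the paper's exact method:

```python
import random
from collections import defaultdict

class TwoLevelAgent:
    """Sketch of the 2LRL idea: one Q-table learns which target to pursue,
    a second learns which action to take toward the chosen target."""
    def __init__(self, n_targets, n_actions, alpha=0.1, gamma=0.9, eps=0.1):
        self.q_target = defaultdict(lambda: [0.0] * n_targets)  # level 1
        self.q_action = defaultdict(lambda: [0.0] * n_actions)  # level 2
        self.alpha, self.gamma, self.eps = alpha, gamma, eps

    def _greedy(self, qs):
        if random.random() < self.eps:
            return random.randrange(len(qs))
        return max(range(len(qs)), key=qs.__getitem__)

    def decide(self, obs, neighbor_msgs):
        # neighbors' communicated perceptions become part of the state
        state = (obs, tuple(sorted(neighbor_msgs)))
        target = self._greedy(self.q_target[state])           # level 1: pick target
        action = self._greedy(self.q_action[(state, target)])  # level 2: pick action
        return state, target, action

    def learn(self, state, target, action, reward, next_state):
        # level-1 update: value of having chosen this target
        qt = self.q_target[state]
        qt[target] += self.alpha * (
            reward + self.gamma * max(self.q_target[next_state]) - qt[target])
        # level-2 update: value of the action taken toward that target
        qa = self.q_action[(state, target)]
        qa[action] += self.alpha * (
            reward + self.gamma * max(self.q_action[(next_state, target)]) - qa[action])
```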
