Similar literature
 20 similar documents found
1.
Action coordination in multiagent systems is a difficult task especially in dynamic environments. If the environment possesses cooperation, least communication, incompatibility and local information constraints, the task becomes even more difficult. Learning compatible action sequences to achieve a designated goal under these constraints is studied in this work. Two new multiagent learning algorithms called QACE and NoCommQACE are developed. To improve the performance of the QACE and NoCommQACE algorithms four heuristics, state iteration, means-ends analysis, decreasing reward and do-nothing, are developed. The proposed algorithms are tested on the blocks world domain and the performance results are reported.

2.
Ho, F.; Kamel, M. Machine Learning, 1998, 33(2-3): 155-177
A central issue in the design of cooperative multiagent systems is how to coordinate the behavior of the agents to meet the goals of the designer. Traditionally, this had been accomplished by hand-coding the coordination strategies. However, this task is complex due to the interactions that can take place among agents. Recent work in the area has focused on how strategies can be learned. Yet, many of these systems suffer from convergence, complexity and performance problems. This paper presents a new approach for learning multiagent coordination strategies that addresses these issues. The effectiveness of the technique is demonstrated using a synthetic domain and the predator and prey pursuit problem.

3.
Learning Situation-Specific Coordination in Cooperative Multi-agent Systems   (cited by: 1; self-citations: 0; citations by others: 1)
Achieving effective cooperation in a multi-agent system is a difficult problem for a number of reasons such as limited and possibly out-dated views of activities of other agents and uncertainty about the outcomes of interacting non-local tasks. In this paper, we present a learning system called COLLAGE, that endows the agents with the capability to learn how to choose the most appropriate coordination strategy from a set of available coordination strategies. COLLAGE relies on meta-level information about agents' problem solving situations to guide them towards a suitable choice for a coordination strategy. We present empirical results that strongly indicate the effectiveness of the learning algorithm.

4.
Learning to Take Actions   (cited by: 1; self-citations: 0; citations by others: 1)
Khardon, Roni. Machine Learning, 1999, 35(1): 57-90
We formalize a model for supervised learning of action strategies in dynamic stochastic domains and show that PAC-learning results on Occam algorithms hold in this model as well. We then identify a class of rule-based action strategies for which polynomial time learning is possible. The representation of strategies is a generalization of decision lists; strategies include rules with existentially quantified conditions, simple recursive predicates, and small internal state, but are syntactically restricted. We also study the learnability of hierarchically composed strategies where a subroutine already acquired can be used as a basic action in a higher level strategy. We prove some positive results in this setting, but also show that in some cases the hierarchical learning problem is computationally hard.

5.
Coordinating Multiple Agents via Reinforcement Learning   (cited by: 2; self-citations: 0; citations by others: 2)
In this paper, we attempt to use reinforcement learning techniques to solve agent coordination problems in task-oriented environments. The Fuzzy Subjective Task Structure model (FSTS) is presented to model the general agent coordination. We show that an agent coordination problem modeled in FSTS is a Decision-Theoretic Planning (DTP) problem, to which reinforcement learning can be applied. Two learning algorithms, coarse-grained and fine-grained, are proposed to address agents' coordination behavior at two different levels. The coarse-grained algorithm operates at one level and tackles hard system constraints, and the fine-grained algorithm operates at another level and handles soft constraints. We argue that it is important to explicitly model and explore coordination-specific (particularly system constraints) information, which underpins the two algorithms and contributes to the effectiveness of the algorithms. The algorithms are formally proved to converge and experimentally shown to be effective.

6.
7.
Cooperating and sharing resources by creating coalitions of agents are important ways for autonomous agents to execute tasks and to maximize payoff. Such coalitions will form only if each member of a coalition gains more by joining the coalition than it could gain otherwise. There are several ways of creating such coalitions and dividing the joint payoff among the members. In this paper we present algorithms for coalition formation and payoff distribution in nonsuperadditive environments. We focus on a low-complexity kernel-oriented coalition formation algorithm. The properties of this algorithm were examined via simulations. These have shown that the model increases the benefits of the agents within a reasonable time period, and more coalition formations provide more benefits to the agents.

8.
Learning Communication Strategies in Multiagent Systems   (cited by: 3; self-citations: 0; citations by others: 3)
In this paper we describe a dynamic, adaptive communication strategy for multiagent systems. We discuss the behavioral parameters of each agent that need to be computed, and provide a quantitative solution to the problem of controlling these parameters. We also describe the testbed we built and the experiments we performed to evaluate the effectiveness of our methodology. Several experiments using varying populations and varying organizations of agents were performed and are reported. A number of performance measurements were collected as each experiment was performed so the effectiveness of the adaptive communications strategy could be measured quantitatively. The adaptive communications strategy proved effective for fully connected networks of agents. The performance of these experiments improved for larger populations of agents and even approached optimal performance levels. Experiments with non-fully connected networks showed that the adaptive communications strategy is extremely effective, but does not approach optimality. Other experiments investigated the ability of the adaptive communications strategy to compensate for distracting agents, for systems where agents are required to assume the role of information routers, and for systems that must decide between routing paths based on cost information.

9.
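Research on Multi-Agent Cooperation Based on Reinforcement Learning   (cited by: 2; self-citations: 0; citations by others: 2)
Reinforcement learning provides a robust learning method for cooperation among multiple agents. This paper first introduces the principles and components of reinforcement learning, then describes the multiagent Markov decision process (MMDP) and presents a reinforcement learning model for agents. On this basis, it compares the two modes of reinforcement learning that arise in multi-agent cooperation: IL (independent learning) and JAL (joint action learning). Finally, it analyzes several coordination mechanisms commonly used in cooperative multi-agent systems when multiple optimal policies exist.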
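As a rough illustration of the IL versus JAL distinction mentioned in this abstract, the sketch below runs a stateless Q-style update for one agent in a toy two-agent matching game. It is not taken from the paper: the action set, the reward function, the learning rate, and the names `reward` and `train` are all assumptions made for the example. The independent learner indexes its values by its own action only, while the joint-action learner indexes them by the pair of actions.

```python
import random
from collections import defaultdict

ACTIONS = [0, 1]   # hypothetical action set shared by both agents
ALPHA = 0.1        # learning rate (assumed value)


def reward(a1, a2):
    """Toy cooperative payoff: the team is rewarded only when actions match."""
    return 1.0 if a1 == a2 else 0.0


def train(steps=2000, seed=0):
    """Run both learners for agent 1 under uniformly random exploration."""
    rng = random.Random(seed)
    q_il = defaultdict(float)    # IL: keyed by agent 1's own action
    q_jal = defaultdict(float)   # JAL: keyed by the joint action (a1, a2)
    for _ in range(steps):
        a1, a2 = rng.choice(ACTIONS), rng.choice(ACTIONS)
        r = reward(a1, a2)
        # Same update rule, different table index.
        q_il[a1] += ALPHA * (r - q_il[a1])
        q_jal[(a1, a2)] += ALPHA * (r - q_jal[(a1, a2)])
    return q_il, q_jal
```

In this toy setting the IL table has one entry per own action and cannot tell which partner action produced the reward, whereas the JAL table has one entry per joint action and separates matching from mismatching pairs, at the cost of a table that grows with the partner's action set.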

10.
Conjectural Equilibrium in Multiagent Learning   (cited by: 2; self-citations: 0; citations by others: 2)
Wellman, Michael P.; Hu, Junling. Machine Learning, 1998, 33(2-3): 179-200
Learning in a multiagent environment is complicated by the fact that as other agents learn, the environment effectively changes. Moreover, other agents' actions are often not directly observable, and the actions taken by the learning agent can strongly bias which range of behaviors are encountered. We define the concept of a conjectural equilibrium, where all agents' expectations are realized, and each agent responds optimally to its expectations. We present a generic multiagent exchange situation, in which competitive behavior constitutes a conjectural equilibrium. We then introduce an agent that executes a more sophisticated strategic learning strategy, building a model of the response of other agents. We find that the system reliably converges to a conjectural equilibrium, but that the final result achieved is highly sensitive to initial belief. In essence, the strategic learner's actions tend to fulfill its expectations. Depending on the starting point, the agent may be better or worse off than had it not attempted to learn a model of the other agents at all.

11.
Cooperative Multi-Agent Learning: The State of the Art   (cited by: 1; self-citations: 4; citations by others: 1)
Cooperative multi-agent systems (MAS) are ones in which several agents attempt, through their interaction, to jointly solve tasks or to maximize utility. Due to the interactions among the agents, multi-agent problem complexity can rise rapidly with the number of agents or their behavioral sophistication. The challenge this presents to the task of programming solutions to MAS problems has spawned increasing interest in machine learning techniques to automate the search and optimization process. We provide a broad survey of the cooperative multi-agent learning literature. Previous surveys of this area have largely focused on issues common to specific subareas (for example, reinforcement learning (RL) or robotics). In this survey we attempt to draw from multi-agent learning work in a spectrum of areas, including RL, evolutionary computation, game theory, complex systems, agent modeling, and robotics. We find that this broad view leads to a division of the work into two categories, each with its own special issues: applying a single learner to discover joint solutions to multi-agent problems (team learning), or using multiple simultaneous learners, often one per agent (concurrent learning). Additionally, we discuss direct and indirect communication in connection with learning, plus open issues in task decomposition, scalability, and adaptive dynamics. We conclude with a presentation of multi-agent learning problem domains, and a list of multi-agent learning resources.

12.
Dorigo, Marco. Machine Learning, 1995, 19(3): 209-240
In this article we investigate the feasibility of using learning classifier systems as a tool for building adaptive control systems for real robots. Their use on real robots imposes efficiency constraints which are addressed by three main tools: parallelism, distributed architecture, and training. Parallelism is useful to speed up computation and to increase the flexibility of the learning system design. Distributed architecture helps in making it possible to decompose the overall task into a set of simpler learning tasks. Finally, training provides guidance to the system while learning, shortening the number of cycles required to learn. These tools and the issues they raise are first studied in simulation, and then the experience gained with simulations is used to implement the learning system on the real robot. Results have shown that with this approach it is possible to let the AutonoMouse, a small real robot, learn to approach a light source under a number of different noise and lesion conditions. This work was partially written while the author was at International Computer Science Institute, 1947 Center Street, Suite 600, Berkeley, 94704-1198 California, USA.

13.
Organizational models have been recently used in agent theory for modeling coordination in open systems and to ensure social order in multi-agent system applications. In this paper, we propose the employment of Organization Theory for the analysis and design of multiagent systems. Thus, we first discuss the current state of the art of organization-oriented multiagent system methods, placing emphasis on their organizational features. We also review human organizational structures, and we propose several guidelines for implementing agent organizations by means of Organization Theory. Our final aim is to employ well-known human organizational structures to develop multiagent systems.

14.
    
Multiagent learning provides a promising paradigm to study how autonomous agents learn to achieve coordinated behavior in multiagent systems. In multiagent learning, the concurrency of multiple distributed learning processes makes the environment nonstationary for each individual learner. Developing an efficient learning approach to coordinate agents’ behavior in this dynamic environment is a difficult problem especially when agents do not know the domain structure and at the same time have only local observability of the environment. In this paper, a coordinated learning approach is proposed to enable agents to learn where and how to coordinate their behavior in loosely coupled multiagent systems where the sparse interactions of agents constrain coordination to some specific parts of the environment. In the proposed approach, an agent first collects statistical information to detect those states where coordination is most necessary by considering not only the potential contributions from all the domain states but also the direct causes of the miscoordination in a conflicting state. The agent then learns to coordinate its behavior with others through its local observability of the environment according to different scenarios of state transitions. To handle the uncertainties caused by agents’ local observability, an optimistic estimation mechanism is introduced to guide the learning process of the agents. Empirical studies show that the proposed approach can achieve a better performance by improving the average agent reward compared with an uncoordinated learning approach and by reducing the computational complexity significantly compared with a centralized learning approach. Copyright © 2012 John Wiley & Sons, Ltd.

15.
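Neural networks currently run into problems such as excessive processing time and insufficient memory when handling larger-scale data such as bioinformatics databases. Based on an analysis of current distributed learning for neural networks, a new distributed neural network cooperative training algorithm based on agents and the idea of slicing is proposed. By effectively partitioning the training samples and the training process, learning over the whole sample set is assigned to a distributed neural network cluster environment for cooperative training. At the same time, a competitive selection mechanism lets training individuals with better learning performance migrate effectively within the neural network population so that they obtain more resources for learning. Theoretical analysis shows that the method not only raises the success rate with which the neural network converges to the target solution, but also has high parallel computing performance, which speeds up the approach to the target solution. Finally, the method is applied to protein secondary structure prediction, a problem in bioinformatics. The results show that the distributed learning algorithm not only handles learning on large-scale sample sets effectively, but also improves the performance of the trained neural network.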

16.
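The Web-based teaching model and its development trends are described in detail from five aspects: courseware, collaborative learning, teaching communities, visual languages, and distance education. Courseware development technology and distance teaching modes are discussed further.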

17.
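A Coordinated Checkpointing Method Supporting Distributed Cooperative Real-Time Transaction Processing   (cited by: 1; self-citations: 0; citations by others: 1)
During the execution of real-time transactions, transaction failures or data contention can cause transactions to restart. To reduce the amount of work lost to restarts, checkpointing can be used to preserve the timing correctness of transactions. In one class of distributed real-time database applications, transactions on different nodes form cooperative relationships by exchanging messages. To guarantee global consistency among cooperating transactions, when one transaction takes a checkpoint, the related transactions must also take checkpoints. Traditional coordinated checkpointing methods do not consider the timing constraints of the application and cannot support distributed cooperative real-time transaction processing well. This paper proposes a graph-theory-based coordinated checkpointing method. Using a local directed graph maintained on each computing node for each set of cooperating transactions, a graph-theoretic computation identifies the transactions that should take checkpoints. The method has the minimal coordinated checkpointing property and also minimizes the latency of the global checkpoint. Experiments show that the algorithm reduces global checkpoint latency, which helps real-time transactions meet their deadlines.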
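The abstract does not spell out its graph computation, so the following is only a generic sketch of the idea of identifying which cooperating transactions must checkpoint together, not the paper's algorithm. It assumes a hypothetical dependency map `depends_on`, where an edge from `t` to `u` means `t` has received a message from `u` since its last checkpoint, and it collects everything reachable from the initiating transaction with a breadth-first search. The function name `must_checkpoint` and the edge semantics are assumptions.

```python
from collections import deque


def must_checkpoint(depends_on, initiator):
    """Return the set of transactions that checkpoint along with `initiator`.

    depends_on: dict mapping a transaction id to the ids it depends on
                (assumed meaning: it received their messages since its
                last checkpoint).
    """
    seen = {initiator}
    queue = deque([initiator])
    while queue:
        t = queue.popleft()
        for u in depends_on.get(t, ()):
            if u not in seen:
                seen.add(u)
                queue.append(u)
    return seen


# Hypothetical cooperating set: T1 depends on T2, T2 on T3, T4 on T1.
graph = {"T1": ["T2"], "T2": ["T3"], "T4": ["T1"]}
```

With this map, a checkpoint initiated by `T1` pulls in `T2` and `T3` but not `T4`, which is the sense in which only a minimal set of transactions is forced to checkpoint.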

18.
Most of the straight-forward learning approaches in cooperative robotics imply for each learning robot a state space growth exponential in the number of team members. To remedy the exponentially large state space, we propose to investigate a less demanding cooperation mechanism, i.e., various levels of awareness, instead of communication. We define awareness as the perception of other robots' locations and actions. We recognize four different levels (or degrees) of awareness which imply different amounts of additional information and therefore have different impacts on the search space size (Θ(0), Θ(1), Θ(N), o(N), where N is the number of robots in the team). There are trivial arguments in favor of avoiding binding the increase of the search space size to the number of team members. We advocate that, by studying the maximum number of neighbor robots in the application context, it is possible to tune the parameters associated with a Θ(1) increase of the search space size and allow good learning performance. We use the cooperative multi-robot observation of multiple moving targets (CMOMMT) application to illustrate our method. We verify that awareness allows cooperation, that cooperation shows better performance than a purely collective behavior and that learned cooperation shows better results than learned collective behavior.
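To make the growth argument in this abstract concrete, here is a small arithmetic sketch. It is an illustration under assumed numbers, not the paper's model: each robot is taken to have `own_states` local states and to add `obs_per_robot` possible observations for every teammate it perceives, so the per-robot state count is `own_states * obs_per_robot ** observed_robots`. The function names and default values are hypothetical.

```python
def state_space_size(own_states, obs_per_robot, observed_robots):
    """Per-robot state count when the learner perceives `observed_robots` teammates."""
    return own_states * obs_per_robot ** observed_robots


def compare(team_size, max_neighbors, own_states=10, obs_per_robot=4):
    """Return (full-awareness size, neighbor-capped size) for one robot."""
    others = team_size - 1
    full = state_space_size(own_states, obs_per_robot, others)
    capped = state_space_size(own_states, obs_per_robot, min(max_neighbors, others))
    return full, capped
```

Under these assumed numbers, perceiving every teammate in a team of six gives 10 * 4^5 = 10240 states per robot, while capping perception at two neighbors keeps the count at 10 * 4^2 = 160 regardless of team size.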

19.
A Distributed Approach for Coordination of Traffic Signal Agents   (cited by: 1; self-citations: 0; citations by others: 1)
Innovative control strategies are needed to cope with the increasing urban traffic chaos. In most cases, the currently used strategies are based on a central traffic-responsive control system which can be demanding to implement and maintain. Therefore, a functional and spatial decentralization is desired. For this purpose, distributed artificial intelligence and multi-agent systems have come out with a series of techniques which allow coordination and cooperation. However, in many cases these are reached by means of communication and centrally controlled coordination processes, giving little room for decentralized management. Consequently, there is a lack of decision-support tools at managerial level (traffic control centers) capable of dealing with decentralized policies of control and actually profiting from them. In the present work a coordination concept is used, which overcomes some disadvantages of the existing methods. This concept makes use of techniques of evolutionary game theory: intersections in an arterial are modeled as individually-motivated agents or players taking part in a dynamic process in which not only their own local goals but also a global one has to be taken into account. The role of the traffic manager is facilitated since s/he has to deal only with tactical issues, leaving the operational issues to the agents. Thus the system ultimately provides support for the traffic manager to decide on traffic control policies. Some applications in traffic scenarios are discussed in order to evaluate the feasibility of transferring the responsibility of traffic signal coordination to agents. The results show different performances of the decentralized coordination process in different scenarios (e.g. the flow of vehicles is nearly equal in both opposing directions, one direction has a clearly higher flow, etc.). Therefore, the task of the manager is facilitated once s/he recognizes the scenario and acts accordingly. This revised version was published online in August 2005 with a corrected cover date.

20.
One of the applications of service robots with a greater social impact is the assistance to elderly or disabled people. In these applications, assistant robots must robustly navigate in structured indoor environments such as hospitals, nursing homes or houses, heading from room to room to carry out different nursing or service tasks. Among the main requirements of these robotic aids, one that will determine its future commercial feasibility, is the easy installation of the robot in new working domains without long, tedious or complex configuration steps. This paper describes the navigation system of the assistant robot called SIRA, developed in the Electronics Department of the University of Alcalá, focusing on the learning module, specially designed to make the installation of the robot easier and faster in new environments. To cope with robustness and reliability requirements, the navigation system uses probabilistic reasoning (POMDPs) to globally localize the robot and to direct its goal-oriented actions. The proposed learning module quickly learns the Markov model of a new environment by means of an exploration stage that takes advantage of human–robot interfaces (basically speech) and user–robot cooperation to accelerate model acquisition. The proposed learning method, based on a modification of the EM algorithm, is able to robustly explore new environments with a low number of corridor traversals, as shown in some experiments carried out with SIRA.
