Similar articles
Found 20 similar articles (search time: 46 ms)
1.
We exhibit an important property called the asymptotic equipartition property (AEP) on empirical sequences in an ergodic multiagent Markov decision process (MDP). Using the AEP, which facilitates the analysis of multiagent learning, we give a statistical property of multiagent learning, such as reinforcement learning (RL), near the end of the learning process. We examine the effect of the conditions among the agents on the achievement of a cooperative policy in three different cases: blind, visible, and communicable. Also, we derive a bound on the speed with which the empirical sequence converges to the best sequence in probability, so that the multiagent learning yields the best cooperative result.
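
For orientation, the classical AEP from information theory can be stated as below. This is the textbook form for a stationary ergodic source over a finite alphabet with entropy rate H (logarithm base 2); the abstract does not give the exact multiagent MDP formulation, so the symbols X_i, p, and H are assumptions used only for illustration.

```latex
-\frac{1}{n}\log_2 p(X_1, X_2, \dots, X_n) \;\longrightarrow\; H
\quad \text{in probability as } n \to \infty
```

Informally, long empirical sequences concentrate on a "typical set" whose members each have probability close to 2^{-nH}.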

2.
This paper studies the cooperative control problem for a class of multiagent dynamical systems with partially unknown nonlinear system dynamics. In particular, the control objective is to solve the state consensus problem for multiagent systems based on the minimisation of certain cost functions for individual agents. Under the assumption that there exist admissible cooperative controls for such a class of multiagent systems, the formulated problem is solved by finding the optimal cooperative control using the approximate dynamic programming and reinforcement learning approach. With the aid of neural network parameterisation and online adaptive learning, our method renders a practically implementable approximately adaptive neural cooperative control for multiagent systems. Specifically, based on Bellman's principle of optimality, the Hamilton–Jacobi–Bellman (HJB) equation for multiagent systems is first derived. We then propose an approximately adaptive policy iteration algorithm for multiagent cooperative control based on neural network approximation of the value functions. The convergence of the proposed algorithm is rigorously proved using the contraction mapping method. Simulation results are included to validate the effectiveness of the proposed algorithm.

3.
There are numerous applications where a variety of human and software participants interactively pursue a given task (play a game, engage in a simulation, etc.). In this paper, we define a basic architecture for a distributed, interactive system (DIS for short). We then formally define a mathematical construct called a DIS abstraction that provides a theoretical basis for a software platform for building distributed interactive systems. Our framework provides a language for building multiagent applications where each agent has its own behaviors and where the behavior of the multiagent application as a whole is governed by one or more “master” agents. Agents in such a multiagent application may compete for resources, may attempt to take actions based on incorrect beliefs, may attempt to take actions that conflict with actions being concurrently attempted by other agents, and so on. Master agents mediate such conflicts. Our language for building agents (ordinary and master) depends critically on a notion called a “generalized constraint” that we define. All agents attempt to optimize an objective function while satisfying such generalized constraints that the agent is bound to preserve. We develop several algorithms to determine how an agent satisfies its generalized constraints in response to events in the multiagent application. We experimentally evaluate these algorithms in an attempt to understand their advantages and disadvantages. This revised version was published online in June 2006 with corrections to the Cover Date.

4.
In this paper, we propose a novel metric called MetrIntPair (Metric for Pairwise Intelligence Comparison of Agent-Based Systems) for comparing the problem-solving intelligence of two cooperative multiagent systems. MetrIntPair is able to make an accurate comparison by taking into consideration the variability in problem-solving intelligence. The metric can handle outlier intelligence indicators, that is, intelligence measures that are statistically different from the others. To evaluate the proposed metric, we carried out a case study of two cooperative multiagent systems applied to solving a class of NP-hard problems. The results of the case study showed that the small difference in the measured intelligence of the multiagent systems is a consequence of this variability. There is no statistical difference between the intelligence quotients/levels of the multiagent systems. Both multiagent systems should be classified in the same intelligence class.

5.
A study of a multiagent system that generates a new function was carried out. A coordinated solution, such as a multiagent system, is effective for complicated problems. Therefore, it is necessary for agents to behave systematically while building cooperative relations with each other. We pay attention to the entrainment phenomenon seen in living systems, such as cardiac muscle cells or the emission of light by fireflies, as an element that promotes organized behavior between agents. We suggest a new system model unlike conventional systems.

6.
Environment as a first class abstraction in multiagent systems   Total citations: 2 (self-citations: 1, citations by others: 1)
The current practice in multiagent systems typically associates the environment with resources that are external to agents and their communication infrastructure. Advanced uses of the environment include infrastructures for indirect coordination, such as digital pheromones, or support for governed interaction in electronic institutions. Yet, in general, the notion of environment is not well defined. Functionalities of the environment are often dealt with implicitly or in an ad hoc manner. This is not only poor engineering practice, it also hinders engineers from exploiting the full potential of the environment in multiagent systems. In this paper, we put forward the environment as an explicit part of multiagent systems. We give a definition stating that the environment in a multiagent system is a first-class abstraction with dual roles: (1) the environment provides the surrounding conditions for agents to exist, which implies that the environment is an essential part of every multiagent system, and (2) the environment provides an exploitable design abstraction for building multiagent system applications. We discuss the responsibilities of such an environment in multiagent systems and we present a reference model for the environment that can serve as a basis for environment engineering. To illustrate the power of the environment as a design abstraction, we show how the environment is successfully exploited in a real world application. Considering the environment as a first-class abstraction in multiagent systems opens up new horizons for research and development in multiagent systems.

7.
8.
In cooperative multiagent systems, an alternative that maximizes the social welfare (the sum of utilities) can only be selected if each agent reports its full utility function. This may be infeasible in environments where communication is restricted. Employing a voting rule to choose an alternative greatly reduces the communication burden, but leads to a possible gap between the social welfare of the optimal alternative and the social welfare of the one that is ultimately elected. Procaccia and Rosenschein (2006) [13] have introduced the concept of distortion to quantify this gap. In this paper, we present the notion of embeddings into voting rules: functions that receive an agent's utility function and return the agent's vote. We establish that very low distortion can be obtained using randomized embeddings, especially when the number of agents is large compared to the number of alternatives. We investigate our ideas in the context of three prominent voting rules with low communication costs: Plurality, Approval, and Veto. Our results arguably provide a compelling reason for employing voting in cooperative multiagent systems.
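
The welfare gap described in this abstract can be illustrated with a small sketch. It assumes a simple deterministic embedding in which each agent votes for its highest-utility alternative under Plurality, and it measures the gap as the ratio of optimal to elected social welfare for one utility profile. The function names, the tie-breaking rule, and the example utilities are hypothetical; the paper's randomized embeddings and its worst-case definition of distortion are not reproduced here.

```python
def social_welfare(utilities, alternative):
    """Sum of all agents' utilities for one alternative."""
    return sum(agent[alternative] for agent in utilities)


def plurality_winner(utilities):
    """Each agent votes for its top alternative; ties go to the lowest index.

    This deterministic embedding is an assumption made for illustration.
    """
    n_alternatives = len(utilities[0])
    votes = [max(range(n_alternatives), key=lambda a: agent[a])
             for agent in utilities]
    counts = [votes.count(a) for a in range(n_alternatives)]
    return max(range(n_alternatives), key=lambda a: counts[a])


def welfare_ratio(utilities):
    """Optimal welfare divided by the elected alternative's welfare.

    Assumes the elected alternative has strictly positive welfare.
    """
    n_alternatives = len(utilities[0])
    best = max(social_welfare(utilities, a) for a in range(n_alternatives))
    elected = social_welfare(utilities, plurality_winner(utilities))
    return best / elected


# Hypothetical profile: rows are agents, columns are alternatives.
utilities = [
    [6, 4, 0],
    [6, 4, 0],
    [0, 10, 0],
]
print(plurality_winner(utilities))  # alternative 0 wins with two votes
print(welfare_ratio(utilities))     # 18 / 12 = 1.5
```

Here Plurality elects alternative 0 (welfare 12) although alternative 1 has the higher welfare (18), so the elected outcome loses a third of the achievable welfare for this profile.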

9.
Multiagent Q-learning based on role-specific context   Total citations: 1 (self-citations: 0, citations by others: 1)
One of the main problems in cooperative multiagent learning is that the joint action space grows exponentially with the number of agents. In this paper, we investigate a sparse representation of the coordination dependencies between agents that employs roles and context-specific coordination graphs to reduce the joint action space. In our framework, the global joint Q-function is decomposed into a number of local Q-functions. Each local Q-function is shared among a small group of agents and is composed of a set of value rules. We propose a novel multiagent Q-learning algorithm which learns the weights in each value rule automatically. We give empirical evidence to show that our learning algorithm converges to the same optimal policy significantly faster than traditional multiagent learning techniques.
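
The decomposition described in this abstract can be sketched as follows. This is a minimal illustration, assuming each value rule is a pair of (conditions on a few agents' actions, weight) and that the global Q-value of a joint action is the sum of the weights of all rules whose conditions hold. The rule contents, agent names, and the brute-force search are hypothetical; state context, roles, weight learning, and the coordination-graph maximization used in the paper are omitted.

```python
from itertools import product

# Hypothetical value rules: each involves only a small group of agents.
RULES = [
    ({"a1": "push", "a2": "push"}, 5.0),
    ({"a2": "wait", "a3": "wait"}, 2.0),
    ({"a1": "wait"}, 1.0),
]


def global_q(joint_action, rules):
    """Global Q-value: sum of weights of the rules that the joint action satisfies."""
    return sum(
        weight
        for conditions, weight in rules
        if all(joint_action.get(agent) == act for agent, act in conditions.items())
    )


def best_joint_action(agents, actions, rules):
    """Brute-force maximization over all joint actions (illustration only).

    The number of candidates is len(actions) ** len(agents), which is the
    exponential growth that a sparse decomposition is meant to avoid.
    """
    candidates = (
        dict(zip(agents, combo)) for combo in product(actions, repeat=len(agents))
    )
    return max(candidates, key=lambda joint: global_q(joint, rules))


agents = ["a1", "a2", "a3"]
actions = ["push", "wait"]
best = best_joint_action(agents, actions, RULES)
print(best, global_q(best, RULES))
```

Because each rule mentions at most two agents, the table of weights stays small even though the joint action space has 2 ** 3 entries here and grows exponentially with more agents.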

10.

This article describes Soccer Server, a simulator of the game of soccer designed as a benchmark for evaluating multiagent systems and cooperative algorithms. In real life, successful soccer teams require many qualities, such as basic ball control skills, the ability to carry out strategies, and teamwork. We believe that simulating such behaviors is a significant challenge for computer science, artificial intelligence, and robotics technologies. It is to promote the development of such technologies, and to help define a new standard problem for research, that we have developed Soccer Server. We demonstrate the potential of Soccer Server by reporting an experiment that uses the system to compare the performance of a neural network architecture and a decision tree algorithm at learning the selection of soccer play plans. Other researchers using Soccer Server to investigate the nature of cooperative behavior in a multiagent environment will have the chance to assess their progress at RoboCup-97, an international competition of robotic soccer to be held in conjunction with IJCAI-97. Soccer Server has been chosen as the official server for this contest.

11.
We report on a novel approach to modeling a dynamic domain with limited knowledge. A domain may include participating agents where we are uncertain about the motivations and decision-making principles of some of these agents. Our reasoning setting for such domains includes deductive, inductive, and abductive components. The deductive component is based on situation calculus and describes the behavior of agents with complete information. The machine learning-based inductive and abductive components involve the previous experience with the agents whose actions are uncertain to the system. The suggested reasoning machinery is applied to the problem of processing customer complaints in the form of textual messages that contain a multiagent conflict. The task is to predict the future actions of an opponent agent to determine the required course of action to resolve a multiagent conflict. This study demonstrates that the hybrid reasoning approach outperforms both stand-alone deductive and inductive components. The suggested methodology reflects the general situation of reasoning in dynamic domains under conditions of uncertainty, merging analytical (rule-based) and analogy-based reasoning.

12.
In this paper, we study the cooperative robust output regulation problem for linear uncertain multiagent systems with both communication delay and input delay by the distributed internal model approach. The problem includes the leader‐following consensus problem of linear multiagent systems with time delay as a special case. We first generalize the internal model design method to systems with both communication delay and input delay. Then, under a set of standard assumptions, we have obtained the solution to the problem via both the state feedback control law and the output feedback control law. In contrast to the existing results, our results apply to general linear uncertain multiagent systems, accommodate a large class of leader signals, and achieve asymptotic tracking and disturbance rejection at the same time.

13.
To adapt linear discriminant analysis (LDA) to real-world applications, there is a pressing need to equip it with an incremental learning ability to integrate knowledge presented by one-pass data streams, a functionality to join multiple LDA models to make the knowledge sharing between independent learning agents more efficient, and a forgetting functionality to avoid reconstruction of the overall discriminant eigenspace caused by some irregular changes. To this end, we introduce two adaptive LDA learning methods: LDA merging and LDA splitting. These provide the benefits of ability of online learning with one-pass data streams, retained class separability identical to the batch learning method, high efficiency for knowledge sharing due to condensed knowledge representation by the eigenspace model, and more preferable time and storage costs than traditional approaches under common application conditions. These properties are validated by experiments on a benchmark face image data set. By a case study on the application of the proposed method to multiagent cooperative learning and system alternation of a face recognition system, we further clarified the adaptability of the proposed methods to complex dynamic learning tasks.

14.
The notion of environment is receiving increasing attention in the development of multiagent applications. This is witnessed by the emergence of a number of infrastructures providing agent designers with useful means to develop the agent environment, and thus to structure an effective multiagent application. In this paper we analyse the role and features of such infrastructures, and survey some relevant examples. We endorse a general viewpoint where the environment of a multiagent system is seen as a set of basic bricks we call environment abstractions, which (i) provide agents with services useful for achieving individual and social goals, and (ii) are supported by some underlying software infrastructure managing their creation and exploitation. Accordingly, we focus the survey on the opportunities that environment infrastructures provide to system designers when developing multiagent applications.

15.
16.
Temporal-Difference-Fusion Architecture for Learning, Cognition, and Navigation (TD-FALCON) is a generalization of adaptive resonance theory (a class of self-organizing neural networks) that incorporates TD methods for real-time reinforcement learning. In this paper, we investigate how a team of TD-FALCON networks may cooperate to learn and function in a dynamic multiagent environment based on minefield navigation and predator/prey pursuit tasks. Experiments on the navigation task demonstrate that TD-FALCON agent teams are able to adapt and function well in a multiagent environment without an explicit mechanism of collaboration. In comparison, traditional Q-learning agents using gradient-descent-based feedforward neural networks, trained with the standard backpropagation and the resilient-propagation (RPROP) algorithms, produce a significantly poorer level of performance. For the predator/prey pursuit task, we experiment with various cooperative strategies and find that a combination of a high-level compressed state representation and a hybrid reward function produces the best results. Using the same cooperative strategy, the TD-FALCON team also outperforms the RPROP-based reinforcement learners in terms of both task completion rate and learning efficiency.

17.
Organizational models have been recently used in agent theory for modeling coordination in open systems and to ensure social order in multi-agent system applications. In this paper, we propose the employment of Organization Theory for the analysis and design of multiagent systems. Thus, we first discuss the current state of the art of organization-oriented multiagent system methods, placing emphasis on their organizational features. We also review human organizational structures, and we propose several guidelines for implementing agent organizations by means of Organization Theory. Our final aim is to employ well-known human organizational structures to develop multiagent systems.

18.

Visualizing the behavior of systems with distributed data, control, and process is a notoriously difficult task. Each component in the distributed system has only a local view of the whole setup, and the onus is on the user to integrate, into a coherent whole, the large amounts of limited information they provide. In this article, we describe an architecture and an implemented system for visualizing and controlling distributed multiagent applications. The system comprises a suite of tools, with each tool providing a different perspective of the application being visualized. Each tool interrogates the components of the distributed application, collates the returned information, and presents this information to users in an appropriate manner. This, in essence, shifts the burden of inference from the user to the visualizer. Our visualizer has been evaluated on four distributed multiagent systems: a travel management application, a telecommunications network management application, a business process management demonstrator, and an electronic commerce application. Lastly, we briefly show how the suite of tools can be used together for debugging multiagent applications - a process we refer to as debugging via corroboration.

19.
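Homogeneous team learning is one approach to achieving multi-agent cooperation. However, traditional methods modify the target agent only before and after the system runs, so the run itself does not contribute directly to improving the agent. Using a cooperative strategy, this paper proposes a homogeneous team learning model built on the learning classifier system XCS, which remedies this shortcoming of traditional methods. Based on the model, the paper also analyzes experimentally how related factors, such as rule accumulation, communication, and the discovery of new rules, affect the efficiency of multi-agent cooperation.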

20.
Bacteria, bees, and birds often work together in groups to find food. A group of mobile wheeled robots can be designed to coordinate their activities to achieve a goal. Networked cooperative uninhabited air vehicles (UAVs) are being developed for commercial and military applications. In order for such multiagent systems to succeed, it is often critical that they can both maintain cohesive behaviors and appropriately respond to environmental stimuli. In this paper, we characterize cohesiveness of discrete-time multiagent systems as a boundedness or stability property of the agents' position trajectories and use a Lyapunov approach to develop conditions under which local agent actions will lead to cohesive group behaviors even in the presence of i) an interagent "sensing topology" that constrains information flow, where by "information flow," we mean the sensing of positions and velocities of agents, ii) a random but bounded delay and "noise" in sensing other agents' positions and velocities, and iii) noise in sensing a resource profile that represents an environmental stimulus and quantifies the goal of the multiagent system. Simulations are used to illustrate the ideas for multivehicle systems and to make connections to synchronization of coupled oscillators.
