Similar literature
20 similar documents found (search time: 31 ms)
1.
Multiagent learning provides a promising paradigm to study how autonomous agents learn to achieve coordinated behavior in multiagent systems. In multiagent learning, the concurrency of multiple distributed learning processes makes the environment nonstationary for each individual learner. Developing an efficient learning approach to coordinate agents’ behavior in this dynamic environment is a difficult problem, especially when agents do not know the domain structure and at the same time have only local observability of the environment. In this paper, a coordinated learning approach is proposed to enable agents to learn where and how to coordinate their behavior in loosely coupled multiagent systems where the sparse interactions of agents constrain coordination to some specific parts of the environment. In the proposed approach, an agent first collects statistical information to detect those states where coordination is most necessary by considering not only the potential contributions from all the domain states but also the direct causes of the miscoordination in a conflicting state. The agent then learns to coordinate its behavior with others through its local observability of the environment according to different scenarios of state transitions. To handle the uncertainties caused by agents’ local observability, an optimistic estimation mechanism is introduced to guide the learning process of the agents. Empirical studies show that the proposed approach can achieve a better performance by improving the average agent reward compared with an uncoordinated learning approach and by reducing the computational complexity significantly compared with a centralized learning approach. Copyright © 2012 John Wiley & Sons, Ltd.

2.
In this paper, we study the resource allocation problem of second‐order multiagent systems with exogenous disturbances, and the communication networks are weight‐balanced digraphs. Different from the well‐studied resource allocation problems, our problem involves the disturbed second‐order dynamics of agents. In order to achieve the optimal allocation, we propose a distributed algorithm based on gradient descent and internal model approach. Furthermore, we analyze the convergence of the algorithm by constructing a suitable Lyapunov function. Moreover, we prove that the agents in the network can achieve the exact optimal allocation even in the presence of external disturbances. Finally, we provide two examples to illustrate our result.
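As a rough illustration of the gradient-based idea only, the sketch below runs a Laplacian-weighted gradient flow for a much simpler setting than the abstract's: first-order agents, no disturbances, no internal model, and quadratic local costs. All names (`allocate`, `a`, `b`, `ring`) and the cost model are assumptions made for this example, not the paper's algorithm. Because the digraph is weight-balanced (column sums of the Laplacian are zero), the total allocation stays constant while the marginal costs equalize.

```python
def allocate(a, b, x0, laplacian, step=0.05, iters=500):
    """Euler simulation of x' = -L * grad f(x).

    Assumed local costs: f_i(x_i) = a[i] / 2 * (x_i - b[i]) ** 2.
    laplacian must have zero row sums and zero column sums
    (a weight-balanced digraph), so sum(x) never changes.
    """
    n = len(x0)
    x = list(x0)
    for _ in range(iters):
        # Marginal cost of each agent at its current allocation.
        grad = [a[i] * (x[i] - b[i]) for i in range(n)]
        # Each agent only combines gradients of its in-neighbours.
        dx = [-sum(laplacian[i][j] * grad[j] for j in range(n)) for i in range(n)]
        x = [x[i] + step * dx[i] for i in range(n)]
    return x


# Directed 3-cycle: weight-balanced but not symmetric.
ring = [[1, -1, 0],
        [0, 1, -1],
        [-1, 0, 1]]
a = [1.0, 2.0, 4.0]
b = [0.0, 0.0, 0.0]
x_final = allocate(a, b, x0=[7.0, 0.0, 0.0], laplacian=ring)
```

With a total resource of 7, the optimum equalizes the marginal costs `a[i] * x[i]`, which gives the allocation (4, 2, 1); the simulated trajectory approaches that point from the feasible start (7, 0, 0).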

3.
Discovering unknown adverse drug reactions (ADRs) in postmarketing surveillance as early as possible is highly desirable. Nevertheless, current postmarketing surveillance methods largely rely on spontaneous reports that suffer from serious underreporting, latency, and inconsistent reporting. Thus these methods are not ideal for rapidly identifying rare ADRs. The multiagent systems paradigm is an emerging and effective approach to tackling distributed problems, especially when data sources and knowledge are geographically located in different places and coordination and collaboration are necessary for decision making. In this article, we propose an active, multiagent framework for early detection of ADRs by utilizing electronic patient data distributed across many different sources and locations. In this framework, intelligent agents assist a team of experts based on the well‐known human decision‐making model called Recognition‐Primed Decision (RPD). We generalize the RPD model to a fuzzy RPD model and utilize fuzzy logic technology to not only represent, interpret, and compute imprecise and subjective cues that are commonly encountered in the ADR problem but also to retrieve prior experiences by evaluating the extent of matching between the current situation and a past experience. We describe our preliminary multiagent system design and illustrate its potential benefits for assisting expert teams in early detection of previously unknown ADRs. © 2007 Wiley Periodicals, Inc. Int J Int Syst 22: 827–845, 2007.

4.
5.
This paper focuses on the extension of the transferable belief model (TBM) to a multiagent-distributed context where no central aggregation unit is available and the information can be exchanged only locally among agents. In this framework, agents are assumed to be independent reliable sources which collect data and collaborate to reach a common knowledge about an event of interest. Two different scenarios are considered: In the first one, agents are supposed to provide observations which do not change over time (static scenario), while in the second one agents are assumed to dynamically gather data over time (dynamic scenario). A protocol for distributed data aggregation, which is proved to converge to the basic belief assignment given by an equivalent centralized aggregation schema based on the TBM, is provided. Since multiagent systems represent an ideal abstraction of actual networks of mobile robots or sensor nodes, which are envisioned to perform a wide variety of tasks, we believe that the proposed protocol paves the way to the application of the TBM in important engineering fields such as multirobot systems or sensor networks, where the distributed collaboration among players is a critical aspect.
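For readers unfamiliar with the TBM, the sketch below shows the unnormalized conjunctive rule commonly used in that model to combine two basic belief assignments (BBAs) from independent reliable sources. It is an assumption of this example that the centralized aggregation mentioned in the abstract uses this rule; the function name and the toy data are hypothetical, and the paper's distributed protocol itself is not reproduced here.

```python
def conjunctive_combine(m1, m2):
    """Unnormalized conjunctive rule: m(A) = sum of m1(B) * m2(C) over B & C == A.

    Each BBA is a dict mapping frozenset focal elements to masses.
    Mass assigned to the empty set (conflict) is kept, not renormalized.
    """
    combined = {}
    for b, mass_b in m1.items():
        for c, mass_c in m2.items():
            a = b & c  # set intersection of the two focal elements
            combined[a] = combined.get(a, 0.0) + mass_b * mass_c
    return combined


# Toy frame of discernment {a, b} with two agents' observations.
A, B = frozenset({"a"}), frozenset({"b"})
AB = frozenset({"a", "b"})
agent1 = {A: 0.6, AB: 0.4}
agent2 = {B: 0.5, AB: 0.5}
fused = conjunctive_combine(agent1, agent2)
```

The rule is commutative and associative, so the order in which agents' BBAs are combined does not affect the result; this is the property that makes it meaningful to compare a distributed aggregation with a centralized one in the static scenario.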

6.
This work presents a multi‐agent system for knowledge‐based high‐level event composition, which interprets activities, behaviour and situations semantically in a scenario with multi‐sensory monitoring. A perception agent (plurisensory agent and visual agent)‐based structure is presented. The agents process the sensor information and identify (agent decision system) significant changes in the monitored signals, which they send as simple events to the composition agent that searches for and identifies pre‐defined patterns as higher‐level semantic composed events. The structure has a methodology and a set of tools that facilitate its development and application to different fields without having to start from scratch. This creates an environment to develop knowledge‐based systems generally for event composition. The application task of our work is surveillance, and event composition/inference examples are shown which characterize an alarming situation in the scene and resolve identification and tracking problems of people in the scenario being monitored.

7.
Autonomous agents and multiagent systems have been successfully applied to a number of problems and have been widely used in different application fields. In particular, in this paper we are interested in information retrieval. In this field multiagent solutions are very useful and effective since they decompose the problem into a network of software agents that interact to solve problems that are beyond their individual capabilities or knowledge. In so doing, multiagent systems make it possible to overcome typical problems of single-agent and centralized approaches. To discuss the lessons learnt in using multiagent technology in the field of information retrieval, in this paper, we present our experience in using X.MAS, a generic multiagent architecture aimed at retrieving, filtering and reorganizing information according to user interests. To this end, after presenting X.MAS, we illustrate six applications built upon it. Our conclusion is that multiagent technology is quite effective for designing and realizing concrete information retrieval applications.

8.
In this article we discuss the problem of inferring threats in an urban environment, where the knowledge of the environment involves multiple types of intelligence and infrastructure data, and is by nature uncertain or approximate. We use a collection of situation-aware agents to infer potential threats in such environments, where agents are responsible for event correlation and situation assessment. We review the weaknesses of a current approach to threat assessment in Homeland Security and then describe our agent-based approach. The key innovations of our agent-based approach are: an ontological commitment to events and situations, fuzzy event correlation, fuzzy situation assessment, adaptability and learning during threat assessment operations, and an enhancement of traditional belief-desire-intention (BDI) agents with situation awareness. We describe the properties of situation-aware BDI agents and discuss the implementation of them on a variety of BDI agent platforms. Lastly, we discuss the interoperability of these platforms and address the issue of scalability through coupling to large-scale peer-to-peer overlays.

9.
In this paper, we explore whether the discovery of services can be facilitated by utilizing service location information that is opportunistically disseminated primarily by the service consumers themselves. We apply our study to the real-world case of parking service in busy city areas. As the vehicles drive around the area, they opportunistically collect and share with each other information on the location and status of each parking spot they encounter. This opportunistically assisted scenario is compared against one that implements a “blind” non-assisted search and a centralized approach, where the allocation of parking spots is managed by a central server with global knowledge about the parking space availability. Results obtained for both uniformly distributed travel destinations and a single hotspot destination reveal that the relative performance of the three solutions can vary significantly and is not always in line with intuition. Under the hotspot scenario, the opportunistic system is consistently outperformed by the centralized system, which yields the minimum times and distances at the expense of more distant parking spot assignments; whereas, for uniformly distributed destinations, the relative performance of all three schemes changes with the vehicle volume, with the centralized approach gradually becoming the worst solution and the opportunistic one emerging as the best scheme. We discuss how each approach modulates the information dissemination process in space and time and resolves the competition for the parking resources. We also outline models providing analytical insights into the behaviour of the centralized approach.

10.
In this paper we focus on collaborative multi-agent systems, where agents are distributed over a region of interest and collaborate to achieve a common estimation goal. In particular, we introduce two consensus-based distributed linear estimators. The first one is designed for a Bayesian scenario, where an unknown common finite-dimensional parameter vector has to be reconstructed, while the second one regards the nonparametric reconstruction of an unknown function sampled at different locations by the sensors. Both of the algorithms are characterized in terms of the trade-off between estimation performance, communication, computation and memory complexity. In the finite-dimensional setting, we derive mild sufficient conditions which ensure that a distributed estimator performs better than the local optimal ones in terms of estimation error variance. In the nonparametric setting, we introduce an on-line algorithm that allows the agents to simultaneously compute the function estimate with small computational, communication and data storage efforts, as well as to quantify its distance from the centralized estimate given by a Regularization Network, one of the most powerful regularized kernel methods. These results are obtained by deriving bounds on the estimation error that provide insights on how the uncertainty inherent in a sensor network, such as imperfect knowledge on the number of agents and the measurement models used by the sensors, can degrade the performance of the estimation process. Numerical experiments are included to support the theoretical findings.

11.
Planning for ad hoc teamwork is challenging because it involves agents collaborating without any prior coordination or communication. The focus is on principled methods for a single agent to cooperate with others. This motivates investigating the ad hoc teamwork problem in the context of self-interested decision-making frameworks. Agents engaged in individual decision making in multiagent settings face the task of having to reason about other agents’ actions, which may in turn involve reasoning about others. An established approximation that operationalizes this approach is to bound the infinite nesting from below by introducing level 0 models. For the purposes of this study, individual, self-interested decision making in multiagent settings is modeled using interactive dynamic influence diagrams (I-DID). These are graphical models with the benefit that they naturally offer a factored representation of the problem, allowing agents to ascribe dynamic models to others and reason about them. We demonstrate that an implication of bounded, finitely-nested reasoning by a self-interested agent is that we may not obtain optimal team solutions in cooperative settings, if it is part of a team. We address this limitation by including models at level 0 whose solutions involve reinforcement learning. We show how the learning is integrated into planning in the context of I-DIDs. This facilitates optimal teammate behavior, and we demonstrate its applicability to ad hoc teamwork on several problem domains and configurations.

12.
This paper analyzes the emergent behaviors of pedestrian groups that learn through the multiagent reinforcement learning model developed in our group. Five scenarios studied in the pedestrian model literature, with different levels of complexity, were simulated in order to analyze the robustness and the scalability of the model. Firstly, a reduced group of agents must learn by interaction with the environment in each scenario. In this phase, each agent learns its own kinematic controller, which drives it at simulation time. Secondly, the number of simulated agents is increased, in each scenario where agents have previously learnt, to test the appearance of emergent macroscopic behaviors without additional learning. This strategy allows us to evaluate the robustness, consistency and quality of the learned behaviors. For this purpose several tools from pedestrian dynamics, such as fundamental diagrams and density maps, are used. The results reveal that the developed model is capable of simulating human-like micro and macro pedestrian behaviors for the simulation scenarios studied, including those where the number of pedestrians has been scaled by one order of magnitude with respect to the situation learned.

13.
This paper studies regulated state synchronization for continuous‐time homogeneous multiagent systems with weakly unstable agents where the reference trajectory is given by a so‐called exosystem. The agents share part of their state over a communication network. We assume that the communication topology is completely unknown and directed. An algebraic Riccati equation–based low‐gain adaptive nonlinear dynamic protocol design is presented to achieve regulated state synchronization. Utilizing adaptive control, our nonlinear dynamic protocol is universal and does not depend on any information about the communication topology or the number of agents.

14.
This paper considers a consensus problem for hybrid multiagent systems, which comprise two groups of agents: a group of continuous‐time dynamic agents and a group of discrete‐time dynamic agents. Firstly, a game‐theoretic approach is adopted to model the interactions between the two groups of agents. To achieve consensus for the considered hybrid multiagent systems, the cost functions are designed. Moreover, it is shown that the designed game admits a unique Nash equilibrium. Secondly, sufficient/necessary conditions for solving the consensus problem are established. Thirdly, we find that the convergence speed of the system depends on the game. Through the mechanism design of the game, the convergence speed is increased. Finally, simulation examples are given to validate the effectiveness of the theoretical results.

15.
We discuss the solution of complex multistage decision problems using methods that are based on the idea of policy iteration (PI), i.e., start from some base policy and generate an improved policy. Rollout is the simplest method of this type, where just one improved policy is generated. We can view PI as repeated application of rollout, where the rollout policy at each iteration serves as the base policy for the next iteration. In contrast with PI, rollout has a robustness property: it can be applied on-line and is suitable for on-line replanning. Moreover, rollout can use as base policy one of the policies produced by PI, thereby improving on that policy. This is the type of scheme underlying the prominently successful AlphaZero chess program. In this paper we focus on rollout and PI-like methods for problems where the control consists of multiple components each selected (conceptually) by a separate agent. This is the class of multiagent problems where the agents have a shared objective function, and shared and perfect state information. Based on a problem reformulation that trades off control space complexity with state space complexity, we develop an approach, whereby at every stage, the agents sequentially (one-at-a-time) execute a local rollout algorithm that uses a base policy, together with some coordinating information from the other agents. The amount of total computation required at every stage grows linearly with the number of agents. By contrast, in the standard rollout algorithm, the amount of total computation grows exponentially with the number of agents. Despite the dramatic reduction in required computation, we show that our multiagent rollout algorithm has the fundamental cost improvement property of standard rollout: it guarantees an improved performance relative to the base policy. We also discuss autonomous multiagent rollout schemes that allow the agents to make decisions autonomously through the use of precomputed signaling information, which is sufficient to maintain the cost improvement property, without any on-line coordination of control selection between the agents. For discounted and other infinite horizon problems, we also consider exact and approximate PI algorithms involving a new type of one-agent-at-a-time policy improvement operation. For one of our PI algorithms, we prove convergence to an agent-by-agent optimal policy, thus establishing a connection with the theory of teams. For another PI algorithm, which is executed over a more complex state space, we prove convergence to an optimal policy. Approximate forms of these algorithms are also given, based on the use of policy and value neural networks. These PI algorithms, in both their exact and their approximate form, are strictly off-line methods, but they can be used to provide a base policy for use in an on-line multiagent rollout scheme.
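The one-agent-at-a-time idea can be sketched on a toy, single-stage problem. This is a minimal illustration under stated assumptions, not the paper's algorithm: `cost` stands in for the Q-factor of the base policy (stage cost plus the base policy's cost-to-go), and the names `multiagent_rollout`, `base` and `actions` are hypothetical. Each agent in turn minimizes over its own control component, with earlier agents fixed at their improved choices and later agents at their base-policy choices.

```python
def multiagent_rollout(cost, base, actions):
    """Improve a joint control one component at a time.

    cost    -- assumed function from a tuple of controls to a number
    base    -- joint control selected by the base policy
    actions -- actions[i] is the list of controls available to agent i
    Returns (improved joint control, number of cost evaluations).
    """
    u = list(base)
    evaluations = 0
    for i, choices in enumerate(actions):
        # Start from the current joint control, so cost never increases.
        best, best_cost = u[i], cost(tuple(u))
        evaluations += 1
        for a in choices:
            trial = u[:i] + [a] + u[i + 1:]
            c = cost(tuple(trial))
            evaluations += 1
            if c < best_cost:
                best, best_cost = a, c
        u[i] = best
    return tuple(u), evaluations


def cost(u):
    # Toy shared objective: the controls should add up to 5.
    return (sum(u) - 5) ** 2


base = (0, 0, 0, 0)
actions = [list(range(4)) for _ in base]
improved, evaluations = multiagent_rollout(cost, base, actions)
joint_search_size = 4 ** len(base)  # candidates in an all-agents-at-once search
```

In this toy run the sequential scheme uses 20 cost evaluations (linear in the number of agents), against 256 candidates for a joint minimization over all four components, and the improved control costs no more than the base control by construction.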

16.
In this paper, a bipartite consensus problem is considered for a high‐order multiagent system with cooperative‐competitive interactions and unknown time‐varying disturbances. A signed graph is used to describe the interaction network associated with the multiagent system. The unknown disturbances are expressed by linearly parameterized models, and distributed adaptive laws are designed to estimate the unknown parameters in the models. For the case that there is no exogenous reference system, a fully distributed adaptive control law is proposed to ensure that all the agents reach a bipartite consensus. For the other case that there exists an exogenous reference system, another fully distributed adaptive control law is also developed to ensure that all the agents achieve bipartite consensus on the state of the exogenous system. The stability of the closed‐loop multiagent systems with the two proposed adaptive control laws is analyzed under an assumption that the interaction network is structurally balanced. Moreover, the convergence of the parameter estimation errors is guaranteed with a persistent excitation condition. Finally, simulation examples are provided to demonstrate the effectiveness of the proposed adaptive bipartite consensus control laws for the concerned multiagent system.
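To make the notion of bipartite consensus on a structurally balanced signed graph concrete, the sketch below simulates a deliberately simplified case: first-order agents, no disturbances and no adaptive laws, using the standard signed-Laplacian dynamics. This is an assumed textbook-style model, not the high-order adaptive control law of the abstract, and all names and numbers are illustrative.

```python
def simulate_bipartite_consensus(weights, x0, step=0.05, iters=2000):
    """Euler simulation of x_i' = sum_j |w_ij| * (sign(w_ij) * x_j - x_i).

    weights -- symmetric signed adjacency matrix; positive entries are
               cooperative links, negative entries are competitive links
    """
    n = len(x0)
    x = list(x0)
    for _ in range(iters):
        dx = []
        for i in range(n):
            total = 0.0
            for j in range(n):
                w = weights[i][j]
                if w != 0:
                    sign = 1 if w > 0 else -1
                    total += abs(w) * (sign * x[j] - x[i])
            dx.append(total)
        x = [xi + step * di for xi, di in zip(x, dx)]
    return x


# Two camps {0, 1} and {2, 3}: cooperation inside a camp, competition across.
W = [[0, 1, 0, -1],
     [1, 0, -1, 0],
     [0, -1, 0, 1],
     [-1, 0, 1, 0]]
x_end = simulate_bipartite_consensus(W, x0=[1.0, 2.0, 3.0, 4.0])
```

Because the graph is structurally balanced, the two camps settle on values of equal magnitude and opposite sign. For this example the limit is -1 for agents 0 and 1 and +1 for agents 2 and 3, the signed average (1 + 2 - 3 - 4) / 4 seen from each camp.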

17.
In this paper, we study the cooperative robust output regulation problem for linear uncertain multiagent systems with both communication delay and input delay by the distributed internal model approach. The problem includes the leader‐following consensus problem of linear multiagent systems with time delay as a special case. We first generalize the internal model design method to systems with both communication delay and input delay. Then, under a set of standard assumptions, we obtain the solution to the problem via both the state feedback control law and the output feedback control law. In contrast to the existing results, our results apply to general linear uncertain multiagent systems, accommodate a large class of leader signals, and achieve asymptotic tracking and disturbance rejection at the same time.

18.
When we negotiate, the arguments uttered to persuade the opponent are not the result of an isolated analysis, but of an integral view of the problem that we want to agree about. Before the negotiation starts, we have in mind what arguments we can utter, what opponent we can persuade, which negotiation can finish successfully and which cannot. Thus, we plan the negotiation, and in particular, the argumentation. This fact allows us to take decisions in advance and to start the negotiation more confidently. With this in mind, we claim that this planning can be exploited by an autonomous agent. Agents plan the actions that they should execute to achieve their goals. In these plans, some actions are under the agent's control, while some others are not. The latter must be negotiated with other agents. Negotiation is usually carried out during the plan execution. In our opinion, however, negotiation can be considered during the planning stage, as in real life. In this paper, we present a novel approach to integrate argumentation-based negotiation planning into the general planning process of an autonomous agent. This integration allows the agent to take key decisions in advance. We evaluated this proposal in a multiagent scenario by comparing the performance of agents that plan the argumentation and agents that do not. These evaluations demonstrated that performance improves when the argumentation is planned, especially when the negotiation alternatives increase.

19.
Statistical relational learning of trust   Total citations: 1 (self-citations: 0; citations by others: 1)
The learning of trust and distrust is a crucial aspect of social interaction among autonomous, mentally-opaque agents. In this work, we address the learning of trust based on past observations and context information. We argue that from the truster’s point of view trust is best expressed as one of several relations that exist between the agent to be trusted (trustee) and the state of the environment. Besides attributes expressing trustworthiness, additional relations might describe commitments made by the trustee with regard to the current situation, like: a seller offers a certain price for a specific product. We show how to implement and learn context-sensitive trust using statistical relational learning in the form of a Dirichlet process mixture model called Infinite Hidden Relational Trust Model (IHRTM). The practicability and effectiveness of our approach are evaluated empirically on user-ratings gathered from eBay. Our results suggest that (i) the inherent clustering achieved in the algorithm allows the truster to characterize the structure of a trust-situation and provides meaningful trust assessments; (ii) utilizing the collaborative filtering effect associated with relational data does improve trust assessment performance; (iii) by learning faster and transferring knowledge more effectively we improve cold start performance and can cope better with dynamic behavior in open multiagent systems. The latter is demonstrated with interactions recorded from a strategic two-player negotiation scenario.

20.
In this paper, we present a framework for interacting with users that is sensitive to the cost of bother and then focus on its application to decision making in hospital emergency room scenarios. We begin with a model designed for reasoning about interaction in a single-agent single-user setting and then expand to the environment of multiagent systems. In this setting, agents consider both whether to ask other agents to perform decision making and at the same time whether to ask questions of these agents. With this fundamental research as a backdrop, we project the framework into the application of reasoning about which medical experts to interact with, sensitive to possible bother, during hospital decision scenarios, in order to deliver the best care for the patients that arrive. Due to the real-time nature of the application and the knowledge-intensive nature of the decisions, we propose new parameters to include in the reasoning about interaction and sketch their usefulness through a series of examples. We then include a set of experimental results confirming the value of our proposed approach for reasoning about interaction in hospital settings, through simulations of patient care in those environments. We conclude by pointing to future research to continue to extend the model for reasoning about interaction in multiagent environments for the setting of time-critical care in hospital settings.
