Similar Articles
20 similar articles retrieved
1.
We present a temporal reasoning mechanism for an individual agent situated in a dynamic environment such as the web, collaborating with other agents while interleaving planning and acting. Building a collaborative agent that can flexibly achieve its goals in changing environments requires a blend of real-time computing and AI technologies. Our mechanism therefore consists of an Artificial Intelligence (AI) planning subsystem and a Real-Time (RT) scheduling subsystem. The AI planning subsystem is based on a model for collaborative planning and generates a partial-order plan dynamically; during planning it sends the RT scheduling subsystem basic actions and time constraints. The RT scheduling subsystem receives this dynamic set of basic actions with their associated temporal constraints and inserts the actions into the agent's schedule of activities in such a way that the resulting schedule is feasible and satisfies the temporal constraints. Our mechanism allows the agent to construct its individual schedule independently, and it handles the various types of temporal constraints arising from the agent's own activities and from those of its collaborators. In contrast to other work on scheduling in planning systems, which is either inappropriate for uncertain, dynamic environments or cannot be extended to multi-agent systems, our mechanism enables the individual agent to determine the timing of its activities in uncertain situations and to integrate its activities easily with those of other agents. We have proved that, under certain conditions, the temporal reasoning mechanism of the AI planning subsystem is sound and complete. We report the results of several experiments on the system, which demonstrate that interleaving planning and acting is crucial in our environment.
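As a rough illustration of the RT scheduling subsystem's job, the following sketch (hypothetical names; it assumes simple release-time/deadline windows rather than the paper's full set of constraint types) tries to insert a basic action received from the planner into the agent's existing schedule without breaking feasibility.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class Action:
    name: str
    duration: float
    earliest_start: float  # release time imposed by the planner
    deadline: float        # latest allowed finish time

def try_insert(schedule: List[Tuple[float, float, Action]],
               action: Action) -> Optional[List[Tuple[float, float, Action]]]:
    """Place `action` into a schedule of (start, end, Action) triples so the
    result stays non-overlapping and within the action's time window.
    Returns the new schedule, or None if no feasible slot exists."""
    busy = sorted((s, e) for s, e, _ in schedule)
    # Candidate starts: the release time, and the end of each busy interval.
    for start in sorted([action.earliest_start] + [e for _, e in busy]):
        if start < action.earliest_start:
            continue
        end = start + action.duration
        if end > action.deadline:
            return None  # every later candidate finishes later still
        if all(end <= s or start >= e for s, e in busy):
            return sorted(schedule + [(start, end, action)],
                          key=lambda t: t[0])
    return None
```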

2.
When we negotiate, the arguments uttered to persuade the opponent are not the result of an isolated analysis but of an integral view of the problem we want to reach agreement on. Before the negotiation starts, we have in mind what arguments we can utter, which opponents we can persuade, and which negotiations can finish successfully and which cannot. That is, we plan the negotiation and, in particular, the argumentation. This allows us to make decisions in advance and to start the negotiation more confidently. With this in mind, we claim that such planning can be exploited by an autonomous agent. Agents plan the actions they should execute to achieve their goals. In these plans, some actions are under the agent's control while others are not; the latter must be negotiated with other agents. Negotiation is usually carried out during plan execution. In our opinion, however, negotiation can be considered during the planning stage, as in real life. In this paper, we present a novel approach to integrating argumentation-based negotiation planning into the general planning process of an autonomous agent. This integration allows the agent to make key decisions in advance. We evaluated the proposal in a multiagent scenario by comparing the performance of agents that plan their argumentation with that of agents that do not. The evaluation shows that performance improves when the argumentation is planned, especially as the number of negotiation alternatives increases.

3.
To date, many researchers have proposed methods to improve learning in multiagent systems. Most of these studies, however, do not scale to more complex multiagent learning problems, because the state space of each learning agent grows exponentially with the number of partners in the environment, and modeling the other learning agents as part of the state of the environment is not a realistic approach. In this paper, we combine the advantages of the modular approach, fuzzy logic, and internal models in a single novel multiagent system architecture. The architecture is based on a fuzzy modular approach whose rule base is partitioned into several modules. Each module deals with a particular agent in the environment and maps input fuzzy sets to action Q-values; these represent the state space of each learning module and the action space, respectively. Each module also uses an internal model table to estimate the actions of the other agents. Finally, we investigate integrating a parallel update method with the proposed architecture. Experimental results obtained in two different environments of a well-known pursuit domain show the effectiveness and robustness of the proposed multiagent architecture and learning approach.
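A minimal sketch of one such module (assumed triangular fuzzification over a normalized state feature in [0, 1]; the paper's internal model table and parallel update are omitted):

```python
import numpy as np

class FuzzyQModule:
    """One learning module per partner agent, mapping input fuzzy sets to
    action Q-values, in the spirit of the fuzzy modular architecture above."""
    def __init__(self, n_sets: int, n_actions: int, lr=0.1, gamma=0.9):
        assert n_sets >= 2
        self.q = np.zeros((n_sets, n_actions))        # one Q-row per fuzzy rule
        self.centers = np.linspace(0.0, 1.0, n_sets)  # fuzzy set centers
        self.lr, self.gamma = lr, gamma

    def memberships(self, x: float) -> np.ndarray:
        x = float(np.clip(x, 0.0, 1.0))
        width = self.centers[1] - self.centers[0]
        mu = np.maximum(0.0, 1.0 - np.abs(x - self.centers) / width)
        return mu / mu.sum()                          # normalized firing strengths

    def action_values(self, x: float) -> np.ndarray:
        # Fuzzy inference: blend each rule's Q-row by its firing strength.
        return self.memberships(x) @ self.q

    def update(self, x, action, reward, x_next):
        mu = self.memberships(x)
        target = reward + self.gamma * self.action_values(x_next).max()
        td_error = target - self.action_values(x)[action]
        self.q[:, action] += self.lr * td_error * mu  # credit rules by strength
```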

4.
Autonomous agents that learn about their environment can be divided into two broad classes. One class, reinforcement learners, typically employs weak learning methods to directly modify an agent's execution knowledge. These systems are robust in dynamic and complex environments but generally do not support planning or the pursuit of multiple goals. In contrast, symbolic theory revision systems learn declarative planning knowledge that allows them to pursue multiple goals in large state spaces, but these approaches are generally applicable only to fully sensed, deterministic environments with no exogenous events. This research investigates the hypothesis that by limiting an agent to procedural access to symbolic planning knowledge, the agent can combine the powerful, knowledge-intensive learning of theory revision systems with the robustness of reinforcement learners in complex environments. The system, IMPROV, uses an expressive knowledge representation so that it can learn complex actions that produce conditional or sequential effects over time. By developing learning methods that require only limited procedural access to the agent's knowledge, IMPROV's learning remains tractable as the agent's knowledge is scaled to large problems. IMPROV learns to correct operator precondition and effect knowledge in complex environments that include noise, multiple agents, and time-critical tasks, and it demonstrates a general learning method that can easily be strengthened through the addition of many different kinds of knowledge.

5.
We consider an autonomous agent facing a stochastic, partially observable, multiagent environment. To compute an optimal plan, the agent must accurately predict the actions of the other agents, since they influence the state of the environment and ultimately the agent's utility. To do so, we propose a special case of interactive partially observable Markov decision process in which the agent does not explicitly model the other agents' beliefs and preferences, but instead represents them as stochastic processes implemented by probabilistic deterministic finite state controllers (PDFCs). The agent maintains a probability distribution over the PDFC models of the other agents and updates this belief using Bayesian inference. Since the number of nodes of these PDFCs is unknown and unbounded, the agent places a Bayesian nonparametric prior distribution over the infinite-dimensional set of PDFCs. This allows the size of the learned models to adapt to the complexity of the observed behavior. The posterior distribution is too complex to compute analytically, so we provide a Markov chain Monte Carlo algorithm that approximates the posterior beliefs over the other agents' PDFCs given a sequence of (possibly imperfect) observations of their behavior. Experimental results show that the learned models converge behaviorally to the true ones. We consider two settings: one in which the agent first learns and then interacts with the other agents, and one in which learning and planning are interleaved. We show that the agent's performance increases as a result of learning in both situations. Moreover, we analyze the dynamics that ensue when two agents simultaneously learn about each other while interacting, showing in an example environment that coordination emerges naturally from our approach. Finally, we demonstrate how an agent can exploit the learned models to perform indirect inference over the state of the environment via the modeled agent's actions.
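The PDFC representation itself is simple to sketch (a minimal generative version; the nonparametric prior and the MCMC inference are not shown here):

```python
import numpy as np

class PDFC:
    """Probabilistic deterministic finite-state controller: transitions
    between nodes are deterministic given the observation, while the action
    emitted at each node is stochastic."""
    def __init__(self, transition, emission, start=0, rng=None):
        self.T = transition            # T[node][obs] -> next node (deterministic)
        self.E = np.asarray(emission)  # E[node, action] = P(action | node)
        self.node = start
        self.rng = rng or np.random.default_rng()

    def act(self) -> int:
        # Sample an action from the current node's emission distribution.
        return int(self.rng.choice(len(self.E[self.node]), p=self.E[self.node]))

    def observe(self, obs: int) -> None:
        # Deterministic node transition on the received observation.
        self.node = self.T[self.node][obs]

# A two-node controller that tends to repeat action 0 until it observes 1.
ctrl = PDFC(transition=[[0, 1], [1, 0]],
            emission=[[0.9, 0.1], [0.2, 0.8]])
action = ctrl.act()
ctrl.observe(1)
```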

6.
Agents' flexibility and autonomy, as well as their capacity to coordinate and cooperate, are among the features that make multiagent systems useful for working in dynamic and distributed environments. These key features are directly related to the way agents communicate and perceive each other, their environment, and the surrounding conditions. Traditionally, this has been accomplished by message exchange or by blackboard systems. These traditional methods have the advantage of being easy to implement and well supported by multiagent platforms; their main disadvantage, however, is that the amount of social knowledge in the system depends directly on every agent actively reporting what it is doing, thinking, perceiving, and so on. In some domains, for example those where social knowledge depends on highly distributed pieces of data provided by many different agents, such methods can produce a great deal of overhead, reducing the scalability, efficiency, and flexibility of the multiagent system. This work proposes event tracing in multiagent systems as an indirect interaction and coordination mechanism to improve the amount and quality of the information that agents can perceive from both their physical and social environment, so that they can fulfill their goals more efficiently. To that end, this work presents an abstract model of a tracing system and an architectural design of that model, which can be incorporated into a typical multiagent platform.
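As an illustration of the indirect-interaction idea, here is a minimal event-trace bus (illustrative API, not the paper's abstract model): agents leave traces as a side effect of acting, and interested agents perceive them by subscription rather than by being messaged directly.

```python
from collections import defaultdict

class TraceBus:
    """Indirect coordination via event tracing: publishers do not know or
    address their observers, and observers pull only the event types they
    care about, avoiding per-pair message overhead."""
    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, event_type, callback):
        self.subscribers[event_type].append(callback)

    def publish(self, event_type, source, payload):
        # Every agent that registered interest perceives this trace event.
        for cb in self.subscribers[event_type]:
            cb(source, payload)

bus = TraceBus()
bus.subscribe("moved", lambda src, p: print(f"{src} moved to {p}"))
bus.publish("moved", "agent-1", (3, 4))   # observers perceive the trace
```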

7.
Computer science in general, and artificial intelligence and multiagent systems in particular, are part of the effort to build intelligent transportation systems. Efficient use of the existing infrastructure relates closely to multiagent systems, as many problems in traffic management and control are inherently distributed. In particular, the traffic signal controllers located at intersections can be seen as autonomous agents. However, this kind of modeling raises challenging issues: the number of agents is high; agents generally must be highly adaptive; and they must react to changes in the environment at the individual level while also producing an unpredictable collective pattern, since they act in a highly coupled environment. Traffic signal control therefore poses many challenges for standard multiagent techniques such as learning. Despite progress in multiagent reinforcement learning via formalisms based on stochastic games, these cannot cope with a high number of agents due to the combinatorial explosion in the number of joint actions. One way to reduce the complexity of the problem is to organize agents into groups of limited size, so that the number of joint actions is reduced; these groups are then coordinated by another agent, a tutor or supervisor. This paper thus investigates multiagent reinforcement learning for the control of traffic signals in two situations: agents act individually (individual learners), or agents can be “tutored”, meaning that another agent with a broader view recommends a joint action.
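The tutored setting can be sketched as follows (a simplification under assumed names; `joint_value` stands in for the supervisor's learned estimate of group value, which is not specified in the abstract): individual learners act epsilon-greedily on their own Q-tables, while tutored agents adopt a joint action recommended over a small group.

```python
import itertools
import random
import numpy as np

class SignalAgent:
    """Independent Q-learner for one intersection (actions = signal plans)."""
    def __init__(self, n_states, n_actions, eps=0.1, lr=0.1, gamma=0.95):
        self.q = np.zeros((n_states, n_actions))
        self.eps, self.lr, self.gamma = eps, lr, gamma

    def choose(self, s, recommended=None):
        # A tutored agent adopts the supervisor's recommendation; an
        # individual learner falls back to epsilon-greedy on its own table.
        if recommended is not None:
            return recommended
        if random.random() < self.eps:
            return random.randrange(self.q.shape[1])
        return int(self.q[s].argmax())

    def update(self, s, a, r, s2):
        target = r + self.gamma * self.q[s2].max()
        self.q[s, a] += self.lr * (target - self.q[s, a])

def supervisor_recommend(agents, states, joint_value):
    """Pick the best joint action over a small group; limiting group size
    keeps the joint-action space from exploding combinatorially."""
    n_actions = agents[0].q.shape[1]
    joint_actions = itertools.product(range(n_actions), repeat=len(agents))
    return max(joint_actions, key=lambda ja: joint_value(states, ja))
```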

8.
Open multi-agent systems (MAS) are decentralised, distributed systems consisting of a large number of loosely coupled autonomous agents. In the absence of centralised control they tend to be difficult to manage, especially in an open environment, which is dynamic, complex, distributed, and unpredictable. This dynamism and uncertainty give rise to unexpected plan failures. In this paper we present an abstract-knowledge-based approach for the diagnosis and recovery of plan action failures. Our approach associates a sentinel agent with each problem solving agent in order to monitor the problem solving agent's interactions; it also requires the problem solving agents to be able to report on the status of a plan's actions. Once an exception is detected, the sentinel agents start an investigation of the suspected agents. They collect information about the status of failed abstract plan actions and knowledge about the agents' mental attitudes regarding any failed plan. The sentinel agent then uses this abstract knowledge and the agents' mental attitudes to diagnose the underlying cause of the plan failure, and may ask the problem solving agent to retry its failed plan based on the diagnostic result.

9.
Distributed intelligent architecture for logistics (DIAL)
An ideal logistics problem can be considered a network flow problem that generates a logistics plan and subsequently executes it. A real-world logistics plan differs from this ideal counterpart in that each node of the logistics graph is operated independently, with disparate objectives. In contrast to the nodes of a network flow problem, agents are software entities that embody the reasoning ability to justify their own actions toward individual objectives and to interact with other agents. Hence, a group of agents, that is, a multiagent system, is well suited to solving real-world logistics problems, with each agent representing a node of the graph. We have built a three-tier framework in which a customer's problem can be decomposed and assigned to the agents, which together generate a logistics plan. We employ two simulation software packages as planning tools, which enable us to simulate the appropriate events. The key ideas behind this paper are large-scale multiagent architectural modeling issues (scalability), computation task control, information sharing among several customers, and a problem solving procedure that precedes the planning process; this procedure determines the computational tasks that must be invoked to initiate planning. We describe the implementation of the framework.

10.
We present a flexible initial framework for defining self-motivated, self-aware agents in simulated worlds that plan continuously so as to maximize long-term rewards. While such agents employ reasoned exploration of feasible sequences of actions and corresponding states, they also behave opportunistically and recover from failure, thanks to their continual plan updates and quest for rewards. Our framework allows for both specific and general (quantified) knowledge and for epistemic predicates such as knowing-that and knowing-whether. Because realistic agents have only partial knowledge of their world, the reasoning of the proposed agents uses a weakened closed-world assumption; this has consequences for epistemic reasoning, in particular introspection. The planning operators allow for quantitative, gradual change and side effects such as the passage of time, changes in distances and rewards, and language production, using a uniform procedural attachment method. Question answering (involving introspection) and experimental runs are shown for our particular agent ME in a simple world, demonstrating the value of continual deliberate, reward-driven planning. Though the primary merit of agents definable in our framework is that they combine all of the aforementioned features, they can also be configured as single- or multiple-goal-seeking agents and as such perform comparably with some recent experimental agents.

11.
Planning algorithms are often applied by intelligent agents to achieve their goals. To create a plan, this kind of algorithm uses only an initial state definition, a set of actions, and a goal, while agents also have preferences and desires that should be taken into account. Agents therefore need to spend time analyzing each plan returned by these algorithms to find one that satisfies their preferences. In this context, we have studied an alternative in which a classical planner is modified to accept a new conceptual parameter for plan creation: an agent mental state composed of preferences and constraints. We present a planning algorithm that extends a partial-order algorithm to deal with the agent's preferences, so that it builds a plan that is adequate in terms of the agent's mental state. We introduce this algorithm and present experimental results showing the advantages of this adaptation.
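A minimal sketch of how a mental-state parameter could bias plan selection (hypothetical representation; the paper integrates preferences into partial-order plan refinement itself rather than filtering finished plans as done here):

```python
from dataclasses import dataclass, field

@dataclass
class MentalState:
    """Agent preferences and constraints handed to the planner."""
    preferred: set = field(default_factory=set)   # actions the agent favors
    forbidden: set = field(default_factory=set)   # hard constraints

def admissible(plan_actions, ms: MentalState) -> bool:
    # A plan violating any hard constraint is rejected outright.
    return not (set(plan_actions) & ms.forbidden)

def preference_score(plan_actions, ms: MentalState) -> int:
    return sum(1 for a in plan_actions if a in ms.preferred)

def best_plan(candidate_plans, ms: MentalState):
    """Among constraint-respecting plans, return the one satisfying the
    most preferences; a preference-aware POP would bias its search toward
    such plans during refinement instead of ranking them afterwards."""
    feasible = [p for p in candidate_plans if admissible(p, ms)]
    return max(feasible, key=lambda p: preference_score(p, ms), default=None)
```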

12.
One shortcoming of most AI planning systems has been an inability to deal with execution-time discrepancies between actual and expected situations. Often these exception situations jeopardize the immediate integrity and safety of the planning agent or its surroundings, with the only recourse being more time-consuming plan generation. To avoid such situations, potential exceptions must be predicted during plan execution. Since many application domains (particularly for autonomous systems) are inherently dynamic, in the sense that information is at best incomplete, perhaps erroneous, and changes over time independently of the planning agent's actions, managing action in the world becomes a difficult problem. Actions and events in dynamic worlds must be monitored in order to coordinate an agent's actions with its surroundings. This allows the agent to predict and plan for potential future exception situations while acting in the present. This paper introduces an approach to autonomous reaction in dynamic environments. We avoid the traditional distinction between generating and then executing plans through the use of a dynamic reaction system, which handles potential exception situations gracefully as it carries out assigned tasks. The reaction system manages constraints imposed by ongoing activity in the world, as well as those derived from long-term planning, to control observable behaviour. This approach provides the stimulus/response behaviour required in dynamic situations while using goal-directed constraints as heuristics for improved reactions. We present an overview of the salient features of dynamic worlds and their impact on traditional planning, introduce our model of dynamic reactivity, describe an implementation of the model and its performance in a dynamic simulation environment, and present an architecture that combines long-term planning with short-term reactivity, suitable for autonomous systems applications.

13.
This paper proposes a path planning technique for autonomous agents located in an unstructured, networked, distributed environment, where each agent has limited, incomplete knowledge of the environment. Each agent knows only what is available in the distributed memory of the computing node it is running on, and the agents share some information learned over the distributed network. In particular, the environment is divided into several sectors, with each sector located on a single separate distributed computing node. We consider hybrid reactive-cognitive agents whose motion planning is based on a potential field model combined with reinforcement learning and boundary detection algorithms. Potential fields provide fast convergence toward a path in a distributed environment, while reinforcement learning guarantees a variety of behavior and consistent convergence. We show how the agent's decision making is enhanced by the combination of the two techniques in a distributed environment. Furthermore, path retracing is a challenging problem in a distributed environment, since the agent does not have complete knowledge of the environment. We propose a backtracking technique that keeps the distributed agent informed at all times of its path information and step count, including when it migrates from one node to another. Note that no node has knowledge of the entire global path from a source to a goal when the goal resides on a separate node; each agent knows only the partial path internal to a node and the number of steps it took while running on that node. In particular, we show how each agent, starting in one of the many sectors with no initial knowledge of the environment, develops its intelligence from experience using the proposed distributed technique and seamlessly discovers the shortest global path to the target, which is located on a different node, while avoiding any obstacles it encounters, including when transitioning and migrating between distributed computing nodes. The agents use a multiple-token-ring message passing interface (MPI) for internode communication. Finally, the experimental results show that single and multiple agents sharing the same goal and running on the same or different nodes successfully coordinate the sharing of their respective environment states and information to perform their tasks collaboratively. The results also show that information-sharing multiagent systems converge to the optimal shortest path an order of magnitude faster than a single agent or a non-sharing multiagent system.
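A standard attractive/repulsive potential field of the kind the abstract refers to can be sketched as follows (a generic formulation with assumed gain constants, not the paper's exact field; the reinforcement learning and boundary detection components are omitted):

```python
import math

def potential(pos, goal, obstacles, k_att=1.0, k_rep=100.0, d0=3.0):
    """Field value at a grid cell: quadratic pull toward the goal plus a
    short-range push away from each obstacle within influence radius d0."""
    u = 0.5 * k_att * math.dist(pos, goal) ** 2
    for obs in obstacles:
        d = math.dist(pos, obs)
        if 0 < d < d0:
            u += 0.5 * k_rep * (1.0 / d - 1.0 / d0) ** 2
    return u

def greedy_step(pos, goal, obstacles):
    """Move to the 4-neighbor with the lowest potential (descent on the
    field); in the hybrid scheme, learned values would break ties and
    help escape local minima."""
    x, y = pos
    neighbors = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    return min(neighbors, key=lambda p: potential(p, goal, obstacles))
```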

14.
Petri net-based modeling of real-time multiagent systems
This paper presents a Petri net-based method for modeling real-time multi-agent systems. Using Petri nets, it builds a real-time agent model composed of interface, goal, planning, scheduling, knowledge base, environment, internal, and control modules, giving an abstract and clear description of the internal and external characteristics of a real-time agent.
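As a toy illustration of expressing one of these modules as a Petri net, here is a minimal place/transition net (illustrative names; the paper's actual net structure is not given in the abstract) in which a planning transition fires once its input places are marked:

```python
class PetriNet:
    """Minimal place/transition net with integer token markings."""
    def __init__(self, marking):
        self.marking = dict(marking)   # place -> token count
        self.transitions = {}          # name -> (input places, output places)

    def add_transition(self, name, inputs, outputs):
        self.transitions[name] = (inputs, outputs)

    def enabled(self, name):
        inputs, _ = self.transitions[name]
        return all(self.marking.get(p, 0) > 0 for p in inputs)

    def fire(self, name):
        inputs, outputs = self.transitions[name]
        assert self.enabled(name), f"{name} is not enabled"
        for p in inputs:
            self.marking[p] -= 1       # consume one token per input place
        for p in outputs:
            self.marking[p] = self.marking.get(p, 0) + 1

# A planning module fires when a goal is posted and the knowledge base is ready.
net = PetriNet({"goal_posted": 1, "kb_ready": 1})
net.add_transition("plan", ["goal_posted", "kb_ready"], ["plan_ready"])
net.fire("plan")   # consumes its inputs, emits a plan_ready token
```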

15.
This research presents an optimization technique for route planning and exploration in unknown environments. It employs a hybrid architecture that implements detection, avoidance, and planning using autonomous agents with coordination capabilities. When these agents work toward a common objective, they require a robust information interchange module for coordination; they cannot achieve the goal working independently, and the coordination module enhances their performance and efficiency. Multi-agent systems can be employed for searching for items in unknown environments; searching for unexploded ordnance such as land mines is an important application where they are particularly well suited. The hybrid architecture incorporates the Learning Real-Time A* (LRTA*) algorithm for route planning and compares it with the A* search algorithm. LRTA* shows better results in the multi-agent environment and proves to be an efficient and robust algorithm. A simulated ant agent system is also presented for route planning and optimization, and it proves efficient and robust for large and complex environments.
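For reference, one LRTA* step can be sketched as follows (a generic single-agent formulation; the paper's multi-agent coordination layer is omitted):

```python
def lrta_star_step(state, h, neighbors, cost, goal):
    """One Learning Real-Time A* move: raise h(state) to the best one-step
    lookahead value, then move to the minimizing neighbor. Over repeated
    trials h converges, so routes stop degrading; this is what makes LRTA*
    attractive for repeated exploration of unknown environments."""
    if state == goal:
        return state
    best = min(neighbors(state),
               key=lambda s2: cost(state, s2) + h.get(s2, 0.0))
    h[state] = max(h.get(state, 0.0), cost(state, best) + h.get(best, 0.0))
    return best

# Toy 1-D corridor: states 0..5, goal at 5, unit step costs.
h = {}
neighbors = lambda s: [x for x in (s - 1, s + 1) if 0 <= x <= 5]
cost = lambda a, b: 1.0
s = 0
while s != 5:
    s = lrta_star_step(s, h, neighbors, cost, 5)
```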

16.
Communication is an important resource for multiagent coordination. Interactive Dynamic Influence Diagrams (I-DIDs) have been used extensively in multiagent planning under uncertainty, and they are recognized graphical representations of Interactive Partially Observable Markov Decision Processes (I-POMDPs). We establish a communication model among multiple agents based on the I-DID framework. We use the AND-communication method, assuming a separate communication phase and action phase in each step rather than replacing domain actions, so that communication facilitates better domain-action selection. We use a synchronized communication type: when an agent initiates communication, all of the agent's teammates synchronize to share their recent observations. We give a general algorithm that computes the communication decision from a single-agent perspective by comparing expected rewards with and without communication. Finally, we use the multiagent “tiger” and “concert” problems to validate the model's effectiveness.
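A minimal sketch of that decision rule (hypothetical function names; the I-DID policy evaluation that would supply the expected rewards is not reproduced here):

```python
def should_communicate(expected_reward, belief, comm_cost: float) -> bool:
    """Single-agent communication decision: initiate the synchronized
    observation exchange only when its expected benefit beats its cost.
    `expected_reward(belief, communicate)` stands in for evaluating the
    policy with and without the communication phase."""
    gain = (expected_reward(belief, communicate=True)
            - expected_reward(belief, communicate=False))
    return gain > comm_cost

# Toy stand-in: communication is worth more the less certain the agent is.
ev = lambda belief, communicate: 1.0 if communicate else max(belief)
print(should_communicate(ev, belief=[0.5, 0.5], comm_cost=0.3))  # True
```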

17.
A multiagent genetic algorithm for global numerical optimization
In this paper, multiagent systems and genetic algorithms are integrated to form a new algorithm, the multiagent genetic algorithm (MAGA), for solving global numerical optimization problems. An agent in MAGA represents a candidate solution to the optimization problem at hand. All agents live in a lattice-like environment, each fixed on a lattice point. In order to increase their energies, they compete or cooperate with their neighbors, and they can also use knowledge. Making use of these agent-agent interactions, MAGA minimizes the objective function value. Theoretical analyses show that MAGA converges to the global optimum. In the first part of the experiments, ten benchmark functions are used to test the performance of MAGA, and the scalability of MAGA with problem dimension is studied with great care. The results show that MAGA performs well as the dimension increases from 20 to 10,000; even at 10,000 dimensions, MAGA can still find high-quality solutions at a low computational cost. MAGA therefore has good scalability and is a competent algorithm for high-dimensional optimization problems. To the best of our knowledge, no researchers have previously optimized functions with 10,000 dimensions by means of evolution. In the second part of the experiments, MAGA is applied to a practical case, the approximation of linear systems, with satisfactory results.
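The neighborhood competition at the heart of such a lattice GA can be sketched as follows (assumed replacement and mutation scheme; MAGA's cooperation and self-learning operators are omitted):

```python
import numpy as np

def neighborhood_competition(lattice, fitness, rng=None):
    """One sweep over an L x L lattice of candidate solutions: an agent
    that loses to its best von Neumann neighbor is replaced by a mutated
    copy of that neighbor. Lower objective value wins here."""
    rng = rng or np.random.default_rng()
    L, _, dim = lattice.shape
    new = lattice.copy()
    for i in range(L):
        for j in range(L):
            nbrs = [lattice[(i - 1) % L, j], lattice[(i + 1) % L, j],
                    lattice[i, (j - 1) % L], lattice[i, (j + 1) % L]]
            best = min(nbrs, key=fitness)
            if fitness(lattice[i, j]) > fitness(best):
                # The winner occupies the loser's lattice point, perturbed.
                new[i, j] = best + rng.normal(0.0, 0.1, dim)
    return new
```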

18.
Multiagent based differential evolution approach to optimal power flow
This paper proposes a new differential evolution approach, multiagent-based differential evolution (MADE), for solving the optimal power flow problem with non-smooth and non-convex generator fuel cost curves. The method integrates multiagent systems (MAS) with the differential evolution (DE) algorithm. An agent in MADE represents both an individual in DE and a candidate solution to the optimization problem. All agents live in a lattice-like environment, each fixed on a lattice point. To obtain an optimal solution quickly, each agent competes and cooperates with its neighbors and can also use knowledge. Making use of these agent-agent interactions and the DE mechanism, MADE minimizes the value of the objective function. Applied to optimal power flow, MADE is evaluated on a 6-bus system and the IEEE 30-bus system with different generator characteristics. Simulation results show that the proposed method converges to better solutions much faster than earlier reported approaches.
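A lattice-structured DE sweep in the spirit of MADE can be sketched as follows (DE/rand/1 with binomial crossover drawn from lattice neighbors; the paper's exact operators and the power flow objective are not reproduced):

```python
import numpy as np

def made_step(lattice, fitness, F=0.5, CR=0.9, rng=None):
    """One sweep of lattice-structured differential evolution: each agent
    builds a trial vector from its lattice neighbors instead of the whole
    population, and keeps it only if it improves (greedy selection)."""
    rng = rng or np.random.default_rng()
    L, _, dim = lattice.shape
    new = lattice.copy()
    for i in range(L):
        for j in range(L):
            a = lattice[(i - 1) % L, j]
            b = lattice[(i + 1) % L, j]
            c = lattice[i, (j - 1) % L]
            mutant = a + F * (b - c)                    # DE mutation
            cross = rng.random(dim) < CR                # binomial crossover
            cross[rng.integers(dim)] = True             # keep >=1 mutant gene
            trial = np.where(cross, mutant, lattice[i, j])
            if fitness(trial) < fitness(lattice[i, j]):
                new[i, j] = trial
    return new
```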

19.
Collaborative privacy-preserving planning (CPPP) is a multi-agent planning task in which agents need to achieve a common set of goals without revealing certain private information. In many CPPP algorithms, the individual agents reason about a projection of the multi-agent problem onto a single-agent classical planning problem. For example, an agent can plan as if it controls the public actions of other agents, ignoring any private preconditions and effects these actions may have, and use the cost of this plan as a heuristic estimate of the cost of the full multi-agent plan. Such a projection, however, ignores some dependencies between agents' public actions; in particular, it omits dependencies between the public actions of other agents that are caused by their private facts. We propose a projection in which these private dependencies are maintained. The benefit of our dependency-preserving projection is demonstrated by using it to produce high-level plans in a new privacy-preserving planner, and as a heuristic for guiding forward-search privacy-preserving algorithms. Both are able to solve more benchmark problems than any other state-of-the-art privacy-preserving planner. This more informed projection does not explicitly expose any private fact, action, or precondition. In addition, we show that even if an adversary agent knows that an agent has some private objects of a given type (e.g., trucks), it cannot infer the number of such private objects the agent controls. This introduces a novel form of strong privacy, which we call object-cardinality privacy, that is motivated by real-world requirements.

20.
One problem in the design of multi-agent systems is the difficulty of predicting the situations an agent might face and of recognizing and predicting its optimal behavior in those situations. An important characteristic of an agent is therefore its ability to adapt, to learn, and to correct its behavior. Given the continuously changing environment, the agents' back-and-forth learning, the inability to observe other agents' actions first hand, and the strategies the agents choose, learning in a multi-agent environment can be very complex. Current learning models are designed for deterministic, linearly behaving environments and are unproductive in complex environments where the agents' actions are stochastic; learning models that are effective in stochastic environments are therefore needed. The purpose of this research is to create such a learning model, using the Hopfield and Boltzmann learning algorithms. To demonstrate the performance of these algorithms, an unlearned multi-agent model is first created, in which the agents interact and try to increase their knowledge until it reaches a specified value; the measured index is the number of state changes needed to reach convergence. A learned multi-agent model is then created with the Hopfield learning algorithm, and finally a learned multi-agent model is created with the Boltzmann learning algorithm. The results show that introducing learning into the multi-agent environment decreases the average number of state changes needed to reach convergence, and that the Boltzmann learning algorithm decreases this average even further compared with the Hopfield algorithm, owing to the larger number of choices available in each situation. Thus, the more closely these stochastically behaving multi-agent systems follow their true character, the faster they reach the global solution.
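The abstract contrasts the two update rules only verbally; the following sketch (a generic formulation with ±1 units and a weight matrix W, not the paper's exact model) shows the core difference: the Hopfield update is deterministic, the Boltzmann update stochastic.

```python
import numpy as np

def hopfield_update(state, W, i):
    """Deterministic Hopfield unit update: set unit i to the sign of its
    net input. With symmetric weights this never increases the network
    energy, so the system settles into a (possibly local) minimum."""
    net = W[i] @ state
    return 1.0 if net >= 0 else -1.0

def boltzmann_update(state, W, i, T, rng=None):
    """Stochastic Boltzmann unit update: unit i turns on with a sigmoid
    probability of its net input at temperature T, so uphill moves stay
    possible and local minima can be escaped. This extra stochastic choice
    is consistent with the reported faster convergence of the Boltzmann
    variant in stochastic multi-agent settings."""
    rng = rng or np.random.default_rng()
    p_on = 1.0 / (1.0 + np.exp(-(W[i] @ state) / T))
    return 1.0 if rng.random() < p_on else -1.0
```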
