首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
In contingent planning problems, agents have partial information about their state and use sensing actions to learn the value of some variables. When sensing and actuation are separated, plans for such problems can often be viewed as a tree of sensing actions, separated by conformant plans consisting of non-sensing actions that enable the execution of the next sensing action. We propose a heuristic, online method for contingent planning which focuses on identifying the next useful sensing action. We select the next sensing action based on a landmark heuristic, adapted from classical planning. We discuss landmarks for plan trees, providing several alternative definitions and discussing their merits. The key part of our planner is the novel landmarks-based heuristic, together with a projection method that uses classical planning to solve the intermediate conformant planning problems. The resulting heuristic contingent planner solves many more problems than state-of-the-art, translation-based online contingent planners, and in most cases, much faster, up to 3 times faster on simple problems, and 200 times faster on non-simple domains.  相似文献   

2.
This paper proposes FMAP (Forward Multi-Agent Planning), a fully-distributed multi-agent planning method that integrates planning and coordination. Although FMAP is specifically aimed at solving problems that require cooperation among agents, the flexibility of the domain-independent planning model allows FMAP to tackle multi-agent planning tasks of any type. In FMAP, agents jointly explore the plan space by building up refinement plans through a complete and flexible forward-chaining partial-order planner. The search is guided by h D T G , a novel heuristic function that is based on the concepts of Domain Transition Graph and frontier state and is optimized to evaluate plans in distributed environments. Agents in FMAP apply an advanced privacy model that allows them to adequately keep private information while communicating only the data of the refinement plans that is relevant to each of the participating agents. Experimental results show that FMAP is a general-purpose approach that efficiently solves tightly-coupled domains that have specialized agents and cooperative goals as well as loosely-coupled problems. Specifically, the empirical evaluation shows that FMAP outperforms current MAP systems at solving complex planning tasks that are adapted from the International Planning Competition benchmarks.  相似文献   

3.
Real-time motion planning and control for groups of heterogeneous and under-actuated robots subject to disturbances and uncertainties in cluttered constrained environments is the key problem addressed in this paper. Here we present the Multi-agent Rapidly-exploring Pseudo-random Tree (MRPT), a novel technique based on a classical Probabilistic Road Map (PRM) algorithm for application in robot team cooperation. Our main contribution lies in the proposal of an extension of a probabilistic approach to be used as a deterministic planner in distributed complex multi-agent systems, keeping the main advantages of PRM strategies like simplicity, fast convergence, and probabilistic completeness. Our methodology is fully distributed, addressing missions with multi-robot teams represented by high nonlinear models and a great number of Degrees of Freedom (DoFs), endowing each agent with the ability of coordinating its own movement with other agents while avoiding collisions with obstacles. The inference of the entire team’s behavior at each time instant by each individual agent is the main improvement of our method. This scheme, which is behavioral in nature, also makes the system less susceptible to failures due to intensive traffic communication among robots. We evaluate the time complexity of our method and show its applicability in planning and executing search and rescue missions for a group of robots in S E3 outdoor scenarios and present both simulated and real-world results.  相似文献   

4.
In this paper we propose a machine learning technique for real-time robot path planning for an autonomous robot in a planar environment with obstacles where the robot possess no a priori map of its environment. Our main insight in this paper is that a robot’s path planning times can be significantly reduced if it can refer to previous maneuvers it used to avoid obstacles during earlier missions, and adapt that information to avoid obstacles during its current navigation. We propose an online path planning algorithm called LearnerRRT that utilizes a pattern matching technique called Sample Consensus Initial Alignment (SAC-IA) in combination with an experience-based learning technique to adapt obstacle boundary patterns encountered in previous environments to the current scenario followed by corresponding adaptations in the obstacle-avoidance paths. Our proposed algorithm LearnerRRT works as a learning-based reactive path planning technique which enables robots to improve their overall path planning performance by locally improving maneuvers around commonly encountered obstacle patterns by accessing previously accumulated environmental information. We have conducted several experiments in simulations and hardware to verify the performance of the LearnerRRT algorithm and compared it with a state-of-the-art sampling-based planner. LearnerRRT on average takes approximately 10% of the planning time and 14% of the total time taken by the sampling-based planner to solve the same navigation task based on simulation results and takes only 33% of the planning time, 46% of total time and 95% of total distance compared to the sampling-based planner based on our hardware results.  相似文献   

5.
In many cases several entities, such as commercial companies, need to work together towards the achievement of joint goals, while hiding certain private information. To collaborate effectively, some sort of plan is needed to coordinate the different entities. We address the problem of automatically generating such a coordination plan while preserving the agents’ privacy. Maintaining privacy is challenging when planning for multiple agents, especially when tight collaboration is needed and a global high-level view of the plan is required. In this work we present the Greedy Privacy-Preserving Planner (GPPP), a privacy preserving planning algorithm in which the agents collaboratively generate an abstract and approximate global coordination plan and then individually extend the global plan to executable plans. To guide GPPP, we propose two domain independent privacy preserving heuristics based on landmarks and pattern databases, which are classical heuristics for single agent search. These heuristics, called privacy-preserving landmarks and privacy preserving PDBs, are agnostic to the planning algorithm and can be used by other privacy-preserving planning algorithms. Empirically, we demonstrate on benchmark domains the benefits of using these heuristics and the advantage of GPPP over existing privacy preserving planners for the multi-agent STRIPS formalism.  相似文献   

6.
7.
Within the area of multi-agent systems, normative systems are a widely used framework for the coordination of interdependent activities. A crucial problem associated with normative systems is that of synthesising norms that will effectively accomplish a coordination task and that the agents will comply with. Many works in the literature focus on the on-line synthesis of a single, evolutionarily stable norm (convention) whose compliance forms a rational choice for the agents and that effectively coordinates them in one particular coordination situation that needs to be identified and modelled as a game in advance. In this work, we introduce a framework for the automatic off-line synthesis of evolutionarily stable normative systems that coordinate the agents in multiple interdependent coordination situations that cannot be easily identified in advance nor resolved separately. Our framework roots in evolutionary game theory. It considers multi-agent systems in which the potential conflict situations can be automatically enumerated by employing MAS simulations along with basic domain information. Our framework simulates an evolutionary process whereby successful norms prosper and spread within the agent population, while unsuccessful norms are discarded. The outputs of such a natural selection process are sets of codependent norms that, together, effectively coordinate the agents in multiple interdependent situations and are evolutionarily stable. We empirically show the effectiveness of our approach through empirical evaluation in a simulated traffic domain.  相似文献   

8.
Coalition logic (CL) is one of the most influential logical formalisms for strategic abilities of multi-agent systems. CL can specify what a group of agents can achieve through choices of their actions, denoted by [C]? to state that a group of agents C can have a strategy to bring about ? by collective actions, no matter what the other agents do. However, CL lacks the temporal dimension and thus can not capture the dynamic aspects of a system. Therefore, CL can not formalize the evolvement of rational mental attitudes of the agents such as knowledge, which has been shown to be very useful in specifications and verifications of distributed systems, and has received substantial amount of studies. In this paper, we introduce coalition logic of temporal knowledge (CLTK), by incorporating a temporal logic of knowledge (Halpern and Vardi’s logic of CKL n ) into CL to equip CL with the power to formalize how agents’ knowledge (individual or group knowledge) evolves over the time by coalitional forces and the temporal properties of strategic abilities as well. Furthermore, we provide an axiomatic system for CLTK and prove that it is sound and complete, along with the complexity of the satisfiability problem which is shown to be EXPTIME-complete.  相似文献   

9.
This paper studies a multi-agent scheduling problem on two identical parallel machines. There are g agents, and each agent’s objective is to minimize its makespan. We present an approximation algorithm such that the performance ratio of the makespan achieved by our algorithm relative to the minimum makespan is no more than \(i+\frac{1}{6}\) for the ith \((i=1,2,\ldots ,g)\) completed agent. Moreover, we show that the performance ratio is tight.  相似文献   

10.
Various organizations collect data about individuals for various reasons, such as service improvement. In order to mine the collected data for useful information, data publishing has become a common practice among those organizations and data analysts, research institutes, or simply the general public. The quality of published data significantly affects the accuracy of the data analysis and thus affects decision making at the corporate level. In this study, we explore the research area of privacy-preserving data publishing, i.e., publishing high-quality data without compromising the privacy of the individuals whose data are being published. Syntactic privacy models, such as k-anonymity, impose syntactic privacy requirements and make certain assumptions about an adversary’s background knowledge. To address this shortcoming, we adopt differential privacy, a rigorous privacy model that is independent of any adversary’s knowledge and insensitive to the underlying data. The published data should preserve individuals’ privacy, yet remain useful for analysis. To maintain data utility, we propose DiffMulti, a workload-aware and differentially private algorithm that employs multidimensional generalization. We devise an efficient implementation to the proposed algorithm and use a real-life data set for experimental analysis. We evaluate the performance of our method in terms of data utility, efficiency, and scalability. When compared to closely related existing methods, DiffMulti significantly improved data utility, in some cases, by orders of magnitude.  相似文献   

11.
In agents that operate in environments where decision-making needs to take into account, not only the environment, but also the minimizing actions of an opponent (as in games), it is fundamental that the agent is endowed with the ability of progressively tracing the profile of its adversaries, in such a manner that this profile aids in the process of selecting appropriate actions. However, it would be unsuitable to construct an agent with a decision-making system based only on the elaboration of such a profile, as this would prevent the agent from having its “own identity,” which would leave the agent at the mercy of its opponent. Following this direction, this study proposes an automatic Checkers player, called ACE-RL-Checkers, equipped with a dynamic decision-making module, which adapts to the profile of the opponent over the course of the game. In such a system, the action selection process is conducted through a composition of multilayer perceptron neural network and case library. In this case, the neural network represents the “identity” of the agent, i.e., it is an already trained static decision-making module. On the other hand, the case library represents the dynamic decision-making module of the agent, which is generated by the Automatic Case Elicitation technique. This technique has a pseudo-random exploratory behavior, which allows the dynamic decision-making of the agent to be directed either by the opponent’s game profile or randomly. In order to avoid a high occurrence of pseudo-random decision-making in the game initial phases—in which the agent counts on very little information about its opponent—this work proposes a new module based on sequential pattern mining for generating a base of experience rules extracted from human expert’s game records. This module will improve the agent’s move selection in the game initial phases. Experiments carried out in tournaments involving ACE-RL-Checkers and other agents correlated to this work, confirm the superiority of the dynamic architecture proposed herein.  相似文献   

12.
In the Multi-Agent Programming Contest 2017 the TUBDAI team of the Technische Universität Berlin is using the complex multi-agent scenario to evaluate the application of two frameworks from the field (multi-)robot systems. In particular the task-level decision-making and planning framework ROS Hybrid Behaviour Planner (RHBP) is used to implement the execution and decision-making for single agents. The RHBP framework builds on top of the framework Robot Operating System (ROS) that is used to implement the communication and scenario specific coordination strategy of the agents. The united team for the official contest is formed by volunteering students from a project course and their supervisors.  相似文献   

13.
In active perception tasks, an agent aims to select sensory actions that reduce its uncertainty about one or more hidden variables. For example, a mobile robot takes sensory actions to efficiently navigate in a new environment. While partially observable Markov decision processes (POMDPs) provide a natural model for such problems, reward functions that directly penalize uncertainty in the agent’s belief can remove the piecewise-linear and convex (PWLC) property of the value function required by most POMDP planners. Furthermore, as the number of sensors available to the agent grows, the computational cost of POMDP planning grows exponentially with it, making POMDP planning infeasible with traditional methods. In this article, we address a twofold challenge of modeling and planning for active perception tasks. We analyze \(\rho \)POMDP and POMDP-IR, two frameworks for modeling active perception tasks, that restore the PWLC property of the value function. We show the mathematical equivalence of these two frameworks by showing that given a \(\rho \)POMDP along with a policy, they can be reduced to a POMDP-IR and an equivalent policy (and vice-versa). We prove that the value function for the given \(\rho \)POMDP (and the given policy) and the reduced POMDP-IR (and the reduced policy) is the same. To efficiently plan for active perception tasks, we identify and exploit the independence properties of POMDP-IR to reduce the computational cost of solving POMDP-IR (and \(\rho \)POMDP). We propose greedy point-based value iteration (PBVI), a new POMDP planning method that uses greedy maximization to greatly improve scalability in the action space of an active perception POMDP. Furthermore, we show that, under certain conditions, including submodularity, the value function computed using greedy PBVI is guaranteed to have bounded error with respect to the optimal value function. We establish the conditions under which the value function of an active perception POMDP is guaranteed to be submodular. Finally, we present a detailed empirical analysis on a dataset collected from a multi-camera tracking system employed in a shopping mall. Our method achieves similar performance to existing methods but at a fraction of the computational cost leading to better scalability for solving active perception tasks.  相似文献   

14.
The subject of multi‐agent planning has been of continuing concern in Distributed Artificial Intelligence (DAI). In this paper, we suggest an approach to multi‐agent planning that contains heuristic elements. Our method makes use of subgoals, and derived sub‐plans, to construct a global plan. Agents solve their individual sub‐plans, which are then merged into a global plan. The suggested approach reduces overall planning time and derives a plan that approximates the optimal global plan that would have been derived by a central planner, given those original subgoals. We explore three different scenarios. The first involves a group of agents with a common goal. The second considers how agents can interleave planning and execution when planning towards a common, though dynamic, goal. The third examines the case where agents, each with their own goal, can plan together to reach a state in consensus for the group. Finally, we consider how these approaches can be adapted to handle rational, manipulative agents. This revised version was published online in June 2006 with corrections to the Cover Date.  相似文献   

15.
Abstract

Over the last years, the planning community has formalised several models and approaches to multi-agent (MA) propositional planning. One of the main motivations in MA planning is that some or all agents have private knowledge that cannot be communicated to other agents during the planning process and the plan execution. We argue that the existing models of the multi-agent planning task do not maintain the agents’ privacy when a (strict) subset of the involved agents share confidential knowledge, or when the identity/existence of at least one agent is confidential. In this paper, first we propose a model of the MA-planning tasks that preserves the privacy of the involved agents when this happens. Then we investigate an algorithm based on best first search for our model that uses some new heuristics providing a trade-off between accuracy and agents’ privacy. Finally, an experimental study compares the effectiveness of using the proposed heuristics.  相似文献   

16.
Stream processing applications continuously process large amounts of online streaming data in real time or near real time. They have strict latency constraints. However, the continuous processing makes them vulnerable to any failures, and the recoveries may slow down the entire processing pipeline and break latency constraints. The upstream backup scheme is one of the most widely applied fault-tolerant schemes for stream processing systems. It introduces complex backup dependencies to tasks, which increases the difficulty of controlling recovery latencies. Moreover, when dependent tasks are located on the same processor, they fail at the same time in processor-level failures, bringing extra recovery latencies that increase the impacts of failures. This paper studies the relationship between the task allocation and the recovery latency of a stream processing application. We present a correlated failure effect model to describe the recovery latency of a stream topology in processor-level failures under a task allocation plan. We introduce a recovery-latency aware task allocation problem (RTAP) that seeks task allocation plans for stream topologies that will achieve guaranteed recovery latencies. We discuss the difference between RTAP and classic task allocation problems and present a heuristic algorithm with a computational complexity of O(n log2 n) to solve the problem. Extensive experiments were conducted to verify the correctness and effectiveness of our approach. It improves the resource usage by 15%–20% on average.  相似文献   

17.
Some of the current best conformant probabilistic planners focus on finding a fixed length plan with maximal probability. While these approaches can find optimal solutions, they often do not scale for large problems or plan lengths. As has been shown in classical planning, heuristic search outperforms bounded length search (especially when an appropriate plan length is not given a priori). The problem with applying heuristic search in probabilistic planning is that effective heuristics are as yet lacking.In this work, we apply heuristic search to conformant probabilistic planning by adapting planning graph heuristics developed for non-deterministic planning. We evaluate a straight-forward application of these planning graph techniques, which amounts to exactly computing a distribution over many relaxed planning graphs (one planning graph for each joint outcome of uncertain actions at each time step). Computing this distribution is costly, so we apply Sequential Monte Carlo (SMC) to approximate it. One important issue that we explore in this work is how to automatically determine the number of samples required for effective heuristic computation. We empirically demonstrate on several domains how our efficient, but sometimes suboptimal, approach enables our planner to solve much larger problems than an existing optimal bounded length probabilistic planner and still find reasonable quality solutions.  相似文献   

18.
The paper studies the problem of achieving consensus in multi-agent systems in the case where the dependency digraph Γ has no spanning in-tree. We consider the regularization protocol that amounts to the addition of a dummy agent (hub) uniformly connected to the agents. The presence of such a hub guarantees the achievement of an asymptotic consensus. For the “evaporation” of the dummy agent, the strength of its influences on the other agents vanishes, which leads to the concept of latent consensus. We obtain a closed-form expression for the consensus when the connections of the hub are symmetric; in this case, the impact of the hub upon the consensus remains fixed. On the other hand, if the hub is essentially influenced by the agents, whereas its influence on them tends to zero, then the consensus is expressed by the scalar product of the vector of column means of the Laplacian eigenprojection of Γ and the initial state vector of the system. Another protocol, which assumes the presence of vanishingly weak uniform background links between the agents, leads to the same latent consensus.  相似文献   

19.
Two mobile agents, starting from different nodes of an unknown network, have to meet at a node. Agents move in synchronous rounds using a deterministic algorithm. Each agent has a different label, which it can use in the execution of the algorithm, but it does not know the label of the other agent. Agents do not know any bound on the size of the network. In each round an agent decides if it remains idle or if it wants to move to one of the adjacent nodes. Agents are subject to delay faults: if an agent incurs a fault in a given round, it remains in the current node, regardless of its decision. If it planned to move and the fault happened, the agent is aware of it. We consider three scenarios of fault distribution: random (independently in each round and for each agent with constant probability \(0<p<1\)), unbounded adversarial (the adversary can delay an agent for an arbitrary finite number of consecutive rounds) and bounded adversarial (the adversary can delay an agent for at most c consecutive rounds, where c is unknown to the agents). The quality measure of a rendezvous algorithm is its cost, which is the total number of edge traversals. For random faults, we show an algorithm with cost polynomial in the size n of the network and polylogarithmic in the larger label L, which achieves rendezvous with very high probability in arbitrary networks. By contrast, for unbounded adversarial faults we show that rendezvous is not possible, even in the class of rings. Under this scenario we give a rendezvous algorithm with cost \(O(n\ell )\), where \(\ell \) is the smaller label, working in arbitrary trees, and we show that \(\varOmega (\ell )\) is the lower bound on rendezvous cost, even for the two-node tree. For bounded adversarial faults, we give a rendezvous algorithm working for arbitrary networks, with cost polynomial in n, and logarithmic in the bound c and in the larger label L.  相似文献   

20.
We consider the scheduling problem in which two agents (agents A and B), each having its own job set (containing the A-jobs and B-jobs, respectively), compete to process their own jobs in a two-machine flowshop. Each agent wants to maximize a certain criterion depending on the completion times of its jobs only. Specifically, agent A desires to maximize either the weighted number of just-in-time (JIT) A-jobs that are completed exactly on their due dates or the maximum weight of the JIT A-jobs, while agent B wishes to maximize the weighted number of JIT B-jobs. Evidently four optimization problems can be formulated by treating the two agents’ criteria as objectives and constraints of the corresponding optimization problems. We focus on the problem of finding the Pareto-optimal schedules and present a bicriterion analysis of the problem. Solving this problem also solves the other three problems of bicriterion scheduling as a by-product. We show that the problems under consideration are either polynomially or pseudo-polynomially solvable. In addition, for each pseudo-polynomial-time solution algorithm, we show how to convert it into a two-dimensional fully polynomial-time approximation scheme for determining an approximate Pareto-optimal schedule. Finally, we conduct extensive numerical studies to evaluate the performance of the proposed algorithms.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号