Similar literature
 Found 20 similar articles (search time: 31 ms)
1.
This paper presents a novel approach to the facility layout design problem based on a multi-agent society in which the agents’ interactions form the facility layout design. Each agent corresponds to a facility with inherent characteristics, emotions, and a certain amount of money, which together form its utility function. A manager agent adjusts each agent’s money during the learning period, while each agent tries to tune the parameters of its utility function so that its total layout cost is minimized in competition with the others. The agents’ interactions are based on a market mechanism. In each step, an unoccupied location is presented to all applicant agents, and each agent proposes a price proportionate to its utility function. The agent proposing the highest price is selected as the winner and assigned to that location by an appropriate space-filling curve. The proposed method uses fuzzy theory to establish each agent’s utility function. In addition, it provides a simulation environment that uses an evolutionary algorithm to form different interactions among the agents, making it possible for them to experience various strategies. The experimental results show that the proposed approach achieves a lower total layout cost than state-of-the-art methods.
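The market mechanism described in this abstract can be illustrated with a minimal sketch. It assumes one location is offered per step and that each agent's bid comes from a simple stand-in utility function; the function name `run_location_auction`, the facility names, and the bid formulas are hypothetical and are not taken from the paper, which builds its utility functions with fuzzy theory.

```python
def run_location_auction(locations, bid_fns):
    """Offer locations one at a time; the highest bidder wins each one.

    locations: location ids in the order they are offered (for example,
               the order given by a space-filling curve).
    bid_fns:   dict mapping agent name -> function(location) -> price.
    """
    assignment = {}
    remaining = dict(bid_fns)  # agents that still need a location
    for loc in locations:
        if not remaining:
            break
        # Every unassigned agent proposes a price for this location.
        bids = {agent: bid(loc) for agent, bid in remaining.items()}
        # Highest price wins; ties go to the alphabetically first agent.
        winner = max(sorted(bids), key=bids.get)
        assignment[loc] = winner
        del remaining[winner]
    return assignment


# Hypothetical bids: a budget minus a penalty for distance from a preferred spot.
bid_fns = {
    "lathe": lambda loc: 10 - abs(loc - 0),
    "press": lambda loc: 8 - abs(loc - 2),
    "paint": lambda loc: 6 - abs(loc - 1),
}
layout = run_location_auction([0, 1, 2], bid_fns)
print(layout)
```

In this toy run the richer agents win the earlier locations, which hints at why the abstract has a manager agent adjust each agent's money during learning.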

2.
Reinforcement learning techniques such as Q-Learning, as well as the Multiple-Lookahead-Levels technique that we introduced in our prior work, require the agent to complete an initial exploratory path followed by as many hypothetical and physical paths as necessary to find the optimal path to the goal. This paper introduces a reinforcement learning technique that uses a distance measure to the goal as the primary gauge for an autonomous agent’s action selection. We take advantage of the first random walk to acquire initial information about the goal. Once the agent reaches its goal, the agent’s first perceived internal model of the environment is updated to include that goal. This is done by the agent tracing its steps back to its starting point. We show that no exploratory or hypothetical paths are required after the goal is initially reached or detected, and that the agent requires a maximum of two physical paths to find the optimal path to the goal. The agent’s state occurrence frequency is also introduced and used to support the proposed Distance-Only technique. A computation speed performance analysis is carried out, and the Distance-and-Frequency technique is shown to require less computation time than Q-Learning. Furthermore, we demonstrate how multiple agents using the Distance-and-Frequency technique can share knowledge of the environment, and we study the effect of that knowledge sharing on the agents’ learning process.
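A rough sketch of distance-driven action selection, under stated assumptions: states are grid cells, the distance measure is Manhattan distance, and the visit frequency is used only to break ties. The abstract does not specify any of these choices, and `choose_next_state` is a hypothetical name.

```python
def choose_next_state(neighbors, goal, visits):
    """Pick the neighboring cell closest to the goal.

    neighbors: list of (row, col) cells reachable in one move.
    goal:      (row, col) of the goal, known after the first random walk.
    visits:    dict mapping cell -> how often the agent has been there.
    """
    def manhattan(cell):
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    # Primary criterion: distance to the goal.
    # Assumed tie-break: prefer the less frequently visited cell.
    return min(neighbors, key=lambda cell: (manhattan(cell), visits.get(cell, 0)))


neighbors = [(0, 1), (2, 1), (1, 0), (1, 2)]  # moves available from (1, 1)
visits = {(2, 1): 3, (1, 2): 1}
next_state = choose_next_state(neighbors, goal=(2, 2), visits=visits)
print(next_state)
```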

3.
Organisational adaptation of multi-agent systems in a peer-to-peer scenario   (Cited by: 2; self-citations: 1; citations by others: 1)
Organisations in multi-agent systems (MAS) have proven to be successful in regulating agent societies. Nevertheless, changes in agents’ behaviour or in the dynamics of the environment may lead to poor fulfilment of the system’s purposes, so that the entire organisation needs to be adapted. In this paper we focus on endowing the organisation with adaptation capabilities, instead of expecting agents to be capable of adapting the organisation by themselves. We regard this organisational adaptation as an assisting service provided by what we call the Assistance Layer. Our generic Two Level Assisted MAS Architecture (2-LAMA) incorporates such a layer. We empirically evaluate this approach by means of an agent-based simulator we have developed for the P2P sharing network domain. This simulator implements the 2-LAMA architecture and supports comparison between different adaptation methods, as well as with the standard BitTorrent protocol. In particular, we present two alternatives to perform norm adaptation and one method to adapt agents’ relationships. The results show improved performance and demonstrate that the cost of introducing an additional layer in charge of the system’s adaptation is lower than its benefits.

4.
This paper adds temporal logic to public announcement logic (PAL) and dynamic epistemic logic (DEL). By adding a previous-time operator to PAL, we express in the language statements concerning the muddy children puzzle and the sum-and-product puzzle. We also express a true statement that an agent’s beliefs about another agent’s knowledge flipped twice, and use a sound proof system to prove this statement. Adding a next-time operator to PAL, we provide formulas that express that belief revision does not take place in PAL. We also discuss relationships between announcements and the new knowledge agents thus acquire; such relationships are related to learning and to Fitch’s paradox. We also show how inverse programs and hybrid logic can each be used to help determine whether or not an arbitrary structure represents the play of a game. We then add a past-time operator to DEL, and discuss the importance of adding yet another component to the language in order to prove completeness.

5.
In this paper, we introduce a game-theoretic framework to address the community detection problem based on the structures of social networks. We formulate the dynamics of community formation as a strategic game called the community formation game: given an underlying social graph, we assume that each node is a selfish agent who selects communities to join or leave based on her own utility measurement. A community structure can be interpreted as an equilibrium of this game. We formulate the agents’ utility as the combination of a gain function and a loss function. We allow each agent to select multiple communities, which naturally captures the concept of “overlapping communities”. We propose a gain function based on the modularity concept introduced by Newman (Proc Natl Acad Sci 103(23):8577–8582, 2006), and a simple loss function that reflects the intrinsic costs incurred when people join the communities. We conduct extensive experiments under this framework, and our results show that our algorithm is effective in identifying overlapping communities and is often better than the other algorithms we evaluated, especially when many people belong to multiple communities. To the best of our knowledge, this is the first time the community detection problem has been addressed by a game-theoretic framework that considers community formation as the result of individual agents’ rational behaviors.

6.
Adaptive game AI with dynamic scripting   (Cited by: 1; self-citations: 0; citations by others: 1)

7.
The ability to analyze the effectiveness of agent reward structures is critical to the successful design of multiagent learning algorithms. Though final system performance is the best indicator of the suitability of a given reward structure, it is often preferable to analyze the reward properties that lead to good system behavior (i.e., properties promoting coordination among the agents and providing agents with strong signal-to-noise ratios). This step is particularly helpful in continuous, dynamic, stochastic domains ill-suited to the simple table backup schemes commonly used in TD(λ)/Q-learning, where the effectiveness of the reward structure is difficult to distinguish from the effectiveness of the chosen learning algorithm. In this paper, we present a new reward evaluation method that provides a visualization of the tradeoff between the level of coordination among the agents and the difficulty of the learning problem each agent faces. This method is independent of the learning algorithm and is only a function of the problem domain and the agents’ reward structure. We use this reward property visualization method to determine an effective reward without performing extensive simulations. We then test this method in both a static and a dynamic multi-rover learning domain where the agents have continuous state spaces and take noisy actions (e.g., the agents’ movement decisions are not always carried out properly). Our results show that in the more difficult dynamic domain, the reward efficiency visualization method provides a speedup of two orders of magnitude in selecting good rewards, compared to running a full simulation. In addition, this method facilitates the design and analysis of new rewards tailored to the observational limitations of the domain, providing rewards that combine the best properties of traditional rewards.

8.
We consider the learning problem faced by two self-interested agents repeatedly playing a general-sum stage game. We assume that the players can observe each other’s actions but not the payoffs received by the other player. The concept of Nash Equilibrium in repeated games provides an individually rational solution for playing such games and can be achieved by playing the Nash Equilibrium strategy for the single-shot game in every iteration. Such a strategy, however, can sometimes lead to a Pareto-Dominated outcome for games like Prisoner’s Dilemma. We therefore prefer learning strategies that converge to a Pareto-Optimal outcome that also produces a Nash Equilibrium payoff for repeated two-player, n-action general-sum games. The Folk Theorem enables us to identify such outcomes. In this paper, we introduce the Conditional Joint Action Learner (CJAL), which learns the conditional probability of an action taken by the opponent given its own actions and uses it to decide its next course of action. We empirically show that under self-play, and if the payoff structure of the Prisoner’s Dilemma game satisfies certain conditions, a CJAL learner using a random exploration strategy followed by a completely greedy exploitation technique will learn to converge to a Pareto-Optimal solution. We also show that such learning will generate Pareto-Optimal payoffs in a large majority of other two-player general-sum games. We compare the performance of CJAL with that of existing algorithms such as WOLF-PHC and JAL on all structurally distinct two-player conflict games with ordinal payoffs.
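The idea of learning the opponent's action conditioned on one's own action can be sketched as follows. This is an illustrative reading of the abstract, not the authors' implementation: the class name `ConditionalLearner`, the count-based probability estimate, and the fixed-length exploration phase are assumptions.

```python
import random
from collections import defaultdict


class ConditionalLearner:
    """Estimates P(opponent action | own action) from observed joint actions."""

    def __init__(self, payoff, n_actions, explore_rounds, seed=0):
        self.payoff = payoff              # payoff[own][opp]: own payoff only
        self.n_actions = n_actions
        self.explore_rounds = explore_rounds
        self.rng = random.Random(seed)
        self.joint = defaultdict(int)     # (own, opp) -> count
        self.own = defaultdict(int)       # own -> count
        self.rounds = 0

    def expected_payoff(self, a):
        if self.own[a] == 0:
            return 0.0                    # no data yet for this action
        return sum(
            self.joint[(a, b)] / self.own[a] * self.payoff[a][b]
            for b in range(self.n_actions)
        )

    def act(self):
        # Random exploration first, then purely greedy exploitation.
        if self.rounds < self.explore_rounds:
            return self.rng.randrange(self.n_actions)
        return max(range(self.n_actions), key=self.expected_payoff)

    def observe(self, own_action, opp_action):
        self.joint[(own_action, opp_action)] += 1
        self.own[own_action] += 1
        self.rounds += 1


# Prisoner's Dilemma payoffs for the row player: 0 = cooperate, 1 = defect.
PD = [[3, 0], [5, 1]]
learner = ConditionalLearner(PD, n_actions=2, explore_rounds=6)
# Hypothetical history in which the opponent mirrored the learner's action.
for own, opp in [(0, 0)] * 3 + [(1, 1)] * 3:
    learner.observe(own, opp)
choice = learner.act()
print(choice)
```

With this mirrored history the learner estimates that cooperating yields 3 and defecting yields 1, so the greedy choice is to cooperate, which is the Pareto-Optimal outcome the abstract refers to.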

9.
We suggest that developing automata-theoretic foundations is relevant for knowledge theory, so that we study not only what is known by agents, but also the mechanisms by which such knowledge is arrived at. We define a class of epistemic automata, in which agents’ local states are annotated with abstract knowledge assertions about others. These are finite-state agents that communicate synchronously with each other, and information exchange is ‘perfect’. We show that the class of recognizable languages has good closure properties, leading to a Kleene-type theorem using what we call regular knowledge expressions. These automata model distributed causal knowledge in the following way: each agent in the system has partial knowledge of the temporal evolution of the system, and every time agents synchronize, they update each other’s knowledge, resulting in a more up-to-date view of the system state. Hence we show that these automata can be used to solve the satisfiability problem for a natural epistemic temporal logic for local properties. Finally, we characterize the class of languages recognized by epistemic automata as the regular consistent languages studied in concurrency theory.

10.
We present a probabilistic model of user affect designed to allow an intelligent agent to recognize multiple user emotions during interaction with an educational computer game. Our model is based on a probabilistic framework that deals with the high level of uncertainty involved in recognizing a variety of user emotions by combining, in a Dynamic Bayesian Network, information on both the causes and effects of emotional reactions. The part of the framework that reasons from causes to emotions (predictive model) implements a theoretical model of affect, the OCC model, which accounts for how emotions are caused by one’s appraisal of the current context in terms of one’s goals and preferences. The advantage of using the OCC model is that it provides an affective agent with explicit information not only on which emotions a user feels but also why, thus increasing the agent’s capability to respond effectively to the user’s emotions. The challenge is that building the model requires mechanisms to assess user goals and how the environment fits them, a form of plan recognition. In this paper, we illustrate how we built the predictive part of the affective model by combining general theories with empirical studies to adapt the theories to our target application domain. We then present results on the model’s accuracy, showing that the model achieves good accuracy on several of the target emotions. We also discuss the model’s limitations, to open the ground for the next stage of the work, i.e., complementing the model with diagnostic information.
Heather Maclaren

11.
In modern computer games, “bots” (intelligent, realistic agents) play a prominent role in the popularity of a game in the market. Typically, bots are modeled using finite-state machines and then programmed via simple conditional statements that are hard-coded in the bots’ logic. Since these bots become quite predictable to an experienced player, the player might lose interest in the game. We propose the use of a game-theoretic learning rule called fictitious play to improve the behavior of these computer game bots, which will make them less predictable and hence make the game more enjoyable.
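Fictitious play itself is simple to state: the bot assumes the player draws actions from the empirical distribution of the player's past actions and plays a best response to it. The sketch below uses rock-paper-scissors as a stand-in game; the game, the payoff table, and the function name `best_response` are illustrative assumptions and are not taken from the paper.

```python
from collections import Counter

ACTIONS = ["rock", "paper", "scissors"]
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}

# Bot payoff for (bot action, player action): win = 1, loss = -1, draw = 0.
PAYOFF = {
    (a, o): 1 if BEATS[a] == o else -1 if BEATS[o] == a else 0
    for a in ACTIONS
    for o in ACTIONS
}


def best_response(player_history, payoff=PAYOFF, actions=ACTIONS):
    """Best response to the empirical frequency of the player's past actions."""
    counts = Counter(player_history)
    total = sum(counts.values())

    def expected(a):
        # Expected payoff of action a against the observed frequencies.
        return sum(counts[o] / total * payoff[(a, o)] for o in counts)

    # With an empty history every action scores 0 and the first one is returned.
    return max(actions, key=expected)


bot_move = best_response(["rock", "rock", "paper"])
print(bot_move)
```

Because the response tracks the player's observed habits rather than a fixed script, the bot's behavior shifts as the player's style shifts.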

12.
Open distributed systems pose a challenge to trust modelling due to the dynamic nature of these systems (e.g., electronic auctions) and the unreliability of self-interested agents. The majority of trust models implicitly assume a shared cognitive model for all the agents participating in a society, and thus they treat the discrepancy between information and experience as a source of distrust: if an agent states a given quality of service, and another agent experiences a different quality for that service, such a discrepancy is typically assumed to indicate dishonesty, and thus trust is reduced. Herein, we propose a trust model that does not assume a concrete cognitive model for other agents, but instead uses the discrepancy between the information about other agents and its own experience to better predict the behavior of the others. This neutrality about other agents’ cognitive models allows an agent to obtain utility from liars or from agents having a different model of the world. The experiments performed suggest that this model improves the performance of an agent in dynamic scenarios under certain conditions, such as those found in market-like evolving environments.

13.
We investigate a simulated multi-agent system (MAS) that collectively decides to aggregate at an area of high utility. The agents’ control algorithm is based on random agent–agent encounters and is inspired by the aggregation behavior of honeybees. In this article, we define symmetry breaking and several symmetry-breaking measures, and report the phenomenon of emergent symmetry breaking within our observed system. The ability of the MAS to successfully break the symmetry depends significantly on a local-neighborhood-based threshold of the agents’ control algorithm that determines the number of neighbors at which the agents stop. This dependency is analyzed, and two macroscopic features are determined that significantly influence the symmetry-breaking behavior. In addition, we investigate the connection between the ability of the MAS to break symmetries and the ability to stay flexible in a dynamic environment.

14.
The purpose of this article is to gain knowledge about how interactions in a gaming context become constituted as effective resources for a student’s learning trajectory. In addition, this detailed study of a learning trajectory documents how a computer game becomes a learning resource for working on a specific topic in school. The article reports on a qualitative study of students at an upper secondary school who played the computer game Global Conflicts: Palestine to learn about the complexity of the Israeli-Palestinian conflict. A sociocultural and dialogic approach to learning and meaning-making is employed as an analytical framework. Analyzing different interactional episodes, in which important orientations and reorientations are located, documents how the student’s learning trajectory developed and changed during the project. When students engage in game play in educational settings, their experiences with playing computer games outside of school can relevantly be invoked and become part of the collaborative project of finding out how to play the game. However, these ways of engaging with a computer game might not necessarily facilitate a subtle understanding of the specific topic that is addressed in the game. The findings suggest that the constitution of a computer game as a learning resource is a collaborative project, in which multiple resources for meaning-making are in play, and in which the teacher has an important role in facilitating students’ adoption of multiple perspectives on the conflict. Furthermore, the findings shed light on what characterizes student-teacher interactions that contribute to a subtle understanding, and offer a framework of important issues upon which to reflect in game-based learning (GBL).

15.
A multiagent framework for coordinated parallel problem solving   (Cited by: 1; self-citations: 1; citations by others: 0)
Today’s organizations, facing increasing pressure to be effective and a growing need to deal with complex tasks beyond a single individual’s capabilities, need technological support in managing complex tasks that involve highly distributed and heterogeneous information sources and several actors. This paper describes CoPSF, a multiagent system middleware that simplifies the development of coordinated problem-solving applications while ensuring standards compliance through a set of system services and agents. CoPSF hosts and serves multiple concurrent problem-solving teams, contributing both to the limitation of communication overheads and to the reduction of redundant work across teams and organizations. The framework employs (i) an interleaved task decomposition and allocation approach, (ii) a mechanism for coordination of agents’ work, and (iii) a mechanism that enables synergy between parallel teams.

16.
Computational models of emotions have been thriving and increasingly popular since the 1990s. Such models used to be concerned with the emotions of individual agents when they interact with other agents. Out of the array of models of the emotions, we devote special attention to the approach in Adamatzky’s Dynamics of Crowd-Minds. It stands out because it considers the crowd rather than the individual agent. It belongs to computational intelligence. It works by mathematical simulation of a crowd of simple artificial agents: as the computer program runs, the agents evolve, and crowd behaviour emerges. Adamatzky’s purpose is to give an account of the emergence of allegedly “irrational” behaviour. This is not without problems, as what is irrational to one person may seem entirely rational to another, and this in turn is an insight that, in the history of crowd psychology, has indeed affected the competition among theories of crowd dynamics. Quite importantly, Adamatzky’s book argues for the transition from individual agencies to a crowd’s or a mob’s coalesced mind as such, and at any rate for a coalesced crowd’s agency.

17.
We describe a relational learning-by-observation framework that automatically creates cognitive agent programs that model expert task performance in complex dynamic domains. Our framework uses observed behavior and goal annotations of an expert as the primary input, interprets them in the context of background knowledge, and returns an agent program that behaves similarly to the expert. We map the problem of creating an agent program onto multiple learning problems that can be represented in a “supervised concept learning” setting. The acquired procedural knowledge is partitioned into a hierarchy of goals and represented with first-order rules. Using an inductive logic programming (ILP) learning component allows our framework to naturally combine structured behavior observations, parametric and hierarchical goal annotations, and complex background knowledge. To deal with the large domains we consider, we have developed an efficient mechanism for storing and retrieving structured behavior data. We have tested our approach using artificially created examples and behavior observation traces generated by AI agents. We evaluate the learned rules by comparing them to hand-coded rules. Editor: Rui Camacho

18.
This paper presents a methodology for the coordination of multiple robotic agents moving from one location to another in an environment embedded with a network of agents placed at strategic locations such as intersections. These intersection agents communicate with robotic agents and with each other to route robots so as to minimize congestion, resulting in a continuous flow of robot traffic. A robot’s path to its destination is computed by the network (in this paper, ‘Network’ refers to the collection of ‘Network agents’ operating at the intersections) in terms of the next waypoints to reach. The intersection agents are capable of identifying robots in their proximity based on signal strength. An intersection agent controls the flow of robot traffic around it with the help of the data it collects from the messages received from the robots and from other surrounding intersection agents. Traffic congestion is reduced using a two-layered hierarchical strategy. The primary layer operates at the intersection to reduce the time delay of robots crossing it. The secondary layer maintains coordination between intersection agents and routes traffic such that delay is reduced through effective load balancing. The objective at the primary level, to reduce congestion at the intersection, is achieved by assigning priorities to the pathways leading to the intersection based on the robot traffic density. At the secondary level, the load balancing of robots over multiple intersections is achieved through coordination between intersection agents, which communicate the robot densities in different pathways. Extensive comparisons show the performance gain of the current method over existing ones. Theoretical analysis, in addition to simulation, shows the advantages of load-balanced traffic flow over uncoordinated allotment of robotic agents to pathways. Transferring the burden of coordination to the network releases more computational power for the robots to engage in critical assistive activities.
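The two layers described in this abstract can be illustrated with a small sketch. It assumes that priority simply follows robot density (densest pathway first) and that load balancing means routing a robot onto the least dense candidate pathway; the function names and the density figures are hypothetical, and the paper's actual rules may differ.

```python
def pathway_priorities(density):
    """Primary layer: order incoming pathways by robot density.

    density: dict mapping pathway name -> number of waiting robots.
    Returns pathway names, densest first (ties broken by name).
    """
    return sorted(density, key=lambda p: (-density[p], p))


def pick_next_pathway(candidates, reported_density):
    """Secondary layer: choose the least loaded candidate pathway.

    reported_density holds densities communicated by neighboring
    intersection agents; unknown pathways are treated as empty.
    """
    return min(candidates, key=lambda p: (reported_density.get(p, 0), p))


incoming = {"north": 4, "east": 1, "south": 4, "west": 0}
order = pathway_priorities(incoming)
route = pick_next_pathway(["east", "west"], {"east": 1, "west": 0})
print(order, route)
```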

19.
We describe our development of Cobot, a novel software agent who lives in LambdaMOO, a popular virtual world frequented by hundreds of users. Cobot’s goal was to become an actual part of that community. Here, we present a detailed discussion of the functionality that made him one of the objects most frequently interacted with in LambdaMOO, human or artificial. Cobot’s fundamental power is his ability to collect social statistics summarizing the quantity and quality of interpersonal interactions. Initially, Cobot acted as little more than a reporter of this information; however, as he collected more and more data, he was able to use these statistics as models that allowed him to modify his own behavior. In particular, Cobot is able to use this data to “self-program,” learning the proper way to respond to the actions of individual users by observing how others interact with one another. Further, Cobot uses reinforcement learning to proactively take action in this complex social environment, and adapts his behavior based on multiple sources of human reward. Cobot represents a unique experiment in building adaptive agents who must live in and navigate social spaces.

20.
In his seminal work, Harsanyi (Manag. Sci. 14, 159–182, 320–332, 468–502, 1967) introduced an elegant approach to study non-cooperative games with incomplete information. In our work, we use this approach to define a new selfish routing game with incomplete information that we call Bayesian routing game. Here, each of n selfish users wishes to assign its traffic to one of m parallel links. However, users do not know each other’s traffic. Following Harsanyi’s approach, we introduce, for each user, a set of possible types. In our model, each type of a user corresponds to some traffic and the players’ uncertainty about each other’s traffic is described by a probability distribution over all possible type profiles. We present a comprehensive collection of results about our Bayesian routing game. Our main findings are as follows:
•  Using a potential function, we prove that every Bayesian routing game has a pure Bayesian Nash equilibrium. More precisely, we show this existence for a more general class of games that we call weighted Bayesian congestion games. For Bayesian routing games with identical links and independent type distribution, we give a polynomial time algorithm to compute a pure Bayesian Nash equilibrium.
•  We study structural properties of fully mixed Bayesian Nash equilibria for the case of identical links and show that they maximize Individual Cost. In general, there is more than one fully mixed Bayesian Nash equilibrium. We characterize fully mixed Bayesian Nash equilibria for the case of independent type distribution.
•  We conclude with bounds on Coordination Ratio for the case of identical links and for three different Social Cost measures: Expected Maximum Latency, Sum of Individual Costs and Maximum Individual Cost. For the latter two, we are able to give (asymptotically) tight bounds using the properties of fully mixed Bayesian Nash equilibria we proved.
This work has been partially supported by the DFG-SFB 376 and by the European Union within the 6th Framework Programme under contract 001907. A preliminary version of this paper appeared in the Proceedings of the 17th ACM Symposium on Parallelism in Algorithms and Architectures, pp. 203–212, July 2005.
