Similar literature
Found 20 similar documents (search time: 31 ms)
1.
We consider the learning problem faced by two self-interested agents repeatedly playing a general-sum stage game. We assume that the players can observe each other’s actions but not the payoffs received by the other player. The concept of Nash Equilibrium in repeated games provides an individually rational solution for playing such games and can be achieved by playing the Nash Equilibrium strategy for the single-shot game in every iteration. Such a strategy, however, can sometimes lead to a Pareto-Dominated outcome in games such as the Prisoner’s Dilemma. We therefore prefer learning strategies that converge to a Pareto-Optimal outcome that also produces a Nash Equilibrium payoff for repeated two-player, n-action general-sum games. The Folk Theorem enables us to identify such outcomes. In this paper, we introduce the Conditional Joint Action Learner (CJAL), which learns the conditional probability of an action taken by the opponent given its own actions and uses it to decide its next course of action. We empirically show that, under self-play and if the payoff structure of the Prisoner’s Dilemma game satisfies certain conditions, a CJAL learner using a random exploration strategy followed by a completely greedy exploitation technique will learn to converge to a Pareto-Optimal solution. We also show that such learning will generate Pareto-Optimal payoffs in a large majority of other two-player general-sum games. We compare the performance of CJAL with that of existing algorithms such as WoLF-PHC and JAL on all structurally distinct two-player conflict games with ordinal payoffs.
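The core idea of learning the opponent's action distribution conditioned on one's own action can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the class name `ConditionalActionLearner`, the count-based estimate, the uniform fallback for untried actions, the shared action set, and the Prisoner's Dilemma payoff numbers are all assumptions, and the random exploration phase is replaced here by a fixed list of observations.

```python
from collections import defaultdict


class ConditionalActionLearner:
    """Hypothetical count-based learner of P(opponent action | own action)."""

    def __init__(self, actions, payoff):
        self.actions = list(actions)        # assumed shared by both players
        self.payoff = payoff                # payoff[(own, opp)] -> own reward
        self.counts = defaultdict(int)      # (own, opp) -> joint count
        self.own_counts = defaultdict(int)  # own -> count

    def observe(self, own, opp):
        """Record one round: our action and the opponent's observed action."""
        self.counts[(own, opp)] += 1
        self.own_counts[own] += 1

    def conditional_prob(self, opp, own):
        """Empirical estimate of P(opp | own); uniform if own was never tried."""
        total = self.own_counts[own]
        if total == 0:
            return 1.0 / len(self.actions)
        return self.counts[(own, opp)] / total

    def expected_payoff(self, own):
        """Expected own payoff of an action under the conditional estimate."""
        return sum(
            self.conditional_prob(opp, own) * self.payoff[(own, opp)]
            for opp in self.actions
        )

    def greedy_action(self):
        """Purely greedy choice, as used after the exploration phase."""
        return max(self.actions, key=self.expected_payoff)


# Illustrative Prisoner's Dilemma payoffs for the learner (assumed values).
PD = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}
learner = ConditionalActionLearner(["C", "D"], PD)
for own, opp in [("C", "C")] * 3 + [("D", "D")] * 3 + [("D", "C")]:
    learner.observe(own, opp)
```

With these observations the opponent has always cooperated after the learner cooperated and mostly defected after the learner defected, so the greedy choice favors cooperation even though defection dominates in the one-shot game.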

2.
It is well known that the phenomenon of entanglement plays a fundamental role in quantum game theory. Occasionally, games constructed via maximally entangled initial states (MEIS) will have new Nash equilibria that yield the players higher payoffs than the ones they receive in the classical version of the game. When examining these new games for Nash equilibrium payoffs, a fundamental question arises: does a suitable choice of an MEIS improve the lot of the players? In this paper, we show that the answer to this question is yes for at least the case of a variant of the well-known two-player, two-strategy game of Chicken. To that end, we generalize Landsburg’s quaternionic representation of the payoff function of two-player, two-strategy maximally entangled states to games where the initial state is chosen arbitrarily from a circle of maximally entangled initial states, and for the corresponding quantized games we show the existence of superior Nash equilibrium payoffs when an MEIS is appropriately chosen.

3.
This paper considers models of evolutionary non-zero-sum games on the infinite time interval. Methods of differential game theory are used for the analysis of game interactions between two groups of participants. We assume that participants in these groups are controlled by signals for the behavior change. The payoffs of coalitions are defined as average integral functionals on the infinite horizon. We pose the design problem of a dynamical Nash equilibrium for the evolutionary game under consideration. The ideas and approaches of non-zero-sum differential games are employed for the determination of the Nash equilibrium solutions. The results derived in this paper involve the dynamic constructions and methods of evolutionary games. Much attention is focused on the formation of the dynamical Nash equilibrium with players' strategies that maximize the corresponding payoff functions and have guaranteed properties according to the minimax approach. An application of the minimax approach for constructing optimal control strategies generates dynamical Nash equilibrium trajectories yielding better results in comparison to static solutions and evolutionary models with the replicator dynamics. Finally, we compare the dynamical Nash equilibrium trajectories for evolutionary games with the average integral payoff functionals and the trajectories for evolutionary games with the global terminal payoff functionals on the infinite horizon.

4.
In this paper, a Cournot game in an oligopolistic market with incomplete information is considered. The market consists of several producers that compete for higher payoffs. For optimal decision making, each player needs to estimate its rivals’ behaviors. This estimation is carried out using linear regression and the recursive weighted least-squares method. As each player's information about its rivals increases during the game, its estimation of their reaction functions becomes more accurate. Here, it is shown that by choosing appropriate regressors for estimating the strategies of other players at each time-step of the market and using them to make the next-step decision, the game will converge to its Nash equilibrium point. The simulation results for an oligopolistic market show the effectiveness of the proposed method.
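Convergence of step-by-step decisions to the Cournot Nash equilibrium can be illustrated with a much simpler setting than the one in the abstract. The sketch below is an assumption-laden simplification: it uses a duopoly with linear inverse demand P = a - b(q1 + q2) and constant marginal cost c, and it replaces the regression-based estimate of the rival's behavior with the rival's true previous quantity. The function names and the parameter values a = 100, b = 1, c = 10 are hypothetical.

```python
def best_response(rival_quantity, a=100.0, b=1.0, c=10.0):
    """Profit-maximizing quantity against a rival quantity.

    Maximizes q * (a - b * (q + rival_quantity) - c) over q >= 0.
    """
    return max(0.0, (a - c - b * rival_quantity) / (2.0 * b))


def iterate_cournot(q1=0.0, q2=0.0, steps=60):
    """Both firms repeatedly best-respond to the rival's previous quantity."""
    for _ in range(steps):
        q1, q2 = best_response(q2), best_response(q1)
    return q1, q2


# For these parameters the Nash equilibrium quantity is (a - c) / (3 * b) = 30.
q1_final, q2_final = iterate_cournot()
```

Each step halves the distance to the equilibrium in this linear duopoly, which is why the iteration settles quickly; with estimated rather than known reaction functions the convergence argument is the subject of the paper itself.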

5.
In this paper, using simulations on an artificial social model, we analyze the cooperative behavior of agents playing the prisoner's dilemma game, in which each agent has two strategies: cooperate and defect. Because defect yields a better payoff whichever strategy the opponent chooses, it is rational for an agent to choose defect in a single game or in a finite number of games. However, it is known that mutual cooperation can also be a Nash equilibrium if the players do not know when the game is over or if the game is infinitely repeated. To investigate such cooperative behavior, we employ an artificial social model called the Sugarscape and carry out simulations on the model. Arranging three kinds of environments in the Sugarscape, we examine the cooperative behavior of agents who are essentially selfish, in the sense that they maximize their payoffs, and investigate the influence of environmental changes on the cooperative behavior.
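The claim that defect is better whichever strategy the opponent chooses can be checked directly on a payoff table. The following is a small sketch with assumed payoff values (5, 3, 1, 0, a common textbook choice, not taken from the paper); the function names are hypothetical.

```python
C, D = "cooperate", "defect"

# Payoff to a player choosing the first action against the second.
# The numbers are illustrative assumptions.
PAYOFF = {
    (C, C): 3,  # reward for mutual cooperation
    (C, D): 0,  # sucker's payoff
    (D, C): 5,  # temptation to defect
    (D, D): 1,  # punishment for mutual defection
}


def strictly_dominates(a, b):
    """True if action a pays strictly more than b against every opponent action."""
    return all(PAYOFF[(a, opp)] > PAYOFF[(b, opp)] for opp in (C, D))


def pure_nash_equilibria():
    """Enumerate pure-strategy Nash equilibria of the symmetric one-shot game."""
    equilibria = []
    for mine in (C, D):
        for theirs in (C, D):
            my_best = all(
                PAYOFF[(mine, theirs)] >= PAYOFF[(alt, theirs)] for alt in (C, D)
            )
            their_best = all(
                PAYOFF[(theirs, mine)] >= PAYOFF[(alt, mine)] for alt in (C, D)
            )
            if my_best and their_best:
                equilibria.append((mine, theirs))
    return equilibria
```

In the one-shot game mutual defection is the only pure equilibrium, even though mutual cooperation gives both players more; the repeated setting studied in the paper is what makes cooperation sustainable.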

6.
In game theory, an Evolutionarily Stable Set (ES set) is a set of Nash Equilibrium (NE) strategies that give the same payoffs. Similar to an Evolutionarily Stable Strategy (ES strategy), an ES set is also a strict NE. This work investigates the evolutionary stability of classical and quantum strategies in the quantum penny flip games. In particular, we developed an evolutionary game theory model to conduct a series of simulations in which a population of mixed classical strategies from the ES set of the game was invaded by quantum strategies. We found that when only one of the two players’ mixed classical strategies was invaded, the results differed between the two cases. In one case, due to the interference phenomenon of superposition, quantum strategies provided more payoff and hence successfully replaced the mixed classical strategies in the ES set. In the other case, the mixed classical strategies were able to withstand the invasion of quantum strategies and remained in the ES set. Moreover, when both players’ mixed classical strategies were invaded by quantum strategies, a new quantum ES set emerged. The strategies in the quantum ES set give both players a payoff of 0, which is the same as the payoff of the strategies in the mixed classical ES set of this game.

7.
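Generalized Nash equilibrium and switching control with applications in game theory (total citations: 2; self-citations: 2; citations by others: 0)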
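By coupling the equilibrium point with the motivations of the decision makers, a new concept, the generalized Nash equilibrium, is proposed. Decision makers usually have one of two kinds of motivation: to maximize their own benefit, or to maximize the benefit of their opponents. If every decision maker has the first kind of motivation, a rational group forms and the whole system eventually reaches the first kind of equilibrium (that is, the classical Nash equilibrium). If every decision maker has the second kind of motivation, a wise group forms and the whole system eventually reaches the second kind of equilibrium. In addition, switching control is used to help decision makers determine their motivations.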

8.
Consensus theory and noncooperative game theory respectively deal with cooperative and noncooperative interactions among multiple players/agents. They provide a natural framework for road pricing design, since each motorist may myopically optimize his or her own utility as a function of road price while at the same time communicating with friends and neighbors about the traffic situation. This paper considers road pricing design using game theory and consensus theory. For the case where a system supervisor broadcasts information on the overall system to each agent, we present a variant of standard fictitious play called average strategy fictitious play (ASFP) for large-scale repeated congestion games. Only a weighted running average of all other players' actions is assumed to be available to each player. The ASFP reduces the burden of both information gathering and information processing for each player. Compared to the joint strategy fictitious play (JSFP) studied in the literature, the updating process of utility functions for each player is avoided. We prove that there exists at least one pure strategy Nash equilibrium for the congestion game under investigation, and that the players' actions generated by the ASFP with inertia (players' reluctance to change their previous actions) converge to a Nash equilibrium almost surely. For the case without broadcasting, a consensus protocol is introduced for individual agents to estimate the percentage of players choosing each resource, and the convergence property of the players' action profile is still ensured. The results are applied to road pricing design to achieve socially local optimal trip timing. Simulation results are provided based on real traffic data for the Singapore case study.

9.
In game theory, it is usually assumed that each player has only one payoff function and that the strategy set of the game is the topological product of the individual players’ strategy sets. In real business and system design or control problems, however, players’ strategy sets may be interactive and each player may have more than one payoff function. This paper investigates the more general situation of multiple-payoff, multiple-person games in normal form. Each player has several payoff functions which are dominated by certain convex cones, and the feasible strategy set of each player may be interactive with those of the other players. This new model is applied to a classical example without requiring variational and quasi-variational inequalities, or point-to-set mappings.

10.
The Nash and Stackelberg strategies of a nonzero-sum game have the common property that they are both noncooperative equilibrium solutions for which no player can achieve an improvement in his performance if he attempts to deviate from his strategy (cheat). In this note we show that the Nash solution is desirable only if it is not dominated by any of the Stackelberg solutions. Otherwise a Stackelberg strategy is always more favorable to both players and, like the Nash solution, it can be enforced once an agreement between the players, specifying the leader and the follower, is reached.

11.
Psychological experiments reveal that human interaction behaviors often differ from what game theory predicts. One important reason is that conventional game-theoretic models do not take relevant constraints into consideration when the players choose their best strategies. In real life, however, games are often played in contexts where players are constrained by their capabilities, law, culture, custom, and so on. For example, someone who wants to drive a car has to have a driving license. Therefore, when a human player of a game chooses a strategy, he/she should consider not only the material payoff or monetary reward from taking his/her best strategy and others' best responses but also how feasible it is to take the strategy in the context where the game is played. To solve such a game, this paper establishes a model of fuzzily constrained games and introduces a solution concept of constrained equilibrium for games of this kind. Our model is consistent with psychological experiment results of ultimatum games. We also discuss what happens if Prisoner's Dilemma and Stag Hunt are played under fuzzy constraints. In general, after taking constraints into account, our model reflects well the human behaviors of fairness, altruism, self‐interest, and so on, and thus can predict the outcomes of some games more accurately than conventional game theory.

12.
We study the computational complexity of problems involving equilibria in strategic games and in perfect information extensive games when the number of players is large. We consider, among others, the problems of deciding the existence of a pure Nash equilibrium in strategic games or deciding the existence of a pure Nash or a subgame perfect Nash equilibrium with a given payoff in finite perfect information extensive games. We address the fundamental question of how to represent a game with a large number of players. We propose three ways of representing a game with different degrees of succinctness for the components of the game. For perfect information extensive games we show that when the number of moves of each player is large and the input game is represented succinctly these problems are PSPACE-complete. In contrast, when the game is described explicitly by means of its associated tree all these problems are decidable in polynomial time. For strategic games we show that the complexity of deciding the existence of a pure Nash equilibrium depends on the succinctness of the game representation and then on the size of the action sets. In particular we show that it is NP-complete when the number of players is large and the number of actions for each player is constant, and that the problem is -complete when the number of players is a constant and the size of the action sets is exponential in the size of the game representation. Again, when the game is described explicitly the problem is decidable in polynomial time.
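For an explicitly described strategic game, a pure Nash equilibrium can be found by checking every action profile against every unilateral deviation, which takes time polynomial in the size of the explicit payoff table. The sketch below illustrates that brute-force check only; it is not one of the paper's succinct representations, and the function name, the payoff-function interface, and the two example games are assumptions.

```python
from itertools import product


def pure_nash_equilibria(action_sets, payoff):
    """Return all pure Nash equilibria of an explicitly given strategic game.

    action_sets: list of action lists, one per player.
    payoff: function mapping (profile tuple, player index) to a number.
    """
    equilibria = []
    for profile in product(*action_sets):
        stable = True
        for i, actions in enumerate(action_sets):
            current = payoff(profile, i)
            for alt in actions:
                deviation = profile[:i] + (alt,) + profile[i + 1:]
                if payoff(deviation, i) > current:
                    stable = False  # player i gains by deviating to alt
                    break
            if not stable:
                break
        if stable:
            equilibria.append(profile)
    return equilibria


# Matching pennies: no pure equilibrium exists.
PENNIES = {
    ("H", "H"): (1, -1), ("H", "T"): (-1, 1),
    ("T", "H"): (-1, 1), ("T", "T"): (1, -1),
}
# A coordination game: two pure equilibria.
COORDINATION = {
    (0, 0): (2, 2), (0, 1): (0, 0),
    (1, 0): (0, 0), (1, 1): (1, 1),
}

pennies_eq = pure_nash_equilibria([["H", "T"], ["H", "T"]], lambda p, i: PENNIES[p][i])
coord_eq = pure_nash_equilibria([[0, 1], [0, 1]], lambda p, i: COORDINATION[p][i])
```

The number of profiles grows exponentially with the number of players, so this approach stays polynomial only relative to the explicit table; with succinct representations the same question becomes hard, as the abstract states.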

13.
Networked noncooperative games are investigated, where each player (or agent) plays with all other players in its neighborhood. The evolution is assumed to be based on each player using its neighbors' current information to decide its next strategy. By using sub-neighborhoods, the dynamics of the evolution are obtained. A method for calculating Nash equilibria from the mixed strategies of multiple players is then proposed. The relationship between local Nash equilibria based on individual neighborhoods and global Nash equilibria of the overall network is revealed. A technique is then proposed to construct Nash equilibria of an evolutionary game from its one-step static Nash equilibria. The basic tool of this approach is the semi-tensor product of matrices, which converts strategies into logical matrices and payoffs into pseudo-Boolean functions, so that networked evolutionary games become discrete-time dynamic systems.

14.
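To improve the success rate of soccer robots' attacks during matches, this paper analyzes the shortcomings of some existing attack-strategy algorithms for soccer robots, the tasks involved in a soccer robot's attack, and the main characteristics of the Nash equilibrium, and proposes a game-theoretic attack-strategy algorithm for soccer robots. The strategies considered in the game are shooting and passing, and the best strategy is selected according to the resulting payoff function values. Experimental results show that the soccer robot can select an attack strategy quickly and reasonably, which effectively improves its attack success rate in matches.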

15.
Conflicts occur naturally in the real world at all levels of society: individually, in groups, or in society as a whole. Almost all existing conflict resolution models are unilateral in their decision‐making process. They do not consider the actions of the involved parties simultaneously. Therefore, in this paper, we aim to design a novel conflict resolution model based on game‐theoretic rough sets by constructing a game between all the concerned parties (players), computing the payoff of different strategies, and classifying them following equilibrium rules. The proposed model yields more realistic and accurate results as it explores all possibilities and is flexible in determining different threshold values relative to the complexities of real‐life problems. Three real‐life conflict situations are solved with the proposed model, and a comprehensive analysis is done to validate the effectiveness of the proposed approach.

16.
This paper proposes an object-level rate control algorithm to jointly control the bit rates of multiple video objects. Utilizing noncooperative game theory, the proposed rate control algorithm mimics the behaviors of players representing video objects. Each player competes for available bits to optimize its visual quality. The algorithm finds an “optimal solution” in that it conforms to the mixed strategy Nash equilibrium, which is the probability distribution of the actions carried out by the players that maximizes their expected payoffs (the number of bits). The game is played iteratively, and the expected payoff of each play is accumulated. The game terminates when all of the available bits for the specific time instant have been distributed to video object planes (VOPs). The advantage of the proposed scheme is that the bidding objects divide the bits among themselves automatically and fairly, according to their encoding complexity, and with an overall solution that is strategically optimal under the given circumstances. To minimize buffer fluctuation and avoid buffer overflow and underflow, a proportional-integral-derivative (PID) control based buffer policy is utilized.
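A PID-based buffer policy of the kind mentioned above can be sketched with a generic discrete PID controller acting on buffer fullness. This is a textbook controller under stated assumptions, not the paper's policy: the class name `BufferPID`, the gains, the target fullness of 0.5, and the interpretation of the output as a correction to the next bit budget are all hypothetical.

```python
class BufferPID:
    """Generic discrete PID controller on buffer fullness (fraction in [0, 1])."""

    def __init__(self, kp, ki, kd, target):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.target = target       # desired buffer fullness
        self.integral = 0.0        # accumulated error
        self.prev_error = None     # no derivative term on the first sample

    def update(self, fullness, dt=1.0):
        """Return a correction signal for the current buffer fullness.

        A positive value means the buffer is below target, so more bits
        could be allocated; a negative value means fewer bits (assumed mapping).
        """
        error = self.target - fullness
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


pid = BufferPID(kp=0.5, ki=0.1, kd=0.2, target=0.5)
first = pid.update(0.3)   # buffer below target
second = pid.update(0.4)  # buffer approaching target
```

The proportional term reacts to the current deviation, the integral term removes persistent offset, and the derivative term damps rapid swings, which is the usual motivation for using PID control to limit buffer fluctuation.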

17.
A widely accepted rational behavior for non-cooperative players is based on the notion of Nash equilibrium. Although the existence of a Nash equilibrium is guaranteed in the mixed framework (i.e., when players select their actions in a randomized manner), in many real-world applications the existence of “any” equilibrium is not enough. Rather, it is often desirable to single out equilibria satisfying some additional requirements (in order, for instance, to guarantee a minimum payoff to certain players), which we call constrained Nash equilibria. In this paper, a formal framework for specifying these kinds of requirements is introduced and investigated in the context of graphical games, where a player p may directly be interested in only some of the other players, called the neighbors of p. This setting is very useful for modeling large population games, where typically each player does not directly depend on all the players, and representing her utility function extensively is either inconvenient or infeasible. Based on this framework, the complexity of deciding the existence of and of computing constrained equilibria is then investigated, with the aim of showing how the intrinsic difficulty of these tasks is affected by the requirements prescribed at the equilibrium and by the structure of players’ interactions. The analysis is carried out for the setting of mixed strategies as well as for the setting of pure strategies, i.e., when players are forced to deterministically choose the action to perform. In particular, for this latter case, restrictions on players’ interactions and on constraints are identified that make the computation of Nash equilibria an easy problem, for which polynomial and highly parallelizable algorithms are presented.

18.
In game theory the interaction among players obligates each player to develop a belief about the possible strategies of the other players, to choose a best-reply given those beliefs, and to look for an adjustment of the best-reply and the beliefs using a learning mechanism until they reach an equilibrium point. Usually, the behavior of an individual cost-function, when such best-reply strategies are applied, turns out to be non-monotonic, and concluding that such strategies lead to some equilibrium point is a non-trivial task. Even in repeated games the convergence to a stationary equilibrium is not always guaranteed. The best-reply strategies analyzed in this paper represent the most frequent type of behavior applied in practice in problems of bounded rationality of agents considered within the Artificial Intelligence research area. They are naturally related to the so-called fixed-local-optimal actions or, in other words, to one-step-ahead optimization algorithms widely used in modern Intelligent Systems theory. This paper shows that for an ergodic class of finite controllable Markov games the best-reply strategies lead necessarily to a Lyapunov/Nash equilibrium point. One of the most interesting properties of this approach is that an expedient (or absolutely expedient) behavior of an ergodic system (repeated game) can be represented by a Lyapunov-like function non-decreasing in time. We present a method for constructing a Lyapunov-like function: the Lyapunov-like function replaces the recursive mechanism with the elements of the ergodic system that model how players are likely to behave in one-shot games. To show our statement, we first propose a non-converging state-value function that fluctuates (increases and decreases) between states of the Markov game. Then, we prove that it is possible to represent that function in a recursive format using a one-step-ahead fixed-local-optimal strategy.
As a result, we prove that a Lyapunov-like function can be built using the previous recursive expression for the Markov game, i.e., the resulting Lyapunov-like function is a monotonic function which can only decrease (or remain the same) over time, whatever the initial distribution of probabilities. Consequently, a new concept called Lyapunov games is suggested for a class of repeated games. Lyapunov games make it possible to conclude during the game whether or not the applied strategy provides convergence to an equilibrium point. The time for constructing a Potential (Lyapunov-like) function is exponential. Our algorithm tractably computes the Nash, Lyapunov and correlated equilibria: a Lyapunov equilibrium is a Nash equilibrium, and it is also a correlated equilibrium. The validity of the proposed method is demonstrated both theoretically and practically by a simulated experiment related to the Duel game.

19.
Social networks are fundamental media for the diffusion of information: contagions appear at some node of the network and propagate over the edges. Prior research mainly focuses on each contagion spreading independently, ignoring the interactions among multiple contagions that propagate at the same time. In the real world, simultaneous news items and events usually have to compete for users’ attention in order to propagate. In other cases, they can cooperate with each other and achieve more influence. In this paper, an evolutionary game theoretic framework is proposed to model the interactions among multiple contagions. The basic idea is that different contagions in social networks are similar to multiple organisms in a population, and the diffusion process resembles organisms interacting and then evolving from one state to another. This framework statistically learns the payoffs as contagions interact with each other and builds the payoff matrix. Since learning payoffs for all pairs of contagions is almost impossible (quadratic in the number of contagions), a contagion clustering method is proposed to decrease the number of parameters to fit, which makes our approach efficient and scalable. To verify the proposed framework, we conduct experiments using a real-world information-spreading dataset from Digg. Experimental results show that the proposed game theoretic framework helps to understand the information diffusion process better and can predict users’ forwarding behaviors more accurately than previous studies. The analyses of the evolution dynamics of contagions and of the evolutionarily stable strategy reveal whether a contagion can be promoted or suppressed by others in the diffusion process.

20.
We outline the general construction of three-player games with incomplete information which fulfil the following conditions: (i) symmetry with respect to the permutations of players; (ii) the existence of an upper bound for total payoff resulting from Bell inequalities; (iii) the existence of both fair and unfair Nash equilibria saturating this bound. Conditions (i)–(iii) imply that we are dealing with conflicting-interest games. An explicit example of such a game is given. A quantum counterpart of this game is considered. It is obtained by keeping the same utilities but replacing the classical advisor with a quantum one. It is shown that the quantum game possesses only fair equilibria with strictly higher payoffs than in the classical case. This implies that quantum nonlocality can be used to resolve the conflict between the players.
