Similar Documents
20 similar documents found.
1.
Consensus theory and noncooperative game theory deal with cooperative and noncooperative interactions among multiple players/agents, respectively. They provide a natural framework for road pricing design, since each motorist may myopically optimize his or her own utility as a function of the road price while communicating with friends and neighbors about the traffic situation. This paper considers road pricing design using game theory and consensus theory. For the case where a system supervisor broadcasts information on the overall system to each agent, we present a variant of standard fictitious play called average strategy fictitious play (ASFP) for large-scale repeated congestion games. Only a weighted running average of all other players' actions is assumed to be available to each player. ASFP reduces the burden of both information gathering and information processing for each player. Compared to the joint strategy fictitious play (JSFP) studied in the literature, the updating process of utility functions for each player is avoided. We prove that there exists at least one pure strategy Nash equilibrium for the congestion game under investigation, and that the players' actions generated by ASFP with inertia (players' reluctance to change their previous actions) converge to a Nash equilibrium almost surely. For the case without broadcasting, a consensus protocol is introduced for individual agents to estimate the percentage of players choosing each resource, and the convergence property of the players' action profile is still ensured. The results are applied to road pricing design to achieve socially local optimal trip timing. Simulation results are provided based on real traffic data for a Singapore case study.
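To make the ASFP idea above concrete, here is a minimal Python sketch (not the authors' implementation) of players best-responding to a broadcast running average of resource shares, with inertia; the linear cost function, the parameter values, and all names are illustrative assumptions.

import random

def asfp_congestion(n_players=50, n_resources=3, rounds=200,
                    inertia=0.3, cost=lambda load: load):
    """Toy average-strategy-with-inertia dynamics for a congestion game."""
    actions = [random.randrange(n_resources) for _ in range(n_players)]
    avg = [1.0 / n_resources] * n_resources  # broadcast running average of resource shares
    for t in range(1, rounds + 1):
        for i in range(n_players):
            if random.random() < inertia:
                continue  # inertia: repeat the previous action
            expected_cost = [cost(avg[r] * n_players) for r in range(n_resources)]
            actions[i] = min(range(n_resources), key=expected_cost.__getitem__)
        share = [actions.count(r) / n_players for r in range(n_resources)]
        avg = [((t - 1) * a + s) / t for a, s in zip(avg, share)]  # update the broadcast average
    return actions, avg

final_actions, avg_shares = asfp_congestion()
print(avg_shares)  # with identical linear costs the shares roughly equalize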

2.
Sampled fictitious play (SFP) is a recently proposed iterative learning mechanism for computing Nash equilibria of non-cooperative games. For games of identical interests, every limit point of the sequence of mixed strategies induced by the empirical frequencies of best response actions that players in SFP play is a Nash equilibrium. Because discrete optimization problems can be viewed as games of identical interests wherein Nash equilibria define a type of local optimum, SFP has recently been employed as a heuristic optimization algorithm with promising empirical performance. However, no guarantees of convergence to a globally optimal Nash equilibrium have been established for any of the problem classes considered to date. In this paper, we introduce a variant of SFP and show that it converges almost surely to optimal policies in model-free, finite-horizon stochastic dynamic programs. The key idea is to view the dynamic programming states as players, whose common interest is to maximize the total multi-period expected reward starting in a fixed initial state. We also offer empirical results suggesting that our SFP variant is effective in practice for small to moderate-sized model-free problems.

3.
We study the convergence times of dynamics in games involving graphical relationships among players. Our model of interaction games generalizes a variety of recently studied games in game theory and distributed computing. In a local interaction game, each agent is a node embedded in a graph and plays the same 2-player game with each neighbor. An agent can choose its strategy only once and must apply that choice in every 2-player game it is involved in. This represents a fundamental model of decision making with local interaction and distributed control. Furthermore, we introduce a generalization called 2-type interaction games, in which one 2-player game is played on edges and possibly another game is played on non-edges. For the popular case of symmetric 2×2 games, we show that several dynamics converge to a pure Nash equilibrium in polynomial time. This includes arbitrary sequential better-response dynamics, as well as concurrent dynamics resulting from a distributed protocol that does not rely on global knowledge. We supplement these results with an experimental comparison of sequential and concurrent dynamics.
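As a rough illustration of the sequential better-response dynamics mentioned above, the sketch below runs them on a local interaction game with the same symmetric 2x2 game on every edge; the adjacency-list representation and all names are assumptions, not the paper's protocol.

def better_response_dynamics(adj, A, strategies):
    """Sequential better-response dynamics in a local interaction game.

    adj is an adjacency list, A[a][b] is the payoff of playing a against a
    neighbor playing b, and each node applies one strategy in all its edge games.
    """
    def total_payoff(i, s):
        return sum(A[s][strategies[j]] for j in adj[i])

    improved = True
    while improved:
        improved = False
        for i in range(len(adj)):
            best = strategies[i]
            for s in (0, 1):
                if total_payoff(i, s) > total_payoff(i, best):
                    best = s
            if best != strategies[i]:
                strategies[i], improved = best, True
    return strategies  # no player can improve: a pure Nash equilibrium

# Coordination game on the path 0 - 1 - 2: the nodes end up coordinated.
adj = [[1], [0, 2], [1]]
A = [[1, 0], [0, 1]]
print(better_response_dynamics(adj, A, [0, 1, 0]))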

4.
We consider a continuous-time form of repeated matrix games in which player strategies evolve in reaction to opponent actions. Players observe each other's actions but do not have access to the other players' utilities. Strategy evolution may be of the best-response sort, as in fictitious play, or a gradient update. Such mechanisms are known not to converge in general. We introduce a form of "dynamic" fictitious play and gradient play strategy update mechanisms. These mechanisms use derivative action in processing opponent actions and, in some cases, can lead to behavior converging to Nash equilibria in previously nonconvergent situations. We analyze convergence of the dynamic update mechanisms in the cases of exact and approximate derivative measurements. In the ideal case of exact derivative measurements, we show that convergence to Nash equilibrium can always be achieved. In the case of approximate derivative measurements, we derive a characterization of local convergence that shows how the dynamic update mechanisms can converge even when their traditional static counterparts do not. We primarily discuss two-player games, but also outline extensions to multiplayer games. We illustrate these methods with convergent simulations of the well-known Shapley and Jordan counterexamples.
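A schematic way to read the derivative-action idea (the notation here is illustrative, not the paper's): in standard continuous-time gradient play, player 1 updates its mixed strategy $p_1$ against the opponent's current strategy $p_2$ via $\dot p_1 = \Pi_{\Delta}[p_1 + M_1 p_2] - p_1$, where $M_1$ is player 1's payoff matrix and $\Pi_{\Delta}$ denotes projection onto the probability simplex; the dynamic variant reacts instead to the anticipatory signal $r_2 = p_2 + \gamma \dot p_2$ for some $\gamma > 0$, with $\dot p_2$ replaced by an approximate derivative estimate when exact measurements are unavailable.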

5.
We provide a simple learning process that enables an agent to forecast a sequence of outcomes. Our forecasting scheme, termed tracking forecast, is based on tracking the past observations while emphasizing recent outcomes. As opposed to other forecasting schemes, we sacrifice universality in favor of significantly reduced memory requirements. We show that if the sequence of outcomes has certain properties—it has some internal (hidden) state that does not change too rapidly—then the tracking forecast is weakly calibrated, so that the forecast appears to be correct most of the time. For binary outcomes, this result holds without any internal state assumptions. We consider learning in a repeated strategic game where each player attempts to compute some forecast of the opponent's actions and play a best response to it. We show that if one of the players uses a tracking forecast while the other player uses a standard learning algorithm (such as exponential regret matching or smooth fictitious play), then the player using the tracking forecast obtains the best response to the actual play of the other player. We further show that if both players use tracking forecasts, then under certain conditions on the game matrix, convergence to a Nash equilibrium is possible with positive probability for a larger class of games than the class for which smooth fictitious play converges to a Nash equilibrium.
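A minimal sketch of the tracking-forecast-plus-best-response loop described above, assuming a constant step size and treating an exponentially weighted average of the opponent's observed actions as the forecast; names and constants are mine, not the authors'.

import numpy as np

def best_responses_to_tracking_forecast(opponent_moves, payoff, step=0.1):
    """Play a best response to a tracking forecast of the opponent.

    payoff[a, b] is our payoff for action a against opponent action b; the
    forecast weights recent observations more heavily than old ones.
    """
    n_actions = payoff.shape[1]
    forecast = np.full(n_actions, 1.0 / n_actions)  # forecast of the opponent's next action
    responses = []
    for b in opponent_moves:
        responses.append(int(np.argmax(payoff @ forecast)))  # best response to the forecast
        observed = np.eye(n_actions)[b]
        forecast = (1 - step) * forecast + step * observed   # track, emphasizing recent outcomes
    return responses

# Rock-paper-scissors against an opponent drifting toward rock (action 0):
rps = np.array([[0., -1., 1.], [1., 0., -1.], [-1., 1., 0.]])
print(best_responses_to_tracking_forecast([2, 0, 0, 0, 0, 0], rps)[-1])  # -> 1 (paper)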

6.
This paper examines strategic adaptation in participants' behavior conditional on the type of their opponent. Participants played a constant-sum game for 100 rounds against each of three pattern-detecting computer algorithms designed to exploit regularities in human behavior, such as imperfections in randomizing and the use of simple heuristics. Significant evidence is presented that human participants not only change their marginal probabilities of choosing actions, but also their conditional probabilities, which depend on the recent history of play. A cognitive model incorporating pattern recognition is proposed that captures the shifts in strategic behavior of the participants better than the standard non-pattern-detecting model employed in the literature, the Experience Weighted Attraction model (and, by extension, its nested models, reinforcement learning and fictitious play belief learning).

7.
This paper investigates the existence and convergence of weighted Nash equilibria for incomplete-profile networked evolutionary games with multiple payoffs. First, the incomplete-profile networked evolutionary game under the probabilistic myopic best response adjustment rule is transformed into an algebraic form based on the semi-tensor product of matrices. Second, a method for calculating weighted Nash equilibria is presented, and the relationship between weighted Nash equilibria and positive-probability fixed points is derived. Furthermore, a criterion is provided to verify whether the profiles in the feasible profile set converge to the set of weighted Nash equilibria with probability one. Finally, an illustrative example is given to support the new results obtained in this paper.
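For reference, the semi-tensor product used here (and again in items 18 and 20 below) is the standard extension of the matrix product to factors with mismatched dimensions: for $A \in \mathbb{R}^{m \times n}$ and $B \in \mathbb{R}^{p \times q}$, $A \ltimes B = (A \otimes I_{t/n})(B \otimes I_{t/p})$ with $t = \operatorname{lcm}(n, p)$ and $\otimes$ the Kronecker product; when $n = p$ it reduces to the ordinary product $AB$, which is what allows game dynamics to be written in a uniform algebraic form.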

8.
It is well known that the phenomenon of entanglement plays a fundamental role in quantum game theory. Occasionally, games constructed via maximally entangled initial states (MEIS) will have new Nash equilibria yielding the players higher payoffs than the ones they receive in the classical version of the game. When examining these new games for Nash equilibrium payoffs, a fundamental question arises: does a suitable choice of an MEIS improve the lot of the players? In this paper, we show that the answer to this question is yes, at least for a variant of the well-known two-player, two-strategy game of Chicken. To that end, we generalize Landsburg's quaternionic representation of the payoff function for two-player, two-strategy games with maximally entangled initial states to games where the initial state is chosen arbitrarily from a circle of maximally entangled initial states, and for the corresponding quantized games we show the existence of superior Nash equilibrium payoffs when an MEIS is appropriately chosen.

9.
The central result of classical game theory states that every finite normal form game has a Nash equilibrium, provided that players are allowed to use randomized (mixed) strategies. However, in practice, humans are known to be bad at generating random-like sequences, and true random bits may be unavailable. Even if the players have access to enough random bits for a single instance of the game, their randomness might be insufficient if the game is played many times. In this work, we ask whether randomness is necessary for equilibria to exist in finitely repeated games. We show that for a large class of games containing arbitrary two-player zero-sum games, approximate Nash equilibria of the n-stage repeated version of the game exist if and only if both players have Ω(n) random bits. In contrast, we show that there exists a class of games for which no equilibrium exists in pure strategies, yet the n-stage repeated version of the game has an exact Nash equilibrium in which each player uses only a constant number of random bits. When the players are assumed to be computationally bounded, if cryptographic pseudorandom generators (or, equivalently, one-way functions) exist, then the players can base their strategies on “random-like” sequences derived from only a small number of truly random bits. We show that, in contrast, in repeated two-player zero-sum games, if pseudorandom generators do not exist, then Ω(n) random bits remain necessary for equilibria to exist.

10.
In a matrix game, the interactions among players are based on the assumption that each player has accurate information about the payoffs of their interactions and that the other players are rationally self-interested. As a result, the players are expected to adopt Nash equilibrium strategies. However, in real life, when choosing their optimal strategies, players sometimes face missing, imprecise (i.e., interval), or ambiguous lottery payoffs of pure strategy profiles, and even of compound strategy profiles, which makes it hard to determine a Nash equilibrium. To address this issue, in this paper we introduce a new solution concept, called ambiguous Nash equilibrium, which extends the concept of Nash equilibrium to one that can handle these types of ambiguous payoffs. Moreover, we reveal some properties of matrix games of this kind. In particular, we show that a Nash equilibrium is a special case of an ambiguous Nash equilibrium if the players have accurate information about each player's payoff sets. Finally, we give an example to illustrate how our approach deals with real-life game theory problems.

11.
The class of weakly acyclic games, which includes potential games and dominance-solvable games, captures many practical application domains. In a weakly acyclic game, from any starting state, there is a sequence of better-response moves that leads to a pure Nash equilibrium; informally, these are games in which natural distributed dynamics, such as better-response dynamics, cannot enter inescapable oscillations. We establish a novel link between such games and the existence of pure Nash equilibria in subgames. Specifically, we show that the existence of a unique pure Nash equilibrium in every subgame implies the weak acyclicity of a game. In contrast, the possible existence of multiple pure Nash equilibria in every subgame is insufficient for weak acyclicity in general; here, we also systematically identify the special cases (in terms of the number of players and strategies) for which this is sufficient to guarantee weak acyclicity.

12.
In this paper we provide a logical framework for two-person finite games in strategic form, and use it to design a computer program for discovering classes of games that have unique pure Nash equilibrium payoffs. The classes of games that we consider are those that can be expressed by a conjunction of two binary clauses; our program re-discovered Kats and Thisse's class of weakly unilaterally competitive two-person games, and came up with several other classes of games that have unique pure Nash equilibrium payoffs. It also came up with new classes of strict games that have unique pure Nash equilibria, where a game is strict if, for both players, different profiles have different payoffs.

13.
We consider the learning problem faced by two self-interested agents repeatedly playing a general-sum stage game. We assume that the players can observe each other's actions but not the payoffs received by the other player. The concept of Nash equilibrium in repeated games provides an individually rational solution for playing such games and can be achieved by playing the Nash equilibrium strategy of the single-shot game in every iteration. Such a strategy, however, can sometimes lead to a Pareto-dominated outcome in games like the Prisoner's Dilemma. We therefore prefer learning strategies that converge to a Pareto-optimal outcome that also produces a Nash equilibrium payoff for repeated two-player, n-action general-sum games; the Folk Theorem enables us to identify such outcomes. In this paper, we introduce the Conditional Joint Action Learner (CJAL), which learns the conditional probability of an action taken by the opponent given its own actions and uses it to decide its next course of action. We empirically show that under self-play, and if the payoff structure of the Prisoner's Dilemma game satisfies certain conditions, a CJAL learner, using a random exploration strategy followed by a completely greedy exploitation technique, will learn to converge to a Pareto-optimal solution. We also show that such learning generates Pareto-optimal payoffs in a large majority of other two-player general-sum games. We compare the performance of CJAL with that of existing algorithms such as WOLF-PHC and JAL on all structurally distinct two-player conflict games with ordinal payoffs.
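The following is a toy Python sketch of the conditional-probability bookkeeping and the explore-then-exploit schedule that CJAL is described as using; the reciprocating stand-in opponent, the constants, and all names are my assumptions, not the paper's algorithm.

import random

def cjal(payoff, opponent_policy, n_actions=2, explore_rounds=200, rounds=1000):
    """Toy Conditional Joint Action Learner.

    Learns the conditional probability of the opponent's action b given the
    learner's own action a, explores uniformly at random for explore_rounds,
    then greedily plays the action with the highest estimated conditional
    expected payoff.
    """
    counts = [[1] * n_actions for _ in range(n_actions)]  # counts[a][b], Laplace prior
    history = []
    for t in range(rounds):
        if t < explore_rounds:
            a = random.randrange(n_actions)               # random exploration phase
        else:                                             # greedy exploitation phase
            def value(a):
                return sum(counts[a][b] * payoff[a][b] for b in range(n_actions)) / sum(counts[a])
            a = max(range(n_actions), key=value)
        b = opponent_policy(a, history)                   # stand-in for the other learner
        counts[a][b] += 1
        history.append((a, b))
    return history

# Prisoner's Dilemma (0 = cooperate, 1 = defect) against a stand-in that mirrors the
# learner's move. A learner tracking only marginal opponent frequencies would defect;
# the conditional counts reveal that cooperation is reciprocated, so CJAL cooperates.
pd = [[3, 0], [5, 1]]
print(cjal(pd, lambda a, h: a)[-1])  # -> (0, 0)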

14.
In multi-group game systems, finding the noninferior Nash strategies between groups is of critical importance, yet there is still no effective method for solving noninferior Nash strategies analytically for general problems. This paper describes an iterative algorithm that constructs and solves for noninferior Nash strategies using the noninferior reaction sets between groups. To this end, the concepts of the optimal equilibrium value and the optimal equilibrium solution of the cooperative game inside each group are first introduced. Then, by proving that the optimal equilibrium solution is a noninferior solution of the within-group cooperative game associated with some implicit weight vector, a single-objective programming problem for solving the cooperative game is obtained. It is further shown that, inside a group, the solution of this problem is not only noninferior but is also better for every player than the Nash equilibrium strategy obtained without cooperation. Finally, a practical example is given to verify the effectiveness of the algorithm.

15.
We focus on the problem of computing approximate Nash equilibria and well-supported approximate Nash equilibria in random bimatrix games, where each player's payoffs are bounded and independent random variables, not necessarily identically distributed, but with almost common expectations. We show that the completely mixed uniform strategy profile, i.e., the combination of mixed strategies (one per player) in which each player plays each of her available pure strategies with equal probability, is with high probability a $\sqrt{\frac{\ln n}{n}}$-Nash equilibrium and a $\sqrt{\frac{3\ln n}{n}}$-well-supported Nash equilibrium, where n is the number of pure strategies available to each player. This asserts that the completely mixed uniform strategy profile is an almost Nash equilibrium for random bimatrix games, since it is, with high probability, an ε-well-supported Nash equilibrium where ε tends to zero as n tends to infinity.
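A quick empirical illustration of this statement (a sketch that assumes i.i.d. uniform payoffs on [0, 1]; the function and variable names are mine): compare the best-response gain of the completely mixed uniform profile with the $\sqrt{\ln n / n}$ bound.

import numpy as np

def uniform_profile_epsilon(n, seed=0):
    """Best-response gain of the uniform profile in a random n x n bimatrix game."""
    rng = np.random.default_rng(seed)
    A = rng.random((n, n))               # row player's payoffs
    B = rng.random((n, n))               # column player's payoffs
    u = np.full(n, 1.0 / n)              # completely mixed uniform strategy
    eps_row = (A @ u).max() - u @ A @ u  # row player's gain from a best pure deviation
    eps_col = (u @ B).max() - u @ B @ u  # column player's gain from a best pure deviation
    return max(eps_row, eps_col)

for n in (10, 100, 1000):
    print(n, round(uniform_profile_epsilon(n), 3), round(float(np.sqrt(np.log(n) / n)), 3))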

16.

Repeated quantum game theory addresses long-term relations among players who choose quantum strategies. In conventional quantum game theory, single-round quantum games or at most finitely repeated games have been widely studied; however, less is known about infinitely repeated quantum games. Investigating infinitely repeated games is crucial since finitely repeated games do not differ much from single-round games. In this work, we establish the concept of general repeated quantum games and show the Quantum Folk Theorem, which claims that by iterating a game one can find an equilibrium strategy of the game and receive a reward that is not obtained by a Nash equilibrium of the corresponding single-round quantum game. A significant difference between the repeated quantum prisoner's dilemma and the repeated classical prisoner's dilemma is that the classical Pareto optimal solution is not always an equilibrium of the repeated quantum game when entanglement is sufficiently strong. When entanglement is sufficiently strong and the reward is small, mutual cooperation cannot be an equilibrium of the repeated quantum game. In addition, we present several concrete equilibrium strategies for the repeated quantum prisoner's dilemma.


17.
In game theory, the interaction among players obligates each player to develop a belief about the possible strategies of the other players, to choose a best reply given those beliefs, and to look for an adjustment of the best reply and the beliefs using a learning mechanism until they reach an equilibrium point. Usually, the behavior of an individual cost function, when such best-reply strategies are applied, turns out to be non-monotonic, and concluding that such strategies lead to some equilibrium point is a non-trivial task. Even in repeated games, convergence to a stationary equilibrium is not always guaranteed. The best-reply strategies analyzed in this paper represent the most frequent type of behavior applied in practice in problems of bounded rationality of agents considered within the Artificial Intelligence research area. They are naturally related to the so-called fixed-local-optimal actions or, in other words, to the one-step-ahead optimization algorithms widely used in modern Intelligent Systems theory. This paper shows that for an ergodic class of finite controllable Markov games the best-reply strategies necessarily lead to a Lyapunov/Nash equilibrium point. One of the most interesting properties of this approach is that an expedient (or absolutely expedient) behavior of an ergodic system (repeated game) can be represented by a Lyapunov-like function that is non-decreasing in time. We present a method for constructing a Lyapunov-like function: the Lyapunov-like function replaces the recursive mechanism with the elements of the ergodic system that model how players are likely to behave in one-shot games. To show our statement, we first propose a non-converging state-value function that fluctuates (increases and decreases) between states of the Markov game. Then, we prove that it is possible to represent that function in a recursive format using a one-step-ahead fixed-local-optimal strategy. As a result, we prove that a Lyapunov-like function can be built using the previous recursive expression for the Markov game; that is, the resulting Lyapunov-like function is a monotonic function which can only decrease (or remain the same) over time, whatever the initial distribution of probabilities. Consequently, a new concept called Lyapunov games is suggested for a class of repeated games. Lyapunov games allow one to conclude during the game whether the applied strategy provides convergence to an equilibrium point (or not). The time for constructing a Potential (Lyapunov-like) function is exponential. Our algorithm tractably computes Nash, Lyapunov, and correlated equilibria: a Lyapunov equilibrium is a Nash equilibrium, and it is also a correlated equilibrium. The validity of the proposed method is demonstrated both theoretically and practically by a simulated experiment on the Duel game.

18.
刘敏, 王金环 (Liu Min, Wang Jinhuan). 《控制与决策》 (Control and Decision), 2024, 39(2): 545-550.
The demand-side management problem in smart grids is studied using the semi-tensor product of matrices. First, based on the criterion for potential games, the smart-grid demand-side management problem is modeled as a potential game and the corresponding potential function is constructed. Second, when the strategy updating rule is the time-cascaded myopic best response, a pinning control is designed so that the potential game is stabilized at the optimal Nash equilibrium during its evolution. Then, to reduce the control cost in the pinning control design, an algorithm is designed to obtain as few controlled players as possible. Finally, a numerical example verifies the effectiveness of the theoretical results.

19.
Fictitious play has been used effectively to solve extended games. In this work, recursive games of survival are investigated by fictitious play. The game is modelled as a finite state sequential machine with each component game representing a different state.

Two types of algorithms are proposed to generate the desired behaviour strategy solutions. Two examples are investigated and the corresponding solutions compared.
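For readers unfamiliar with the basic iteration, here is a generic Brown-Robinson fictitious play sketch for a zero-sum matrix game; it illustrates the mechanism referred to above but is not the paper's algorithm for recursive games of survival.

import numpy as np

def fictitious_play(A, iterations=10000):
    """Fictitious play for a zero-sum matrix game A (row maximizes, column minimizes)."""
    m, n = A.shape
    row_counts = np.ones(m)  # fictitious prior plays (arbitrary initialization)
    col_counts = np.ones(n)
    for _ in range(iterations):
        row_counts[np.argmax(A @ col_counts)] += 1  # best response to the column's empirical mix
        col_counts[np.argmin(row_counts @ A)] += 1  # best response to the row's empirical mix
    return row_counts / row_counts.sum(), col_counts / col_counts.sum()

# Matching pennies: the empirical frequencies approach (0.5, 0.5) for both players.
print(fictitious_play(np.array([[1., -1.], [-1., 1.]])))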

20.
This paper considers the modeling and convergence of hyper-networked evolutionary games (HNEGs). In an HNEG, the network graph is a hypergraph, which allows the fundamental network game to be a multi-player one. Using the semi-tensor product of matrices and the fundamental evolutionary equation, the dynamics of an HNEG are obtained, and we extend the results on networked evolutionary games to show whether an HNEG is potential and how to calculate the potential. We then propose a new strategy updating rule, called the cascading myopic best response adjustment rule (MBRAR), and prove that under the cascading MBRAR the strategies of an HNEG converge to a pure Nash equilibrium. An example is presented and discussed in detail to demonstrate the theoretical and numerical results.
