首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
Fictitious play is a simple learning algorithm for strategic games that proceeds in rounds. In each round, the players play a best response to a mixed strategy that is given by the empirical frequencies of actions played in previous rounds. There is a close relationship between fictitious play and the Nash equilibria of a game: if the empirical frequencies of fictitious play converge to a strategy profile, this strategy profile is a Nash equilibrium. While fictitious play does not converge in general, it is known to do so for certain restricted classes of games, such as constant-sum games, non-degenerate 2×n games, and potential games. We study the rate of convergence of fictitious play and show that, in all the classes of games mentioned above, fictitious play may require an exponential number of rounds (in the size of the representation of the game) before some equilibrium action is eventually played. In particular, we show the above statement for symmetric constant-sum win-lose-tie games.  相似文献   

2.
We focus on Nash equilibria and Pareto optimal Nash equilibria for a finite horizon noncooperative dynamic game with a special structure of the stage cost. We study the existence of these solutions by proving that the game is a potential game. For the single-stage version of the game, we characterize the aforementioned solutions and derive a consensus protocol that makes the players converge to the unique Pareto optimal Nash equilibrium. Such an equilibrium guarantees the interests of the players and is also social optimal in the set of Nash equilibria. For the multistage version of the game, we present an algorithm that converges to Nash equilibria, unfortunately, not necessarily Pareto optimal. The algorithm returns a sequence of joint decisions, each one obtained from the previous one by an unilateral improvement on the part of a single player. We also specialize the game to a multiretailer inventory system.  相似文献   

3.
We consider a continuous-time form of repeated matrix games in which player strategies evolve in reaction to opponent actions. Players observe each other's actions, but do not have access to other player utilities. Strategy evolution may be of the best response sort, as in fictitious play, or a gradient update. Such mechanisms are known to not necessarily converge. We introduce a form of "dynamic" fictitious and gradient play strategy update mechanisms. These mechanisms use derivative action in processing opponent actions and, in some cases, can lead to behavior converging to Nash equilibria in previously nonconvergent situations. We analyze convergence in the case of exact and approximate derivative measurements of the dynamic update mechanisms. In the ideal case of exact derivative measurements, we show that convergence to Nash equilibrium can always be achieved. In the case of approximate derivative measurements, we derive a characterization of local convergence that shows how the dynamic update mechanisms can converge if the traditional static counterparts do not. We primarily discuss two player games, but also outline extensions to multiplayer games. We illustrate these methods with convergent simulations of the well known Shapley and Jordan counterexamples.  相似文献   

4.
This paper proposes an object-level rate control algorithm to jointly controlling the bit rates of multiple video objects. Utilizing noncooperative game theory, the proposed rate control algorithm mimics the behaviors of players representing video objects. Each player competes for available bits to optimize its visual quality. The algorithm finds an “optimal solution” in that it conforms to the mixed strategy Nash equilibrium, which is the probability distribution of the actions carried by the players that maximizes their expected payoffs (the number of bits). The game is played iteratively, and the expected payoff of each play is accumulated. The game terminates when all of the available bits for the specific time instant have been distributed to video object planes (VOPs). The advantage of the proposed scheme is that the bidding objects divide the bits among themselves automatically and fairly, according to their encoding complexity, and with an overall solution that is strategically optimal under the given circumstances. To minimize buffer fluctuation and avoid buffer overflow and underflow, a proportional-integral-derivative (PID) control based buffer policy is utilized.   相似文献   

5.
We provide a simple learning process that enables an agent to forecast a sequence of outcomes. Our forecasting scheme, termed tracking forecast, is based on tracking the past observations while emphasizing recent outcomes. As opposed to other forecasting schemes, we sacrifice universality in favor of a significantly reduced memory requirements. We show that if the sequence of outcomes has certain properties—it has some internal (hidden) state that does not change too rapidly—then the tracking forecast is weakly calibrated so that the forecast appears to be correct most of the time. For binary outcomes, this result holds without any internal state assumptions. We consider learning in a repeated strategic game where each player attempts to compute some forecast of the opponent actions and play a best response to it. We show that if one of the players uses a tracking forecast, while the other player uses a standard learning algorithm (such as exponential regret matching or smooth fictitious play), then the player using the tracking forecast obtains the best response to the actual play of the other players. We further show that if both players use tracking forecast, then under certain conditions on the game matrix, convergence to a Nash equilibrium is possible with positive probability for a larger class of games than the class of games for which smooth fictitious play converges to a Nash equilibrium.  相似文献   

6.
The Nash and Stackelberg strategies of a nonzero sum game have the common property that they are both noncooperative equilibrium solutions for which no player can achieve an improvement in his performance if he attempts to deviate from his strategy (cheat). In this note we show that the Nash solution is desirable only if it is not dominated by any of the Stackelberg solutions. Otherwise a Stackelberg strategy is always more favorable to both players and, as the Nash solution, it can be enforced once an agreement between the players, specifying the leader and the follower, is reached.  相似文献   

7.
Networked noncooperative games are investigated, where each player (or agent) plays with all other players in its neighborhood. Assume the evolution is based on the fact that each player uses its neighbors' current information to decide its next strategy. By using sub-neighborhood, the dynamics of the evolution is obtained. Then a method for calculating Nash equilibriums from mixed strategies of multi-players is proposed. The relationship between local Nash equilibriums based on individual neighborhoods and global Nash equilibriums of overall network is revealed. Then a technique is proposed to construct Nash equilibriums of an evolutionary game from its one step static Nash equilibriums. The basic tool of this approach is the semi-tensor product of matrices, which converts strategies into logical matrices and payoffs into pseudo-Boolean functions, then networked evolutionary games become discrete time dynamic systems.   相似文献   

8.
We consider the learning problem faced by two self-interested agents repeatedly playing a general-sum stage game. We assume that the players can observe each other’s actions but not the payoffs received by the other player. The concept of Nash Equilibrium in repeated games provides an individually rational solution for playing such games and can be achieved by playing the Nash Equilibrium strategy for the single-shot game in every iteration. Such a strategy, however can sometimes lead to a Pareto-Dominated outcome for games like Prisoner’s Dilemma. So we prefer learning strategies that converge to a Pareto-Optimal outcome that also produces a Nash Equilibrium payoff for repeated two-player, n-action general-sum games. The Folk Theorem enable us to identify such outcomes. In this paper, we introduce the Conditional Joint Action Learner (CJAL) which learns the conditional probability of an action taken by the opponent given its own actions and uses it to decide its next course of action. We empirically show that under self-play and if the payoff structure of the Prisoner’s Dilemma game satisfies certain conditions, a CJAL learner, using a random exploration strategy followed by a completely greedy exploitation technique, will learn to converge to a Pareto-Optimal solution. We also show that such learning will generate Pareto-Optimal payoffs in a large majority of other two-player general sum games. We compare the performance of CJAL with that of existing algorithms such as WOLF-PHC and JAL on all structurally distinct two-player conflict games with ordinal payoffs.  相似文献   

9.
In a matrix game, the interactions among players are based on the assumption that each player has accurate information about the payoffs of their interactions and the other players are rationally self‐interested. As a result, the players should definitely take Nash equilibrium strategies. However, in real‐life, when choosing their optimal strategies, sometimes the players have to face missing, imprecise (i.e., interval), ambiguous lottery payoffs of pure strategy profiles and even compound strategy profile, which means that it is hard to determine a Nash equilibrium. To address this issue, in this paper we introduce a new solution concept, called ambiguous Nash equilibrium, which extends the concept of Nash equilibrium to the one that can handle these types of ambiguous payoff. Moreover, we will reveal some properties of matrix games of this kind. In particular, we show that a Nash equilibrium is a special case of ambiguous Nash equilibrium if the players have accurate information of each player's payoff sets. Finally, we give an example to illustrate how our approach deals with real‐life game theory problems.  相似文献   

10.
查旭  左斌  胡云安 《控制与决策》2006,21(10):1167-1171
针对如何解算n人非合作的动态博弈对策中的纳什均衡解问题,提出一种利用退火回归神经网络极值搜索算法解算纳什均衡解的方法.在动态博弈对策问题中,将每个竞争者视为一个代价函数,利用此算法可以使每个代价函数均收敛于其最小值,从而获得此对策的纳什均衡解.此算法不限制代价函数的具体形式,同时由于摒弃了正弦激励信号,解决了一般极值搜索算法中存在的输出量“颤动”现象和控制量来回切换问题,改善了系统的动态性能.  相似文献   

11.
We consider properties of constrained games, where the strategy set available to a player depends on the choice of strategies made by other players. We show that the utilities of each player associated with that player's own performance and constraints are not sufficient to model a constrained game and to define equilibria; for the latter, one also needs to model how a player values the fact that other players meet their constraints. We study three different approaches to other players' constraints, and show that they exhibit completely different equilibrium behaviors. Further, we study a general class of stochastic games with partial information, and focus on the case where the players are indifferent to whether the constraints of other players hold.  相似文献   

12.
It is well-known that the phenomenon of entanglement plays a fundamental role in quantum game theory. Occasionally, games constructed via maximally entangled initial states (MEIS) will have new Nash equilibria yielding to the players higher payoffs than the ones they receive in the classical version of the game. When examining these new games for Nash equilibrium payoffs, a fundamental question arises; does a suitable choice of an MEIS improve the lot of the players? In this paper, we show that the answer to this question is yes for at least the case of a variant of the well-known two player, two strategy game of Chicken. To that end, we generalize Landsburg’s quaternionic representation of the payoff function of two player, two strategy maximally entangled states to games where the initial state is chosen arbitrarily from a circle of maximally entangled initial states and for the corresponding quantized games show the existence of superior Nash equilibrium payoffs when an MEIS is appropriately chosen.  相似文献   

13.
We consider a continuous-time version of fictitious play (FP), in which interacting players evolve their strategies in reaction to their opponents' actions without knowledge of their opponents' utilities. It is known that FP need not converge, but that convergence is possible in certain special cases including zero-sum games, identical interest games, and two-player/two-move games. We provide a unified proof of convergence in all of these cases by showing that a Lyapunov function previously introduced for zero-sum games also can establish stability in the other special cases. We go on to consider a two-player game in which only one player has two-moves and use properties of planar dynamical systems to establish convergence.  相似文献   

14.
We consider congestion games with linear latency functions in which each player is aware only of a subset of all the other players. This is modeled by means of a social knowledge graph G in which nodes represent players and there is an edge from i to j if i knows j. Under the assumption that the payoff of each player is affected only by the strategies of the adjacent ones, we first give a complete characterization of the games possessing pure Nash equilibria. Namely, if the social graph G is undirected, the game is an exact potential game and thus isomorphic to a classical congestion game. As a consequence, it always converges and possesses Nash equilibria. On the other hand, if G is directed an equilibrium is not guaranteed to exist, but the game is always convergent and an equilibrium can be found in polynomial time if G is acyclic, even if finding the best equilibrium remains an intractable problem.  相似文献   

15.
We study a two‐player two‐fare‐class static single‐period capacity allocation game with complete information. Nonnested (partitioned) booking limit policies are investigated in both noncooperative and cooperative situations. We show the existence of unique Nash equilibrium in the noncooperative situation. In the cooperative game, we analyze the cost saving of the two players and investigate the concavity of the objective function. For both noncooperative and cooperative settings, we assume the demands to be a truncated normal distribution and provide a comprehensive sensitivity analysis to discover the effects of unit revenue, rejection cost, and transfer rate on the equilibrium solution. Our numerical experiments show that the nonnested model can be a good approximation to the nested booking limit model. For the cooperative setting, we identify conditions that give rise to improvements in the total system revenue. Finally, under each game‐theoretic setting, we present the managerial implications of our solutions along with numerical examples.  相似文献   

16.
针对多用户多优先级网络系统的管理问题,利用对策论中的Nash平衡和激励Stackelberg策略等相关概念,提出了理想状态下的激励价控策略设计.在系统的动态平衡状态下,利用信息量的瞬时变化率及用户与平衡点的偏离,给出了非线性交叉干扰的多激励价控策略,加强了用户与网络管理者的合作性,激励和引导非合作用户选取对系统整体有益的服务请求,以提高网络资源的利用率.  相似文献   

17.
In this paper, a Cournot game in an oligopolistic market with incomplete information is considered. The market consists of some producers that compete for getting higher payoffs. For optimal decision making, each player needs to estimate its rivals’ behaviors. This estimation is carried out using linear regression and recursive weighted least-squares method. As the information of each player about its rivals increases during the game, its estimation of their reaction functions becomes more accurate. Here, it is shown that by choosing appropriate regressors for estimating the strategies of other players at each time-step of the market and using them for making the next step decision, the game will converge to its Nash equilibrium point. The simulation results for an oligopolistic market show the effectiveness of the proposed method.  相似文献   

18.
Unlike standard congestion games, weighted congestion games and congestion games with player-specific delay functions do not necessarily possess pure Nash equilibria. It is known, however, that there exist pure equilibria for both of these variants in the case of singleton congestion games, i.e., if the players’ strategy spaces contain only sets of cardinality one. In this paper, we investigate how far such a property on the players’ strategy spaces guaranteeing the existence of pure equilibria can be extended. We show that both weighted and player-specific congestion games admit pure equilibria in the case of matroid congestion games, i.e., if the strategy space of each player consists of the bases of a matroid on the set of resources. We also show that the matroid property is the maximal property that guarantees pure equilibria without taking into account how the strategy spaces of different players are interweaved.  相似文献   

19.
We study the computational complexity of problems involving equilibria in strategic games and in perfect information extensive games when the number of players is large. We consider, among others, the problems of deciding the existence of a pure Nash equilibrium in strategic games or deciding the existence of a pure Nash or a subgame perfect Nash equilibrium with a given payoff in finite perfect information extensive games. We address the fundamental question of how can we represent a game with a large number of players? We propose three ways of representing a game with different degrees of succinctness for the components of the game. For perfect information extensive games we show that when the number of moves of each player is large and the input game is represented succinctly these problems are PSPACE-complete. In contraposition, when the game is described explicitly by means of its associated tree all these problems are decidable in polynomial time. For strategic games we show that the complexity of deciding the existence of a pure Nash equilibrium depends on the succinctness of the game representation and then on the size of the action sets. In particular we show that it is NP-complete, when the number of players is large and the number of actions for each player is constant, and that the problem is -complete when the number of players is a constant and the size of the action sets is exponential in the size of the game representation. Again when the game is described explicitly the problem is decidable in polynomial time.  相似文献   

20.
In this paper, we apply the game theory to study some strategic actions for retailers to fight a price war. We start by modeling a noncooperative pure pricing game among multiple competing retailers who sell a certain branded product under price-dependent stochastic demands. A unique Nash equilibrium is proven to exist under some mild conditions. We demonstrate mathematically the incentives for retailers to start a price war. Based on a strategic framework via the game theory, we illustrate the use of service level to build price walls which can prevent a huge drop in price, as well as profit. Three kinds of price walls are proposed, and the respective strengths and weaknesses have been studied. Analytical conditions, under which a price wall can effectively prevent big drops in both market share and profit, are developed. Aside from the proposed price walls, two other pricing strategies, which can lead to an all-win situation, are examined.   相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号