Similar literature
1.
This paper discusses team-optimal closed-loop Stackelberg strategies for systems with slow and fast modes. It is established that the cost functions of the players in the pure slow and the full-order games have the same value in the limit as the small singular perturbation parameter tends to zero. It is shown that if the leader bases the design of his approximate strategy on the slow subsystem, while the follower bases his design on the full-order system, then the resulting solution is ill-posed. Moreover, if the fast information is incorporated in the approximate strategy of the leader, then it is shown that the singular perturbation technique of constructing approximate strategies by composing the slow and fast strategies is ill-posed and cannot be used in this problem. A new design methodology to construct approximate Stackelberg strategies by solving reduced-order problems, which have the same information structure as the full-order one, is presented. It is shown that the conditions for existence and uniqueness of the solution of the full-order problem can be established through those conditions of the reduced-order problems. Finally, it is proved that the approximate strategies, besides being team near-optimal, possess the asymptotic Stackelberg property.

2.
It is shown that continuous-kernel nonzero-sum games with compact strategy spaces could admit both pure and mixed Stackelberg equilibrium solutions, if the cost function of each player is either nonquadratic or nonconvex in his own decision variable. In such a case, the mixed Stackelberg strategy will yield a lower average cost for the leader than the pure Stackelberg strategy. It is also verified that, if the cost functions of the players are quadratic and strictly convex, then only pure Stackelberg strategies can exist.

3.
A substantial effort has been devoted to various incentive Stackelberg solution concepts. Most of these concepts work well in the sense that the leader can get his desired solution in the end. Yet, most incentive strategies developed thus far include either the follower's control, which may not be realistic in practice, or delays in the state, which make stabilization more difficult to achieve. In this paper, we obtain the team-optimal state feedback Stackelberg strategy (with no delays) of an important class of discrete-time two-person nonzero-sum dynamic games characterized by linear state dynamics and quadratic cost functionals.

4.
An optimal incentive strategy is defined as one by which the leader suffers the least losses in punishing the follower's deviation from the decision desired by the leader. Sufficient conditions for the existence of an optimal incentive strategy are given. Static and dynamic leader-follower games with quadratic cost functionals are investigated. It is shown that leader-follower games with quadratic cost functionals admit optimal incentive strategies and that, when the follower's decision variable is scalar, there exists a unique linear optimal incentive strategy. Such an incentive strategy can be explicitly determined.

5.
The paper proposes a formulation of ε-Stackelberg and Stackelberg strategies for a large class of dynamic closed-loop games, discusses the interpretation of the leader's strategy as the formalization of the intuitive notion of incentives or threats, and considers limitations of the Stackelberg solution concept which, within the dynamic context, is applicable only in situations where the realization of a leader's strategy is ensured by a binding contract. The solution method is based on the idea of discontinuous strategies, assuming that the leader punishes the follower by minimizing his payoff if the latter does not comply with the policy selected for him by the leader.

6.
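This paper addresses the bargaining problem in dynamic games. It proposes the concept of an induced equilibrium at the bargaining solution of a dynamic game, studies necessary conditions and sufficient conditions for the existence of an induced equilibrium, and analyzes the induced equilibrium of linear-quadratic dynamic games.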

7.
The derivation is given of the closed-loop Stackelberg strategy for a class of continuous three-player non-zero-sum differential games using the idea of the team optimal Stackelberg strategy. The game systems are described by linear state dynamics and quadratic objective functions. First, a definition of the three-player hierarchical equilibrium is given, then a general theorem which was developed by Basar (1981) is examined and applied to the game under consideration to deduce some sufficient conditions for the solution of the game to exist. A simple example is given.

8.
Automatica, 1985, 21(5): 575-584
In this paper we consider a general class of stochastic incentive decision problems in which the leader has access to the control value of the follower and to private as well as common information on the unknown state of nature. The follower's cost function depends on a finite number of parameters whose values are not known accurately by the leader, and in spite of this parametric uncertainty the leader seeks a policy which would induce the desired behavior on the follower. We obtain such policies for the leader, which are smooth, induce the desired behavior at the nominal values of these parameters, and furthermore make the follower's optimal reaction either minimally sensitive or totally insensitive to variations in the values of these parameters from the nominals. The general solution is determined by some orthogonality relations in some appropriately constructed (probability) measure spaces, and leads to particularly simple incentive policies. The features presented here are intrinsic to stochastic decision problems and have no counterparts in deterministic incentive problems.

9.
This paper deals with a class of many-person non-zero-sum differential games in which one player has the role of 'leader' while the others 'follow'. Necessary conditions are obtained for the existence of open-loop Stackelberg solutions under the assumption that the followers respond to the leader by selecting Nash equilibrium controls. Some simple investment problems are described which give rise to discontinuous optimal controls for both leader and follower(s).

10.
The central result of classical game theory states that every finite normal form game has a Nash equilibrium, provided that players are allowed to use randomized (mixed) strategies. However, in practice, humans are known to be bad at generating random-like sequences, and true random bits may be unavailable. Even if the players have access to enough random bits for a single instance of the game, their randomness might be insufficient if the game is played many times. In this work, we ask whether randomness is necessary for equilibria to exist in finitely repeated games. We show that for a large class of games containing arbitrary two-player zero-sum games, approximate Nash equilibria of the n-stage repeated version of the game exist if and only if both players have Ω(n) random bits. In contrast, we show that there exists a class of games for which no equilibrium exists in pure strategies, yet the n-stage repeated version of the game has an exact Nash equilibrium in which each player uses only a constant number of random bits. When the players are assumed to be computationally bounded, if cryptographic pseudorandom generators (or, equivalently, one-way functions) exist, then the players can base their strategies on "random-like" sequences derived from only a small number of truly random bits. We show that, in contrast, in repeated two-player zero-sum games, if pseudorandom generators do not exist, then Ω(n) random bits remain necessary for equilibria to exist.

11.
Both Stackelberg games and Nash games play extremely important roles in such fields as economics, management, politics and behavioral sciences. A Stackelberg game can be modelled as a bilevel optimization problem. Static multi-leader-follower optimization problems were first proposed by Pang and Fukushima. In this article, a discrete-time dynamic version of multi-leader-follower games with feedback information is given and analyzed. There are two major contributions in this article. First, based on the multi-leader-follower games, discrete-time dynamic multi-leader-follower games are proposed. Second, dynamic programming algorithms are presented to solve discrete-time dynamic multi-leader-follower games with multiple players under a feedback information structure for dependent followers.

12.
In this paper we consider a stochastic incentive decision problem with N > 1 followers and decentralized static information, where the leader's dynamic information comprises only a linear combination of the followers' actions. We obtain an incentive policy, affine in this dynamic information, which yields the same overall performance as the one the leader would obtain if he had observed the followers' actions separately. The existence conditions involved have been obtained explicitly for the case of finite probability spaces, and some challenging issues have been identified when the random variables are infinite valued. The results presented here have no counterparts in deterministic incentive problems.

13.
We consider the problem of finding equilibria in games with three agents on an oligopolistic market with a linear demand function and nonlinear agent cost functions. Under strategic reflexion of the agents regarding the presence of a Stackelberg leader (leaders) of the first and second levels, we obtain expressions for information equilibria. Modeling real agent costs and demand functions of the Russian telecommunication market has allowed us to construct a set of information equilibria, which we have compared with parameters of the real market, showing the presence of reflexion of the first and second ranks.

14.
This paper is concerned with the derivation of closed-loop Stackelberg (CLS) solutions of a class of continuous-time two-player nonzero-sum differential games characterized by linear state dynamics and quadratic cost functionals. Explicit conditions are obtained for both the finite and infinite horizon problems under which the CLS solution is a representation of the optimal feedback solution of a related team problem which is defined as the joint minimization of the leader's cost function. First, a specific class of representations is considered which depend linearly on the current and initial values of the state, and then the results are extended to encompass a more general class of linear strategies that also incorporate the whole past trajectory. The conditions obtained all involve solutions of linear matrix equations and are amenable to computational analysis for explicit determination of CLS strategies.

15.
Many contemporary computer games, notably action and role-playing games, represent an interesting class of navigation-intensive dynamic real-time simulations inhabited by autonomous intelligent virtual agents (IVAs). Although higher level reasoning of IVAs in these domains seems suited for action planning, planning is not widely adopted in existing games and similar applications. Moreover, a statistically rigorous study measuring the performance of planners in decision making in a game-like domain is missing. Here, five classical planners were connected to the virtual environment of Unreal Development Kit along with a planner for delete-free domains (only positive preconditions and positive effects). Performance of IVAs employing those planners and IVAs with reactive architecture was measured on a class of game-inspired test environments of various sizes and under different levels of external interference. The analysis has shown that planning agents outperform reactive agents if (i) the size of the problem is small or if (ii) the environment changes are either hostile to the agent or infrequent. In delete-free domains, specialized approaches are inferior to classical planners because the lower expressivity of delete-free domains results in lower plan quality. These results can help to determine when planning is advantageous in games and for IVA control in other dynamic real-time environments.

16.
In many social phenomena, there exist multiple leaders and followers, and the positions of leaders and followers vary at each stage. Based on dynamic Stackelberg games with leaders in turn, this paper develops and characterizes discrete-time dynamic multi-leader–follower games with leaders in turn. To simplify the problem, all players in this game are divided into some fixed groups and all groups act as leaders in turn. A dynamic programming algorithm is employed to solve this model under a feedback information structure.

17.
The Nash and Stackelberg strategies of a nonzero-sum game have the common property that they are both noncooperative equilibrium solutions for which no player can achieve an improvement in his performance if he attempts to deviate from his strategy (cheat). In this note we show that the Nash solution is desirable only if it is not dominated by any of the Stackelberg solutions. Otherwise a Stackelberg strategy is always more favorable to both players and, like the Nash solution, it can be enforced once an agreement between the players, specifying the leader and the follower, is reached.
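The dominance situation described in this abstract can be illustrated on a small finite game. The sketch below is not taken from the note: the 2x2 cost matrices `J1` and `J2` are hypothetical, both players are assumed to minimize cost, player 1 is assumed to be the leader, and ties in the follower's reaction are assumed to be resolved in the leader's favour. It enumerates the pure Nash equilibria and the pure Stackelberg solution so the two can be compared.

```python
from itertools import product

# Hypothetical cost matrices (both players MINIMIZE).
# Rows are player 1's decisions, columns are player 2's decisions.
J1 = [[3, 1], [4, 2]]  # costs of player 1 (assumed leader)
J2 = [[3, 4], [5, 2]]  # costs of player 2 (assumed follower)


def pure_nash(J1, J2):
    """Return all pure-strategy Nash equilibria (i, j) of the cost game."""
    rows, cols = range(len(J1)), range(len(J1[0]))
    return [
        (i, j)
        for i, j in product(rows, cols)
        # Player 1 cannot lower his cost by changing row, given column j.
        if all(J1[i][j] <= J1[k][j] for k in rows)
        # Player 2 cannot lower his cost by changing column, given row i.
        and all(J2[i][j] <= J2[i][l] for l in cols)
    ]


def stackelberg_leader1(J1, J2):
    """Return the pure Stackelberg solution (i, j) with player 1 as leader."""
    rows, cols = range(len(J1)), range(len(J1[0]))

    def reaction(i):
        # Follower's cost-minimizing column for announced row i;
        # ties are broken in the leader's favour (an assumption).
        return min(cols, key=lambda j: (J2[i][j], J1[i][j]))

    # The leader picks the row whose induced reaction costs him least.
    i_star = min(rows, key=lambda i: J1[i][reaction(i)])
    return i_star, reaction(i_star)


nash = pure_nash(J1, J2)
stack = stackelberg_leader1(J1, J2)
```

For these particular numbers the only pure Nash equilibrium is `(0, 0)` with costs (3, 3), while the Stackelberg solution is `(1, 1)` with costs (2, 2), so in this toy game both players are better off once the leader role is agreed on.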

18.
The authors develop a new iterative approach toward the solution of a class of two-agent dynamic stochastic teams with nonclassical information when the coupling between the agents is weak, either through the state dynamics or through the information channel. In each case, the weak coupling is characterized in terms of a small (perturbation) parameter. When this parameter value (say, ε) is set equal to zero, the original fairly complex dynamic team, with a nonclassical information pattern, is decomposed into or converted to relatively simple stochastic control or team problems, the solution of which makes up the zeroth-order approximation (in a function space) to the team-optimal solution of the original problem. The fact that the zeroth-order solution approximates the optimal cost up to at least O(ε) is shown. It is also shown that approximations of all orders can be obtained by solving a sequence of stochastic control and/or simpler team problems.

19.
How do we build algorithms for agent interactions with human adversaries? Stackelberg games are natural models for many important applications that involve human interaction, such as oligopolistic markets and security domains. In Stackelberg games, one player, the leader, commits to a strategy and the follower makes her decision with knowledge of the leader's commitment. Existing algorithms for Stackelberg games efficiently find optimal solutions (leader strategy), but they critically assume that the follower plays optimally. Unfortunately, in many applications, agents face human followers (adversaries) who, because of their bounded rationality and limited observation of the leader strategy, may deviate from their expected optimal response. In other words, human adversaries' decisions are biased due to their bounded rationality and limited observations. Not taking into account these likely deviations when dealing with human adversaries may cause an unacceptable degradation in the leader's reward, particularly in security applications where these algorithms have seen deployment. The objective of this paper therefore is to investigate how to build algorithms for agent interactions with human adversaries. To address this crucial problem, this paper introduces a new mixed-integer linear program (MILP) for Stackelberg games to consider human adversaries, incorporating: (i) novel anchoring theories on human perception of probability distributions and (ii) robustness approaches for MILPs to address human imprecision. Since this new approach considers human adversaries, traditional proofs of correctness or optimality are insufficient; instead, it is necessary to rely on empirical validation.
To that end, this paper considers four settings based on real deployed security systems at Los Angeles International Airport (Pita et al., 2008 [35]), and compares 6 different approaches (three based on our new approach and three previous approaches), in 4 different observability conditions, involving 218 human subjects playing 2960 games in total. The final conclusion is that a model which incorporates both the ideas of robustness and anchoring achieves statistically significantly higher rewards and also maintains equivalent or faster solution speeds compared to existing approaches.
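The baseline model this abstract starts from, a leader committing to a (possibly randomized) strategy and a perfectly rational follower responding to it, can be sketched for a 2x2 game. This is only an illustration of that baseline, not the paper's MILP, and it does not model anchoring or robustness. The reward matrices are hypothetical, the function name `strong_stackelberg_2x2` is made up for this sketch, and ties in the follower's response are assumed to be broken in the leader's favour. Because each player's expected reward is linear in the leader's mixing probability, it is enough to check the two pure commitments and the point where the follower is indifferent.

```python
from fractions import Fraction

# Hypothetical reward matrices (both players MAXIMIZE).
# Entry [i][j]: leader plays row i, follower plays column j.
LEADER = [[2, 4], [1, 3]]
FOLLOWER = [[1, 0], [0, 2]]


def strong_stackelberg_2x2(leader, follower):
    """Return (p, j, value): the leader plays row 0 with probability p,
    the follower best-responds with column j, and value is the leader's
    expected reward. Follower ties are broken in the leader's favour."""
    candidates = [Fraction(0), Fraction(1)]

    # Follower's expected reward for column j is
    # follower[1][j] + p * (follower[0][j] - follower[1][j]),
    # so the two columns tie where these two lines intersect.
    slope_gap = (follower[0][0] - follower[1][0]) - (follower[0][1] - follower[1][1])
    if slope_gap != 0:
        p_tie = Fraction(follower[1][1] - follower[1][0], slope_gap)
        if 0 <= p_tie <= 1:
            candidates.append(p_tie)

    best = None
    for p in candidates:
        f_vals = [p * follower[0][j] + (1 - p) * follower[1][j] for j in range(2)]
        l_vals = [p * leader[0][j] + (1 - p) * leader[1][j] for j in range(2)]
        top = max(f_vals)
        # Among the follower's best responses, pick the one the leader prefers.
        j = max((c for c in range(2) if f_vals[c] == top), key=lambda c: l_vals[c])
        if best is None or l_vals[j] > best[2]:
            best = (p, j, l_vals[j])
    return best


result = strong_stackelberg_2x2(LEADER, FOLLOWER)
```

With these numbers the best pure commitment gives the leader a reward of 3, while committing to row 0 with probability 2/3 gives 11/3. The result relies on the follower responding exactly optimally at the indifference point, which is the kind of assumption the abstract argues may fail with human adversaries.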

20.
A computer package called POREM is presented for policy optimisation of linear dynamic, continuous-time models with constant coefficients and rational expectations of future events, based on infinite horizons and quadratic preferences. It is possible to calculate cooperative, decentralised Nash and decentralised Stackelberg outcomes, and for each outcome it is possible to allow for pre-commitment and for lack of pre-commitment vis-à-vis private sector agents. It is possible to allow for hierarchical games, that is, to allow for a group of Stackelberg leaders and a group of Stackelberg followers. The input of the model is very user-friendly and can be done with the aid of mnemonics. The package is programmed in FORTRAN77 and a single-precision version is available for personal computers.
