Similar literature
1.
B. Tołwiński 《Automatica》1982,18(4):431-441
The paper proposes an equilibrium solution concept for dynamic games where players can communicate with one another, but cannot make contractual agreements. In such games, unlike the static problems without contracting possibilities, the cooperation between players is possible due to the fact that the realization of negotiated agreements can be enforced by suitably defined strategies. The definition presented combines dynamic programming, the theory of bargaining and the notion of enforceable agreements to produce a class of cooperative solutions defined in the form of memory Nash equilibria satisfying the principle of optimality along the equilibrium trajectory. The choice of a particular solution in this class depends on players' expected actions in case of disagreement, and on an adopted negotiation scheme formalized in the form of a bargaining model. Possible formulations of disagreement policies and bargaining models are discussed in some detail.

2.
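How to evaluate and analyze information security technologies has become a current research hotspot. Based on an attack-defense game model, this paper analyzes a security system composed of a firewall and an intrusion detection system, and derives the mixed-strategy Nash equilibrium of the stage game. Building on the stage-game analysis, the concept of repeated games is introduced to carry out a multi-stage dynamic game analysis of the model. The study shows that the configuration of information security technologies directly affects the behavior of both attacker and defender, and that the discount factor is closely related to the intrusion probability. From the defender's perspective, accurate prediction of the intrusion probability has an important influence on its choice of strategy. The defender should therefore actively record, analyze, and quantify the methods, targets, number, and types of attacks, and optimize the configuration accordingly, which will effectively improve the utility of the information security technologies deployed.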
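
As a rough illustration of the stage-game step described in entry 2, the sketch below computes the interior mixed-strategy Nash equilibrium of a generic 2x2 bimatrix game from the indifference conditions. The function name `mixed_nash_2x2` and the payoff numbers are hypothetical and are not taken from the paper; the rows stand for a defender who either inspects or does not, and the columns for an attacker who either attacks or does not.

```python
from fractions import Fraction

def mixed_nash_2x2(A, B):
    """Interior mixed-strategy Nash equilibrium of a 2x2 bimatrix game.

    A[i][j] is the row player's payoff and B[i][j] the column player's
    payoff when row i and column j are played. Returns (p, q), where p is
    the probability of row 0 and q the probability of column 0, or None
    if no strictly mixed equilibrium exists.
    """
    # The row player mixes so that the column player is indifferent.
    denom_p = B[0][0] - B[1][0] - B[0][1] + B[1][1]
    # The column player mixes so that the row player is indifferent.
    denom_q = A[0][0] - A[0][1] - A[1][0] + A[1][1]
    if denom_p == 0 or denom_q == 0:
        return None
    p = Fraction(B[1][1] - B[1][0], denom_p)
    q = Fraction(A[1][1] - A[0][1], denom_q)
    if not (0 < p < 1 and 0 < q < 1):
        return None
    return p, q

# Hypothetical payoffs (assumption, for illustration only).
# Rows: defender inspects / does not inspect.
# Columns: attacker attacks / does not attack.
defender = [[2, -1], [-5, 0]]
attacker = [[-3, 0], [4, 0]]

print(mixed_nash_2x2(defender, attacker))
```

With these assumed payoffs the defender inspects with probability 4/7 and the attacker attacks with probability 1/8. Changing the payoff entries shows how the configuration of the defense shifts the equilibrium intrusion probability.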

3.
This paper recasts the Friesz et al. (1993) measure-theoretic model of dynamic network user equilibrium as a controlled variational inequality problem involving Riemann integrals. This restatement is done to make the model and its foundations accessible to a wider audience by removing the need to have a background in functional analysis. Our exposition is dependent on previously unavailable necessary conditions for optimal control problems with state-dependent time lags. These necessary conditions, derived in an Appendix, are employed to show that a particular variational inequality control problem has solutions that are dynamic network user equilibria. Our analysis also shows that use of proper flow propagation constraints obviates the need to explicitly employ the arc exit time functions that have complicated numerical implementations of the Friesz et al. (1993) model heretofore. We close by describing the computational implications of numerically determining dynamic user equilibria from formulations based on state-dependent time lags.

4.
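Research on dynamic spectrum allocation based on game theory   Total citations: 1, self-citations: 1, citations by others: 1
An improved game model for dynamic spectrum allocation is proposed. The existing spectrum pricing function is modified so that, provided the licensed user is satisfied with the unit spectrum price, the spectrum price depends on the amount of bandwidth offered by the licensed user and the amount of spectrum demanded by the secondary users. In addition, a spectrum substitutability parameter is included in the secondary users' utility function, and the effects of this parameter and of channel quality on the secondary users' dynamic game and on their reaching a Nash equilibrium are analyzed. Finally, the secondary users' competition for spectrum is analyzed with both a static game and a dynamic game, and simulations verify that the secondary users' strategies eventually converge to the Nash equilibrium.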

5.
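Research on proactive network security defense based on non-cooperative dynamic games   Total citations: 5, self-citations: 0, citations by others: 5
Most current game-based proactive defense techniques for network security use static games. To address the inability of static games to cope with dynamic changes in the attacker's intent and strategy, a complete-information dynamic game model for proactive defense is proposed, based on non-cooperative, non-zero-sum dynamic game theory. The network attack-defense graph is converted into an attack-defense game tree by means of "virtual nodes", and attack-defense game algorithms are given for the complete-information and incomplete-information scenarios respectively. Theoretical analysis and experiments show that the algorithms...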

6.
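To address bandwidth allocation in heterogeneous wireless networks, the inter-network bandwidth allocation model is transformed into a non-cooperative game model, and a dynamic bandwidth allocation algorithm based on non-cooperative game theory (NCRA) is proposed. First, based on users' bandwidth demands and taking full account of the current load of the different networks, a utility function is designed that dynamically allocates bandwidth to users according to network capacity. Then, the existence of a Nash equilibrium of the non-cooperative game among networks is established by proving that the utility function is concave, and the optimal inter-network bandwidth allocation strategy is obtained. Finally, simulation experiments verify the performance of the proposed algorithm.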

7.
The purpose of this article is to show that the differential dynamic programming (DDP) algorithm may be readily adapted to cater for state inequality constrained continuous optimal control problems. In particular, a new approach using a multiplier penalty function scheme incorporated with the DDP algorithm is shown to be effective. The DDP algorithm, implemented in conjunction with a multiplier penalty function scheme, is compared to an established DDP algorithm variant and the gradient-restoration method.

8.
Differential games with discontinuous dynamics and a fixed termination time are considered. The operators describing the structure of a game are determined under certain conditions. The results obtained are illustrated by a model example. Translated from Kibernetika i Sistemnyi Analiz, No. 4, pp. 183–187, July–August, 2000.

9.
We consider multistage stochastic linear optimization problems combining joint dynamic probabilistic constraints with hard constraints. We develop a method for projecting decision rules onto hard constraints of wait-and-see type. We establish the relation between the original (infinite-dimensional) problem and approximating problems working with projections from different subclasses of decision policies. Considering the subclass of linear decision rules and a generalized linear model for the underlying stochastic process with noises that are Gaussian or truncated Gaussian, we show that the value and gradient of the objective and constraint functions of the approximating problems can be computed analytically.

10.
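Rational secret sharing based on repeated games was first proposed by Maleka and Shareef, who argued that no constant-round repeated rational secret sharing scheme (RRSSS) exists. However, an infinite-round RRSSS is inefficient and of no practical value. To achieve an efficient constant-round RRSSS, different types are assigned to the participants, a constant-round RRSSS under incomplete information is proposed, and its validity is proved. Compared with other rational secret sharing schemes, under the given conditions the new scheme has advantages in terms of (Nash) equilibrium, expected running time, and communication channels.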

11.
A model of dynamic networks is introduced which incorporates three kinds of network changes: deletion of nodes (by faults or sabotage), restoration of nodes (by actions of “repair”), and creation of nodes (by actions that extend the network). The antagonism between the operations of deletion and restoration resp. creation is modelled by a game between the two agents “Destructor” and “Constructor”. In this framework of dynamic model-checking, we consider as specifications (“winning conditions” for Constructor) elementary requirements on connectivity of those networks which are reachable from some initial given network. We show some basic results on the (un-)decidability and hardness of dynamic model-checking problems.

12.
Tamer Başar 《Automatica》1981,17(5):749-754
This paper considers noncooperative equilibria of three-player dynamic games with three levels of hierarchy in decision making. In this context, first a general definition of a hierarchical equilibrium solution is given, which also accounts for nonunique responses of the players who are not at the top of the hierarchy. Then, a general theorem is proven which provides a set of sufficient conditions for a triple of strategies to be in hierarchical equilibrium. When applied to linear-quadratic games, this theorem provides conditions under which there exists a linear one-step memory strategy for the player (say, J1) at the top of the hierarchy, which forces the other two players to act in such a way as to jointly minimize the cost function of J1. Furthermore, there exists a linear one-step memory strategy for the second-level player (say, J2), which forces the remaining player to jointly minimize the cost function of J2 under the declared equilibrium strategy of J1. A numerical example included in the paper illustrates the results and the convergence property of the equilibrium strategies, as the number of stages in the game becomes arbitrarily large.

13.
Solving the optimization problem to approach a Nash Equilibrium point plays an important role in imperfect-information games, e.g., StarCraft and poker. Neural Fictitious Self-Play (NFSP) is an effective algorithm that learns an approximate Nash Equilibrium of imperfect-information games purely from self-play, without prior domain knowledge. However, it needs to train a neural network in an off-policy manner to approximate the action values. For games with large search spaces, the training may suffer from unnecessary exploration and sometimes fails to converge. In this paper, we propose a new Neural Fictitious Self-Play algorithm that combines Monte Carlo tree search with NFSP, called MC-NFSP, to improve the performance in real-time zero-sum imperfect-information games. With experiments and empirical analysis, we demonstrate that the proposed MC-NFSP algorithm can approximate Nash Equilibrium in games with large-scale search depth while NFSP cannot. Furthermore, we develop an Asynchronous Neural Fictitious Self-Play framework (ANFSP). It uses an asynchronous and parallel architecture to collect game experience and improve both the training efficiency and policy quality. The experiments with games with hidden state information (Texas Hold’em) and with FPS (first-person shooter) games demonstrate the effectiveness of our algorithms.

14.
This article aims to provide a simple approach realising state observation and input estimation simultaneously for discrete-time LTV systems in an H∞ setting. By solving a two-player zero-sum differential game, results are obtained in two respects. First, necessary and sufficient solvability conditions for the simultaneous state and input estimation problem are given in terms of the solution to a set of difference Riccati recursions. Second, an estimator is presented with a special innovation structure, where the innovation information is used to update the state observation through a gain matrix and to provide the input estimate through a projector matrix; both the gain matrix and the projector matrix are constructed from the solution to the difference Riccati recursions. Finally, simulation results are provided to justify the proposed approach.

15.
Goal Programming (GP) is applied to modelling the decision making processes in the well‐known Ultimatum Game and some of its variations. The decision model for a player is a Chebychev GP model that balances her individual desires with the mental model she has of the desires of other relevant players. Fairness is modelled as a universal mechanism, allowing players to differ in their belief of what a fair solution should be in any particular game. The model's conceptual framework draws upon elements considered of importance in the field of cognitive neuroscience, and results from the field of psychology are used to further specify the types of goals in the model. Computer simulations of the GP models, testing a number of Ultimatum, Dictator and Double‐Blind Dictator Games, lead to distributions of proposals made and accepted that correspond reasonably well with experimental findings.

16.
This paper is concerned with the derivation of closed-loop Stackelberg (CLS) solutions of a class of continuous-time two-player nonzero-sum differential games characterized by linear state dynamics and quadratic cost functionals. Explicit conditions are obtained for both the finite and infinite horizon problems under which the CLS solution is a representation of the optimal feedback solution of a related team problem which is defined as the joint minimization of the leader's cost function. First, a specific class of representations is considered which depend linearly on the current and initial values of the state, and then the results are extended to encompass a more general class of linear strategies that also incorporate the whole past trajectory. The conditions obtained all involve solutions of linear matrix equations and are amenable to computational analysis for explicit determination of CLS strategies.

17.
Both Stackelberg games and Nash games play extremely important roles in such fields as economics, management, politics and behavioral sciences. A Stackelberg game can be modelled as a bilevel optimization problem. Static multi-leader-follower optimization problems were first proposed by Pang and Fukushima. In this article, a discrete-time dynamic version of multi-leader-follower games with feedback information is given and analyzed. There are two major contributions in this article. On the one hand, based on the multi-leader-follower games, discrete-time dynamic multi-leader-follower games are proposed. On the other hand, dynamic programming algorithms are presented to solve discrete-time dynamic multi-leader-follower games with multiple players under a feedback information structure for dependent followers.

18.
This paper investigates the zero-sum differential game problem for a class of uncertain nonlinear pure-feedback systems with output constraints and unknown external disturbances. A barrier Lyapunov function is introduced to tackle the output constraints. By constructing an affine variable at each dynamic surface control design step rather than utilising the mean-value theorem, the tracking control problem for pure-feedback systems can be transformed into an equivalent zero-sum differential game problem for affine systems. Then, the solution of the associated Hamilton–Jacobi–Isaacs equation can be obtained online by using the adaptive dynamic programming technique. Finally, the whole control scheme, which is composed of a feedforward dynamic surface controller and a feedback differential game control strategy, guarantees the stability of the closed-loop system, and the tracking error remains in a bounded compact set. The simulation results demonstrate the effectiveness of the proposed control scheme.

19.
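In recent years, railway emergencies have occurred from time to time and seriously affected normal railway operation. Reasonable scheduling of emergency resources is an effective way to improve the overall emergency rescue capability of the railway and reduce the losses caused by emergencies. Taking game theory as the theoretical basis, each emergency point is regarded as a player in the game. Considering factors such as the transport capacity limits from rescue points to emergency points and the importance of different resources at different emergency points, a dynamic resource demand function is constructed, and the system loss is characterized by the time-accumulated resource shortage at the emergency points. The resource scheduling for multiple emergency points is described as a multi-stage non-cooperative game. With the goal of minimizing total system loss, a dynamic multi-stage resource scheduling model with multiple emergency points, multiple rescue points, and multiple resource types is established, and an improved cuckoo search algorithm is designed to find the Nash equilibrium of the model, yielding the optimal railway emergency resource scheduling plan. A concrete numerical example verifies the feasibility of the model and the superiority of the algorithm. The results show that the model is close to practice and widely applicable, that the improved algorithm is more efficient, and that it can provide a basis and support for railway emergency resource scheduling decisions.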

20.
Motivated by applications in social and peer-to-peer networks, we introduce the Bounded Budget Connection (BBC) game and study its pure Nash equilibria. We have a collection of n players, each with a budget for purchasing links. Each link has a cost and a length. Each node has a preference weight for each node, and its objective is to purchase outgoing links within its budget to minimize its sum of preference-weighted distances to the nodes. We show that determining if a BBC game has pure Nash equilibria is NP-hard. We study (n,k)-uniform BBC games, where all link costs, lengths, and preferences are equal and every budget equals k. We show that pure Nash equilibria exist for all (n,k)-uniform BBC games and all equilibria are essentially fair. We construct a family of equilibria spanning the spectrum from minimum to maximum social cost. We also analyze best-response walks and alternative node objectives.
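
To make the equilibrium notion in entry 20 concrete, the following is a minimal brute-force sketch that checks whether a link configuration is a pure Nash equilibrium of a small (n,k)-uniform game. It assumes unit link costs, unit lengths, and equal preferences, and it charges a fixed `PENALTY` for every unreachable node; that penalty, the function names, and the example graphs are assumptions made for illustration and are not taken from the paper. Only deviations that buy exactly k links are tried, since dropping a link can never shorten a distance.

```python
from itertools import combinations

PENALTY = 1000  # assumed cost charged for each unreachable node

def distances_from(source, links):
    """Breadth-first search hop counts from source over directed links."""
    dist = {source: 0}
    frontier = [source]
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in links[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    next_frontier.append(v)
        frontier = next_frontier
    return dist

def cost(node, links):
    """Sum of distances from node to every other node."""
    dist = distances_from(node, links)
    return sum(dist.get(v, PENALTY) for v in range(len(links)) if v != node)

def is_pure_nash(links, k):
    """True if no node can lower its cost by re-choosing its k out-links."""
    n = len(links)
    for u in range(n):
        current = cost(u, links)
        others = [v for v in range(n) if v != u]
        for choice in combinations(others, k):
            trial = list(links)
            trial[u] = set(choice)
            if cost(u, trial) < current:
                return False
    return True

# links[u] is the set of nodes that u buys an outgoing link to.
cycle = [{1}, {2}, {0}]     # directed 3-cycle
lopsided = [{1}, {0}, {0}]  # node 2 is unreachable from nodes 0 and 1

print(is_pure_nash(cycle, 1))
print(is_pure_nash(lopsided, 1))
```

The directed cycle is stable because any single deviation disconnects the deviating node from some other node, while in the second graph node 1 gains by redirecting its link to node 2. The exhaustive search is exponential in k and is only meant for toy instances.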
