Similar Documents
A total of 19 similar documents were retrieved (search time: 187 ms).
1.
This paper studies stochastic Nash differential games for linear Markov switching systems. First, using existing results on stochastic optimal control of linear Markov switching systems, it is shown that the existence conditions for Nash equilibrium solutions over both finite and infinite horizons are equivalent to the solvability of the corresponding differential (algebraic) Riccati equations, and explicit forms of the optimal solutions are given. The differential game results are then applied to analyze the mixed H2/H∞ control problem for linear Markov switching systems. Finally, a numerical example verifies the feasibility of the proposed method.
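For orientation, a hedged sketch of the structure involved: already in the deterministic two-player LQ game with dynamics dx/dt = Ax + B1u1 + B2u2 and costs J_i = ∫(x'Q_i x + u_i'R_ii u_i)dt, the feedback Nash equilibrium is characterized by a pair of coupled algebraic Riccati equations; the paper's Markov-switching stochastic setting couples one such pair per mode and adds noise terms. All symbols below are illustrative, not the paper's notation:

```latex
% Coupled AREs of a deterministic two-player LQ Nash game (illustrative special case)
0 = A_{cl}^{\top} P_1 + P_1 A_{cl} + Q_1 + K_1^{\top} R_{11} K_1, \qquad
0 = A_{cl}^{\top} P_2 + P_2 A_{cl} + Q_2 + K_2^{\top} R_{22} K_2,
\qquad A_{cl} = A - B_1 K_1 - B_2 K_2, \quad K_i = R_{ii}^{-1} B_i^{\top} P_i .
```

The equilibrium strategies are then the explicit state feedbacks u_i* = -K_i x, the kind of closed-form solution referred to above.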

2.
Based on the multiscale approximation property of wavelets, a new method is proposed for solving the Nash strategies of differential games in linear time-varying systems. The method avoids solving the coupled Riccati differential equations and instead requires only the solution of algebraic equations, making it well suited to computer implementation.
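Schematically, and in illustrative notation rather than the paper's, the wavelet step expands the time-varying Riccati solution on a truncated basis, so differentiation acts only on known basis functions:

```latex
% Illustrative wavelet-expansion step (truncation level N and symbols are placeholders)
P(t) \approx \sum_{k=1}^{N} C_k\, \psi_k(t)
\;\;\Longrightarrow\;\;
\dot{P}(t) \approx \sum_{k=1}^{N} C_k\, \dot{\psi}_k(t).
```

Substituting into the coupled Riccati differential equations and matching coefficients (or collocating in time) then leaves algebraic equations in the unknown matrices C_k, which is why no ODE integration is needed.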

3.
Wang Tao, Zhang Huaguang. Control and Decision, 2015, 30(9): 1674-1678

For stochastic linear continuous-time systems whose model parameters are partially unknown, a policy iteration algorithm is used to solve the infinite-horizon stochastic linear quadratic (LQ) optimal control problem. Solving the stochastic LQ optimal control problem is equivalent to solving the stochastic algebraic Riccati equation (SARE). First, the Itô formula is used to transform the stochastic differential equation into a deterministic one, and the policy iteration algorithm produces a sequence of approximate solutions to the SARE. It is then proved that this sequence converges to the solution of the SARE and that the system remains mean-square stabilizable throughout the iteration. Finally, a simulation example demonstrates the feasibility of the policy iteration algorithm.
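As a hedged sketch of one policy-iteration pass: the deterministic analogue is the classic Kleinman step, where policy evaluation solves a Lyapunov equation and policy improvement updates the gain. The stochastic version described above adds state-dependent noise terms to the Lyapunov step; the function and variable names below are illustrative, not the paper's.

```python
# Sketch of Kleinman-style policy iteration for the *deterministic* ARE
#   A'P + PA - P B R^{-1} B' P + Q = 0,
# as an analogue of the SARE iteration described above (illustrative only).
import numpy as np
from scipy.linalg import solve_continuous_lyapunov  # solves a X + X a^T = q

def policy_iteration_are(A, B, Q, R, K0, max_iter=100, tol=1e-10):
    """K0 must stabilize A - B @ K0; each pass solves one Lyapunov equation."""
    K, P_prev = K0, None
    for _ in range(max_iter):
        Acl = A - B @ K
        # Policy evaluation: Acl' P + P Acl + Q + K' R K = 0
        P = solve_continuous_lyapunov(Acl.T, -(Q + K.T @ R @ K))
        # Policy improvement: K = R^{-1} B' P
        K = np.linalg.solve(R, B.T @ P)
        if P_prev is not None and np.abs(P - P_prev).max() < tol:
            break
        P_prev = P
    return P, K
```

Starting from any stabilizing gain K0, the iterates P decrease monotonically to the stabilizing ARE solution, which parallels the convergence statement made for the SARE sequence.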


4.
In multi-group game systems, computing the noninferior Nash strategies between groups is of central importance, yet there is still no effective method for deriving them analytically for general problems. This paper describes an iterative algorithm that constructs the noninferior Nash strategies from the noninferior reaction sets between groups. To this end, the concepts of optimal equilibrium value and optimal equilibrium solution for the cooperative game inside a group are first introduced. By proving that the optimal equilibrium solution is a noninferior solution of the intra-group cooperative game under some implicit weight vector, a single-objective programming problem for solving the cooperative game is obtained. It is further shown that, within a group, the solution of this problem is not only noninferior but also better for every player than the noncooperative Nash equilibrium strategy. Finally, a practical example verifying the effectiveness of the algorithm is given.
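The scalarization step has the familiar weighted-sum shape; as a generic sketch (symbols illustrative), with group members' objectives J_1, ..., J_m and an implicit weight vector λ:

```latex
% Weighted-sum scalarization of the intra-group cooperative game (generic form)
\min_{u \in U} \; \sum_{j=1}^{m} \lambda_j\, J_j(u),
\qquad \lambda_j \ge 0, \quad \sum_{j=1}^{m} \lambda_j = 1,
```

any minimizer of which is a noninferior (Pareto) solution of the multi-objective cooperative game, so the group's problem reduces to a single-objective program once λ is fixed.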

5.
Based on the multiscale approximation property of wavelets, a new method is proposed for solving the Nash strategies of differential games in linear time-varying systems. The method avoids solving the coupled Riccati differential equations and instead requires only the solution of algebraic equations, making it well suited to computer implementation.

6.
For the optimal production control of unreliable stochastic production systems with a diffusion term, a numerical approach is adopted to solve the mode-coupled nonlinear partial differential HJB equations satisfied by the optimal control. First, a Markov chain is constructed to approximate the state evolution of the production system and, based on the local consistency principle, the continuous-time stochastic control problem is converted into a discrete-time Markov decision process problem. Value iteration and policy iteration algorithms are then used to carry out the numerical solution of the optimal control. Simulation results at the end of the paper verify the correctness and effectiveness of the method.
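Once the Markov-chain approximation has produced a finite MDP, the backward step is a standard Bellman backup; a minimal sketch follows, in which the transition kernels P, stage costs c, and discount factor are placeholders rather than the paper's production model:

```python
# Value iteration on the finite MDP produced by a Markov-chain approximation
# (P, c, and gamma are illustrative placeholders, not the paper's model).
import numpy as np

def value_iteration(P, c, gamma=0.95, tol=1e-8, max_iter=100_000):
    """P[a]: (n x n) transition matrix of action a; c[a]: n-vector of stage costs."""
    n = P.shape[1]
    V = np.zeros(n)
    for _ in range(max_iter):
        Q = c + gamma * P @ V          # expected cost-to-go, one row per action
        V_new = Q.min(axis=0)          # Bellman backup: greedy over actions
        if np.abs(V_new - V).max() < tol:
            break
        V = V_new
    return V_new, Q.argmin(axis=0)     # value function and greedy policy
```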

7.
A partially observable Markov decision process (POMDP) is solved by introducing a belief state space that converts a non-Markovian problem into a Markovian one; its ability to describe the real world makes it an important branch of research on stochastic decision processes. This paper introduces the basic principles and the decision procedure of POMDPs and proposes a partially observable Markov decision algorithm based on policy iteration and value iteration. Drawing on ideas from linear programming and dynamic programming, the algorithm addresses the "curse of dimensionality" that arises when the belief state space is large and obtains an approximately optimal solution of the Markov decision problem. Experimental data show that the algorithm is feasible and effective.
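The belief-state construction mentioned above is a Bayes update; a minimal sketch, where the model arrays T and O are illustrative placeholders:

```python
# Bayesian belief update that turns a POMDP into a fully observable MDP
# over beliefs (T and O are illustrative placeholders for the model).
import numpy as np

def belief_update(b, a, o, T, O):
    """b: current belief over states; T[a][s, s'] = P(s' | s, a);
    O[a][s', o] = P(o | s', a)."""
    b_pred = b @ T[a]               # prediction step: propagate through dynamics
    b_new = b_pred * O[a][:, o]     # correction step: weight by observation likelihood
    return b_new / b_new.sum()      # normalize (assumes the observation has P > 0)
```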

8.
Many practical engineering problems must be described by Markov jump systems that contain both continuous and discrete variables. This paper presents a method for studying the stationary response of a class of randomly excited single-degree-of-freedom (strongly) nonlinear Markov jump systems. First, based on the stochastic averaging method, averaged Itô stochastic differential equations with Markov jump parameters are derived, which reduces the dimension of the original system equations. Next, according to the theory of jump processes, a set of Fokker-Planck-Kolmogorov (FPK) equations is established; the equations in the set correspond one-to-one to the structural states of the system and are mutually coupled. Solving this set of FPK equations yields the stationary stochastic response of the Markov jump system and its statistics. Finally, taking a Gaussian-white-noise-excited Markov jump Duffing oscillator as an example, the stationary responses of the system under different jump rules are computed. The results show that the stationary response of a Markov jump system can be regarded as a weighted sum of the stationary responses of the subsystems in each structural state, with the weights determined by the jump rule.
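For orientation, the coupled FPK set has the following schematic one-dimensional stationary form (symbols illustrative, not the paper's: p_l is the stationary density in structural state l, m_l and σ_l the averaged drift and diffusion, and λ_jl the jump rates):

```latex
% Schematic coupled stationary FPK equations for an averaged Markov jump system
0 = -\frac{\partial}{\partial h}\!\big[ m_l(h)\, p_l(h) \big]
    + \frac{1}{2} \frac{\partial^2}{\partial h^2}\!\big[ \sigma_l^2(h)\, p_l(h) \big]
    + \sum_{j \neq l} \big( \lambda_{jl}\, p_j(h) - \lambda_{lj}\, p_l(h) \big),
\qquad l = 1, \dots, s,
```

where the last sum is exactly the coupling between structural-state equations that the abstract refers to.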

9.
Average-cost optimal policies for Markov control processes based on performance potentials
This paper studies the average-cost optimal control problem for a class of discrete-time Markov control processes. Using basic properties of Markov performance potentials, the optimality equation of the infinite-horizon average-cost model over compact action sets, together with an existence theorem for its solutions, is derived directly under very general assumptions. An iterative algorithm for computing optimal stationary control policies is proposed, and the convergence of this algorithm is discussed. Finally, a practical example is analyzed to illustrate the application of the algorithm.
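In the notation commonly used for performance potentials (symbols illustrative), the optimality equation in question is the average-cost Bellman equation, in which the potential g plays the role of a relative value function:

```latex
% Average-cost optimality equation with performance potential g (generic form)
\eta^{*} + g(i) \;=\; \min_{a \in A(i)} \Big\{ f(i, a) + \sum_{j} p(j \mid i, a)\, g(j) \Big\},
```

where η* is the optimal long-run average cost, f the one-step cost, and p the controlled transition law; the iterative algorithm alternates between solving this equation for a fixed policy (a Poisson equation) and improving the policy greedily.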

10.
This paper studies the convergence of the wavelet approximation method. For the Nash strategies of linear time-varying quadratic differential games, it is proved that the wavelet approximate solutions of the Nash strategies converge to the exact solutions, and, based on the multiscale, multiresolution properties of wavelet approximation, the order of the error estimate is given.

11.
This paper deals with the infinite horizon linear quadratic (LQ) differential games for discrete-time stochastic systems with both state and control dependent noise. The Popov-Belevitch-Hautus (PBH) criteria for exact observability and exact detectability of discrete-time stochastic systems are presented. By means of them, we give the optimal strategies (Nash equilibrium strategies) and the optimal cost values for infinite horizon stochastic differential games. It indicates that the infinite horizon LQ stochastic differential games are associated with four coupled matrix-valued equations. Furthermore, an iterative algorithm is proposed to solve the four coupled equations. Finally, an example is given to demonstrate our results.

12.
In this paper, we consider feedback control of nonzero-sum linear quadratic (LQ) differential games in finite horizon for discrete-time stochastic systems with Markovian jump parameters and multiplicative noise. Four coupled generalized difference Riccati equations (GDREs) are obtained, which are essential to find the optimal Nash equilibrium strategies and the optimal cost values of the LQ differential games. Furthermore, an iterative algorithm is given to solve the four coupled GDREs. Finally, a suboptimal solution of the LQ differential games is proposed based on a convex optimization approach, and a simplification of the suboptimal solution is given. Simulation examples are presented to illustrate the effectiveness of the iterative algorithm and the suboptimal solution.

13.
Song Jun, He Shuping. Control and Decision, 2016, 31(3): 559-563

Within the framework of the Kleinman iterative algorithm, two numerical iterative algorithms are proposed for the optimal H∞ controller design problem of continuous-time Markov jump systems. First, a "direct parallel Kleinman iterative algorithm" is presented, and its convergence is proved from the convergence of positive-real operators. Then, on the basis of the direct parallel Kleinman iterative algorithm, a more general iterative structure, the "generalized parallel Kleinman iterative algorithm", is proposed, and the four cases it encompasses are discussed. Finally, a numerical example verifies the effectiveness of the proposed algorithms.


14.
A new online iterative algorithm for solving the H∞ control problem of continuous-time Markovian jumping linear systems is developed. For comparison, an available offline iterative algorithm that converges to the solution of the H∞ control problem is first proposed. Based on the offline iterative algorithm and a new online decoupling technique named the subsystems transformation method, a set of linear subsystems, which are implemented in parallel, is obtained. By means of the adaptive dynamic programming technique, the two-player zero-sum game with the coupled game algebraic Riccati equation is then solved online. The convergence of the novel policy iteration algorithm is also established. Finally, simulation results illustrate the effectiveness and applicability of these two methods.

15.
This paper investigates the cluster synchronisation problem for multi-agent nonzero-sum differential games with partially unknown dynamics. The objective is to design a controller that achieves cluster synchronisation and ensures local optimality of the performance index. With the definition of the cluster tracking error and the concept of Nash equilibrium in the multi-agent system (MAS), this problem can be transformed into the problem of solving the coupled Hamilton–Jacobi–Bellman (HJB) equations. To solve these HJB equations, a data-based policy iteration algorithm with an actor–critic neural network (NN) structure is proposed for the case of an MAS with partially unknown dynamics; the weights of the NNs are updated with system data rather than complete knowledge of the system dynamics, and the residual errors are minimised using the least-squares approach. A simulation example is provided to verify the effectiveness of the proposed approach.
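The least-squares weight update mentioned above is, in its simplest batch form, an ordinary linear regression over collected system data; a minimal sketch, in which the feature matrix Phi and the targets y are placeholders for whatever regressors the critic actually uses:

```python
# Batch least-squares critic update: fit value-function weights w so that
# Phi @ w matches measured returns along system trajectories (Phi and y are
# placeholder data arrays, not the paper's exact regressors).
import numpy as np

def critic_update(Phi, y):
    """Phi: (N x p) matrix of feature regressors from system data; y: (N,) targets."""
    w, *_ = np.linalg.lstsq(Phi, y, rcond=None)   # minimizes ||Phi w - y||^2
    return w
```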

16.
In this paper, a novel value iteration adaptive dynamic programming (ADP) algorithm, called the "generalized value iteration ADP" algorithm, is developed to solve infinite-horizon optimal tracking control problems for a class of discrete-time nonlinear systems. The developed generalized value iteration ADP algorithm permits an arbitrary positive semi-definite function to initialize it, which overcomes a disadvantage of traditional value iteration algorithms. A convergence property is established to guarantee that the iterative performance index function converges to the optimum. Neural networks are used to approximate the iterative performance index function and to compute the iterative control policy, respectively, to implement the iterative ADP algorithm. Finally, a simulation example is given to illustrate the performance of the developed algorithm.
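As a hedged illustration of the initialization freedom the abstract emphasizes, consider the LQ special case, where the iterative performance index is quadratic, V_k(x) = x' P_k x, and value iteration becomes a Riccati recursion that may start from any positive semi-definite P0 (the paper's algorithm targets general nonlinear systems with neural-network approximators; this toy is only the linear analogue):

```python
# LQ analogue of generalized value iteration: the Riccati recursion below may
# be initialized with an *arbitrary* PSD P0, mirroring the initialization
# freedom described in the abstract (illustrative only).
import numpy as np

def lq_value_iteration(A, B, Q, R, P0, max_iter=1000, tol=1e-10):
    P = P0
    for _ in range(max_iter):
        # Bellman backup: V_{k+1}(x) = min_u [x'Qx + u'Ru + V_k(Ax + Bu)]
        S = R + B.T @ P @ B
        P_next = Q + A.T @ P @ A - A.T @ P @ B @ np.linalg.solve(S, B.T @ P @ A)
        if np.abs(P_next - P).max() < tol:
            break
        P = P_next
    return P_next
```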

17.
18.
Model predictive control (MPC) for Markovian jump linear systems with probabilistic constraints has received much attention in recent years. However, in existing results the disturbance is usually assumed to have infinite support, which is not considered reasonable in real applications. Thus, by considering random additive disturbances with finite support, this paper is devoted to a systematic approach to stochastic MPC for Markovian jump linear systems with probabilistic constraints. The adopted MPC law is parameterized by a mode-dependent feedback control law superimposed with a perturbation generated by a dynamic controller. Probabilistic constraints can be guaranteed by confining the augmented system state to a maximal admissible set. The MPC algorithm is then given in the form of linearly constrained quadratic programming problems by optimizing the infinite sum of deviations of the stage cost from its steady-state value. The proposed algorithm is proved to be recursively feasible and to guarantee constraint satisfaction, and the closed-loop long-run average cost is no more than that of the unconstrained closed-loop system with static feedback. Finally, when the optimal feedback gains are adopted in the predictive control law, the resulting MPC algorithm is proved to converge in the mean-square sense to the optimal control. A numerical example is given to verify the efficiency of the proposed results.
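Schematically (symbols illustrative, not the paper's notation), each MPC step then solves a linearly constrained QP of the following shape, with the decision variable being the perturbation sequence c and the mode-dependent feedback gains fixed:

```latex
% Schematic per-step QP of the stochastic MPC scheme (generic form)
\min_{c_0, \dots, c_{N-1}} \;\; \sum_{k=0}^{\infty} \mathbb{E}\big[ \ell(x_k, u_k) - \ell_{ss} \big]
\quad \text{s.t.} \quad u_k = K_{r_k} x_k + c_k, \quad c_k = 0 \;\; (k \ge N),
\qquad (x_0, c) \in \mathcal{S}_{\max},
```

where r_k is the Markov mode, ℓ_ss the steady-state value of the stage cost, and S_max the maximal admissible set that enforces the probabilistic constraints.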

19.
It is well known that stochastic control systems can be viewed as Markov decision processes (MDPs) with continuous state spaces. In this paper, we propose to apply the policy iteration approach of MDPs to the optimal control problem of stochastic systems. We first provide an optimality equation based on performance potentials and develop a policy iteration procedure. Then we apply policy iteration to the jump linear quadratic problem and obtain the coupled Riccati equations for its optimal solutions. The approach is applicable to linear as well as nonlinear systems and can be implemented online on real-world systems without identifying all the system structure and parameters.
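For reference, the finite-state counterpart of the procedure reads as follows; a minimal sketch in which a discounted criterion stands in for the average-cost/potential formulation, and P, c, gamma are illustrative placeholders:

```python
# Classic policy iteration on a finite MDP, as the discrete counterpart of the
# potential-based procedure described above (discounted criterion used for
# brevity; P, c, gamma are illustrative placeholders).
import numpy as np

def policy_iteration(P, c, gamma=0.95):
    """P[a]: (n x n) transition matrices; c[a]: n-vectors of stage costs."""
    num_a, n, _ = P.shape
    policy = np.zeros(n, dtype=int)
    while True:
        # Policy evaluation: solve (I - gamma * P_pi) V = c_pi for the current policy
        P_pi = P[policy, np.arange(n)]
        c_pi = c[policy, np.arange(n)]
        V = np.linalg.solve(np.eye(n) - gamma * P_pi, c_pi)
        # Policy improvement: act greedily with respect to V
        new_policy = (c + gamma * P @ V).argmin(axis=0)
        if np.array_equal(new_policy, policy):
            return V, policy
        policy = new_policy
```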
