1.
2.
3.
For stochastic linear continuous-time systems whose model parameters are partially unknown, a policy iteration algorithm is used to solve the infinite-horizon stochastic linear quadratic (LQ) optimal control problem. Solving the stochastic LQ optimal control problem is equivalent to solving the stochastic algebraic Riccati equation (SARE). First, Itô's formula is used to transform the stochastic differential equation into a deterministic one, and the policy iteration algorithm produces a sequence of approximate solutions of the SARE; it is then proved that this sequence converges to the solution of the SARE and that the system remains mean-square stabilizable throughout the iteration; finally, a simulation example demonstrates the feasibility of the policy iteration algorithm.
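The policy-evaluation/policy-improvement loop described above has a classical deterministic counterpart, Kleinman's iteration, in which each policy is evaluated by solving a Lyapunov equation. A minimal sketch (illustrative matrices, not from the paper; the multiplicative-noise terms of the SARE are omitted):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Kleinman-style policy iteration for the LQ algebraic Riccati equation:
# evaluate the current gain via a Lyapunov equation, then improve it.
A = np.array([[0.0, 1.0], [0.0, -1.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

K = np.array([[1.0, 1.0]])            # initial stabilizing gain (assumed)
for _ in range(30):
    Acl = A - B @ K
    # Policy evaluation: Acl' P + P Acl + Q + K' R K = 0
    P = solve_continuous_lyapunov(Acl.T, -(Q + K.T @ R @ K))
    # Policy improvement
    K = np.linalg.solve(R, B.T @ P)

# P now solves A'P + PA - P B R^-1 B' P + Q = 0
residual = A.T @ P + P @ A - P @ B @ np.linalg.solve(R, B.T @ P) + Q
```

Each iterate is stabilizing and the sequence decreases monotonically to the Riccati solution, which mirrors the convergence argument sketched in the abstract.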
4.
Finding the noninferior Nash strategies between groups is crucial in multi-group game systems, yet there is no effective method for deriving them analytically for general problems. This paper describes an iterative algorithm that constructs the noninferior Nash strategies from the noninferior reaction sets between groups. To this end, the concepts of the optimal equilibrium value and the optimal equilibrium solution of the cooperative game within a group are first introduced. By proving that the optimal equilibrium solution is a noninferior solution of a cooperative game with a weight vector implicit in the group, a single-objective programming problem for the cooperative game is obtained. It is further shown that, within the group, the solution of this problem is not only noninferior but also better for every player than the noncooperative Nash equilibrium strategy. A practical example verifying the effectiveness of the algorithm is given.
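As a generic illustration of iterating on reaction sets (not the paper's noninferior-reaction-set construction), consider two players with quadratic costs J_i = u_i^2 + (u_1 + u_2 - c_i)^2, whose best responses are u_i = (c_i - u_j)/2; iterating the best responses converges to the Nash equilibrium. All numbers are illustrative:

```python
import numpy as np

# Iterated best response for a two-player quadratic game.
c = np.array([3.0, 3.0])     # each player's target level (assumed)
u = np.zeros(2)
for _ in range(60):
    # Each player replies optimally to the other's current strategy.
    u_new = np.array([(c[0] - u[1]) / 2.0, (c[1] - u[0]) / 2.0])
    if np.max(np.abs(u_new - u)) < 1e-12:
        u = u_new
        break
    u = u_new
# Closed-form Nash equilibrium: u_i = (2*c_i - c_j) / 3 -> (1, 1) here
```

The map is a contraction (factor 1/2 per sweep), so the iteration converges for any starting point.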
5.
6.
For the optimal production control of unreliable stochastic production systems with a diffusion term, a numerical method is used to solve the mode-coupled nonlinear partial differential HJB equations satisfied by the optimal control. First, a Markov chain is constructed to approximate the evolution of the production system state; based on the local consistency principle, the continuous-time stochastic control problem is converted into a discrete-time Markov decision process. Value iteration and policy iteration algorithms are then applied to carry out the numerical solution of the optimal control. Simulation results at the end of the paper verify the correctness and effectiveness of the method.
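Once the Markov-chain approximation has produced a finite MDP, value iteration is a direct way to solve it. A self-contained sketch on a tiny illustrative MDP (states, actions, rewards are assumptions, not the paper's production model):

```python
import numpy as np

# Value iteration on a small finite MDP.
n_s, n_a, gamma = 2, 2, 0.9
P = np.zeros((n_s, n_a, n_s))   # P[s, a, s'] = transition probability
r = np.zeros((n_s, n_a))        # r[s, a]     = immediate reward

P[0, 0, 0] = 1.0; r[0, 0] = 1.0     # state 0, action 0: stay, reward 1
P[0, 1, 1] = 1.0                    # state 0, action 1: go to state 1
P[1, :, 1] = 1.0                    # state 1 is absorbing, reward 0

V = np.zeros(n_s)
for _ in range(1000):
    Qsa = r + gamma * P @ V          # Q[s,a] = r + gamma * E[V(s')]
    V_new = Qsa.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-10:
        V = V_new
        break
    V = V_new
policy = (r + gamma * P @ V).argmax(axis=1)
# V[0] -> 1/(1-gamma) = 10, V[1] -> 0, and the policy stays in state 0
```

The discount factor makes the Bellman operator a contraction, so the sweep converges geometrically regardless of initialization.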
7.
Partially observable Markov decision processes (POMDPs) are solved by introducing a belief state space that converts a non-Markovian problem into a Markovian one; their ability to describe the real world makes them an important branch of research on stochastic decision processes. This paper introduces the basic principles and decision procedure of POMDPs and proposes a POMDP algorithm based on policy iteration and value iteration. Using ideas from linear programming and dynamic programming, the algorithm addresses the "curse of dimensionality" that arises when the belief state space is large and obtains an approximately optimal Markov decision solution. Experimental data show that the algorithm is feasible and effective.
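The belief-state construction mentioned above rests on one operation: the Bayesian belief update b'(s') ∝ O(o|s',a) Σ_s T(s'|s,a) b(s). A minimal sketch with an illustrative two-state model (the matrices are assumptions):

```python
import numpy as np

# Bayesian belief-state update for a POMDP.
T = np.array([[0.9, 0.1],    # T[s, s'] under the chosen action
              [0.2, 0.8]])
O = np.array([0.8, 0.3])     # O[s'] = P(observation o | s')

def belief_update(b, T, O):
    b_pred = b @ T           # prediction through the dynamics
    b_post = O * b_pred      # correction by the observation likelihood
    return b_post / b_post.sum()

b = np.array([0.5, 0.5])     # uniform prior belief
b = belief_update(b, T, O)
# b is again a probability vector, so planning proceeds on beliefs
```

Because the updated belief is a sufficient statistic for the history, value and policy iteration can then be run over the (continuous) belief simplex.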
8.
Many practical engineering problems must be described by Markov jump systems containing both continuous and discrete variables. This paper presents a method for studying the stationary response of a class of randomly excited single-degree-of-freedom (strongly) nonlinear Markov jump systems. First, averaged Itô stochastic differential equations with Markov jump parameters are derived via the stochastic averaging method, reducing the dimension of the original system equations. Next, according to the theory of jump processes, a set of Fokker-Planck-Kolmogorov (FPK) equations is established, in which the equations correspond one-to-one to the structural states of the system and are mutually coupled. Solving this set of FPK equations yields the stationary stochastic response of the Markov jump system and its statistics. Finally, taking a Markov jump Duffing oscillator excited by Gaussian white noise as an example, the stationary responses under different jump laws are computed. The results show that the stationary response of a Markov jump system can be regarded as a weighted sum of the stationary responses of the subsystems in each structural state, with the weights determined by the jump law.
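For the weighted-sum interpretation above, the weights set by the jump law are the stationary mode probabilities of the jump process, i.e. the solution of pi G = 0 with pi summing to 1, where G is the generator. A small sketch with an assumed two-mode generator:

```python
import numpy as np

# Stationary probabilities of a two-mode continuous-time Markov chain:
# these are the weights on the per-mode stationary responses.
G = np.array([[-1.0, 1.0],
              [2.0, -2.0]])   # illustrative generator

# Solve pi @ G = 0 together with sum(pi) = 1 as one least-squares system.
Aug = np.vstack([G.T, np.ones((1, 2))])
rhs = np.array([0.0, 0.0, 1.0])
pi, *_ = np.linalg.lstsq(Aug, rhs, rcond=None)
# pi -> [2/3, 1/3]: the process spends twice as long in mode 0
```

With these weights, the overall stationary density is pi[0] times the mode-0 density plus pi[1] times the mode-1 density, as the abstract states.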
9.
10.
11.
This paper deals with the infinite horizon linear quadratic (LQ) differential games for discrete-time stochastic systems with both state- and control-dependent noise. The Popov-Belevitch-Hautus (PBH) criteria for exact observability and exact detectability of discrete-time stochastic systems are presented. By means of them, we give the optimal strategies (Nash equilibrium strategies) and the optimal cost values for infinite horizon stochastic differential games. It indicates that the infinite horizon LQ stochastic differential games are associated with four coupled matrix-valued equations. Furthermore, an iterative algorithm is proposed to solve the four coupled equations. Finally, an example is given to demonstrate our results.
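The coupled equations in question are generalized Riccati equations with multiplicative noise. As a hedged sketch, the one-player (control rather than game) scalar version can be solved by fixed-point iteration; the game case iterates four such coupled equations. All numbers below are illustrative:

```python
import numpy as np

# Fixed-point iteration for a scalar generalized algebraic Riccati
# equation with state- and control-dependent noise:
#   x+ = (a x + b u) + (c x + d u) w,  cost q x^2 + r u^2.
a, b, c, d = 0.5, 1.0, 0.1, 0.1
q, r = 1.0, 1.0

p = q                              # initialize with the state weight
for _ in range(200):
    s = r + b * p * b + d * p * d  # effective control weight
    g = a * p * b + c * p * d      # cross term
    p_new = a * p * a + c * p * c + q - g * g / s
    if abs(p_new - p) < 1e-12:
        p = p_new
        break
    p = p_new

# p solves p = a^2 p + c^2 p + q - (abp + cdp)^2 / (r + b^2 p + d^2 p)
residual = a * a * p + c * c * p + q - g * g / (r + b * b * p + d * d * p) - p
```

Since a² + c² < 1 here, the map is contractive and the iteration converges; the noise terms c, d enter both the drift of p and the effective control weight, which is what distinguishes these equations from the deterministic ARE.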
12.
Huiying Sun, Liuyang Jiang, Weihai Zhang 《International Journal of Control, Automation and Systems》2012,10(5):940-946
In this paper, we consider the feedback control on nonzero-sum linear quadratic (LQ) differential games in finite horizon for discrete-time stochastic systems with Markovian jump parameters and multiplicative noise. Four-coupled generalized difference Riccati equations (GDREs) are obtained, which are essential to find the optimal Nash equilibrium strategies and the optimal cost values of the LQ differential games. Furthermore, an iterative algorithm is given to solve the four-coupled GDREs. Finally, a suboptimal solution of the LQ differential games is proposed based on a convex optimization approach and a simplification of the suboptimal solution is given. Simulation examples are presented to illustrate the effectiveness of the iterative algorithm and the suboptimal solution.
13.
Within the framework of the Kleinman iteration algorithm, two numerical iterative algorithms are proposed for the H∞ controller design problem of continuous-time Markov jump systems. First, a "direct parallel Kleinman iteration algorithm" is presented, and its convergence is proved via the convergence of positive operators. Then, based on the direct parallel Kleinman iteration algorithm, a more general iterative structure, the "generalized parallel Kleinman iteration algorithm", is proposed, and the four cases it contains are discussed. Finally, a numerical example verifies the effectiveness of the proposed algorithms.
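The "parallel" structure can be illustrated on the coupled Lyapunov equations of a Markov jump system, A_i' P_i + P_i A_i + Σ_j π_ij P_j + Q_i = 0: each sweep solves every mode's equation with the other modes' current iterates frozen in the coupling term. A minimal sketch with illustrative data (not the paper's H∞ setup, which couples game Riccati equations instead):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Parallel iteration for coupled Lyapunov equations of a two-mode
# continuous-time Markov jump system.
A = [np.array([[-2.0]]), np.array([[-3.0]])]   # mode dynamics (stable)
Q = [np.eye(1), np.eye(1)]
Pi = np.array([[-1.0, 1.0],                    # generator of the jumps
               [1.0, -1.0]])

P = [np.zeros((1, 1)), np.zeros((1, 1))]
for _ in range(200):
    P_new = []
    for i in range(2):
        # Freeze the other modes' iterates in the coupling term.
        coupling = sum(Pi[i, j] * P[j] for j in range(2) if j != i)
        Ai = A[i] + 0.5 * Pi[i, i] * np.eye(1)  # absorb the pi_ii P_i term
        P_new.append(solve_continuous_lyapunov(Ai.T, -(Q[i] + coupling)))
    P = P_new
# Exact solution of this example: P_0 = 4/17, P_1 = 3/17
```

Because each mode's subproblem is independent once the coupling is frozen, the per-mode solves can run in parallel, which is the point of the algorithm's structure.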
14.
A new iterative algorithm for solving H∞ control problem of continuous‐time Markovian jumping linear systems based on online implementation
A new online iterative algorithm for solving the H∞ control problem of continuous-time Markovian jumping linear systems is developed. For comparison, an available offline iterative algorithm that converges to the solution of the H∞ control problem is first proposed. Based on the offline iterative algorithm and a new online decoupling technique named the subsystems transformation method, a set of linear subsystems, which are implemented in parallel, is obtained. By means of the adaptive dynamic programming technique, the two-player zero-sum game with the coupled game algebraic Riccati equation is then solved online. The convergence of the novel policy iteration algorithm is also established. Finally, simulation results illustrate the effectiveness and applicability of the two methods. Copyright © 2016 John Wiley & Sons, Ltd.
15.
This paper investigates the cluster synchronisation problem for multi-agent non-zero sum differential game with partially unknown dynamics. The objective is to design a controller to achieve the cluster synchronisation and to ensure local optimality of the performance index. With the definition of cluster tracking error and the concept of Nash equilibrium in the multi-agent system (MAS), the previous problem can be transformed into the problem of solving the coupled Hamilton–Jacobi–Bellman (HJB) equations. To solve these HJB equations, a data-based policy iteration algorithm is proposed with an actor–critic neural network (NN) structure in the case of the MAS with partially unknown dynamics; the weights of NNs are updated with the system data rather than the complete knowledge of system dynamics and the residual errors are minimised using the least-square approach.  A simulation example is provided to verify the effectiveness of the proposed approach.
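The least-squares weight update at the heart of such data-based critics can be sketched in isolation: fit value-function weights w so that φ(x)ᵀw matches observed targets, using samples rather than a system model. Features, data, and the true weights below are all illustrative assumptions:

```python
import numpy as np

# Least-squares critic fit from data, as used in data-based policy
# iteration: no model of the dynamics is needed, only samples.
rng = np.random.default_rng(0)

def phi(x):                      # quadratic features for a 2-D state
    return np.array([x[0] ** 2, x[0] * x[1], x[1] ** 2])

w_true = np.array([1.0, 0.4, 2.0])          # unknown "true" weights
X = rng.standard_normal((100, 2))           # sampled states
targets = np.array([phi(x) @ w_true for x in X])

Phi = np.array([phi(x) for x in X])
w, *_ = np.linalg.lstsq(Phi, targets, rcond=None)
# w recovers w_true from the samples alone
```

In the actor-critic scheme the targets come from measured costs along trajectories rather than from known weights, but the normal-equation solve is the same.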
16.
Qinglai Wei, Derong Liu, Yancai Xu 《Soft Computing - A Fusion of Foundations, Methodologies and Applications》2016,20(2):697-706
In this paper, a novel value iteration adaptive dynamic programming (ADP) algorithm, called “generalized value iteration ADP” algorithm, is developed to solve infinite horizon optimal tracking control problems for a class of discrete-time nonlinear systems. The developed generalized value iteration ADP algorithm permits an arbitrary positive semi-definite function to initialize it, which overcomes the disadvantage of traditional value iteration algorithms. Convergence property is developed to guarantee that the iterative performance index function will converge to the optimum. Neural networks are used to approximate the iterative performance index function and compute the iterative control policy, respectively, to implement the iterative ADP algorithm. Finally, a simulation example is given to illustrate the performance of the developed algorithm.
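The key feature above, starting value iteration from an arbitrary positive semi-definite function, can be shown with a grid-based sketch in place of the paper's neural networks. The system x⁺ = 0.8x + u, cost x² + u², and grids are illustrative assumptions:

```python
import numpy as np

# Value iteration ADP sketch on a state grid, initialized with an
# arbitrary PSD function instead of zero.
xs = np.linspace(-1.0, 1.0, 201)   # state grid (xs[100] = 0)
us = np.linspace(-1.0, 1.0, 101)   # control grid

V = 5.0 * xs ** 2                  # arbitrary PSD initialization
for _ in range(300):
    V_new = np.empty_like(V)
    for i, x in enumerate(xs):
        xn = np.clip(0.8 * x + us, xs[0], xs[-1])       # next states
        V_new[i] = np.min(x ** 2 + us ** 2 + np.interp(xn, xs, V))
    if np.max(np.abs(V_new - V)) < 1e-8:
        V = V_new
        break
    V = V_new
# V(0) = 0: the optimal policy holds the state at the origin
```

Despite the non-zero start, the iterates settle to the same optimal cost function that zero-initialized value iteration would reach, which is the convergence property the abstract claims.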
17.
18.
Model predictive control (MPC) for Markovian jump linear systems with probabilistic constraints has received much attention in recent years. However, in existing results, the disturbance is usually assumed with infinite support, which is not considered reasonable in real applications. Thus, by considering random additive disturbance with finite support, this paper is devoted to a systematic approach to stochastic MPC for Markovian jump linear systems with probabilistic constraints. The adopted MPC law is parameterized by a mode‐dependent feedback control law superimposed with a perturbation generated by a dynamic controller. Probabilistic constraints can be guaranteed by confining the augmented system state to a maximal admissible set. Then, the MPC algorithm is given in the form of linearly constrained quadratic programming problems by optimizing the infinite sum of derivation of the stage cost from its steady‐state value. The proposed algorithm is proved to be recursively feasible and to guarantee constraints satisfaction, and the closed‐loop long‐run average cost is not more than that of the unconstrained closed‐loop system with static feedback. Finally, when adopting the optimal feedback gains in the predictive control law, the resulting MPC algorithm has been proved to converge in the mean square sense to the optimal control. A numerical example is given to verify the efficiency of the proposed results.
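The quadratic-programming core underlying such MPC schemes can be sketched in its simplest form: condensed, unconstrained, deterministic MPC for a scalar system (constraints, disturbances, and mode dependence omitted; all numbers illustrative):

```python
import numpy as np

# Condensed MPC for x+ = a x + b u with stage cost q x^2 + r u^2:
# stack predictions x = F x0 + G u and solve the resulting QP in u.
a, b = 1.2, 1.0                  # unstable open loop
q, r, N = 1.0, 1.0, 10           # stage weights and horizon

F = np.array([a ** (k + 1) for k in range(N)])
G = np.zeros((N, N))
for k in range(N):
    for j in range(k + 1):
        G[k, j] = a ** (k - j) * b

H = G.T @ (q * G) + r * np.eye(N)        # QP Hessian

def mpc_step(x0):
    u = np.linalg.solve(H, -G.T @ (q * F) * x0)
    return u[0]                           # apply only the first move

x = 1.0
for _ in range(20):                       # receding-horizon closed loop
    x = a * x + b * mpc_step(x)
```

With box constraints added, the same objective is handed to a constrained QP solver instead of the unconstrained linear solve, which is the "linearly constrained quadratic programming" form the abstract refers to.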
19.
Kan-Jian Zhang, Yan-Kai Xu, Xi Chen, Xi-Ren Cao 《Automatica》2008,44(4):1055-1061
It is well known that stochastic control systems can be viewed as Markov decision processes (MDPs) with continuous state spaces. In this paper, we propose to apply the policy iteration approach in MDPs to the optimal control problem of stochastic systems. We first provide an optimality equation based on performance potentials and develop a policy iteration procedure. Then we apply policy iteration to the jump linear quadratic problem and obtain the coupled Riccati equations for their optimal solutions. The approach is applicable to linear as well as nonlinear systems and can be implemented on-line on real world systems without identifying all the system structure and parameters.
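The MDP-side procedure referred to above alternates exact policy evaluation (solving the linear potential/value equations for the current policy) with greedy improvement. A sketch on a tiny illustrative MDP:

```python
import numpy as np

# Policy iteration on a small finite MDP.
n_s, gamma = 2, 0.9
P = np.zeros((n_s, 2, n_s))
r = np.zeros((n_s, 2))
P[0, 0, 0] = 1.0; r[0, 0] = 1.0   # state 0, action 0: stay, reward 1
P[0, 1, 1] = 1.0                  # state 0, action 1: leave, reward 0
P[1, :, 1] = 1.0                  # state 1 is absorbing, reward 0

policy = np.array([1, 0])          # start from a deliberately bad policy
for _ in range(10):
    # Policy evaluation: solve (I - gamma P_pi) V = r_pi exactly.
    P_pi = P[np.arange(n_s), policy]
    r_pi = r[np.arange(n_s), policy]
    V = np.linalg.solve(np.eye(n_s) - gamma * P_pi, r_pi)
    # Policy improvement: act greedily with respect to V.
    new_policy = (r + gamma * P @ V).argmax(axis=1)
    if np.array_equal(new_policy, policy):
        break
    policy = new_policy
# Optimal policy keeps action 0 in state 0, with V[0] = 1/(1-gamma) = 10
```

Applied to the jump LQ problem, the evaluation step becomes solving coupled Lyapunov equations and the improvement step recovers the coupled Riccati equations, as the abstract indicates.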