
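Policy iteration-based non-zero sum differential feedback Nash control for continuous-time Markov jump linear systems
Cite this article: Zhu Guo-zheng, Zhang Mao-guang, He Shu-ping. Policy iteration-based non-zero sum differential feedback Nash control for continuous-time Markov jump linear systems[J]. Control Theory & Applications, 2020, 37(8): 1749-1756.
Authors: Zhu Guo-zheng  Zhang Mao-guang  He Shu-ping
Affiliation: School of Electrical Engineering and Automation, Anhui University, Hefei, Anhui 230601; Key Laboratory of Intelligent Computing and Signal Processing of the Ministry of Education, Anhui University, Hefei, Anhui 230601
Funding: Supported by the National Natural Science Foundation of China (61673001), the Anhui Provincial Science Fund for Distinguished Young Scholars (1608085J05), and the Key Support Program for Outstanding Young Talents in Anhui Universities (gxydZD2017001)
Abstract: For a class of continuous-time linear Markov jump systems, this paper proposes a new policy iteration algorithm for solving the non-zero-sum differential feedback Nash control problem. The Nash equilibrium of a two-layer non-zero-sum differential strategy with linear dynamics and an infinite-horizon quadratic cost is obtained by computing coupled numerical iterative solutions. At each policy layer, policy iteration is used to compute the minimal infinite-horizon value function associated with each given set of feedback control strategies. The Markov jump system is then decomposed into N parallel subsystems by subsystem decomposition, and the algorithm is applied to the jump system. The proposed policy iteration algorithm readily solves the coupled algebraic Riccati equations corresponding to the non-zero-sum differential strategy and is effective for high-dimensional systems. Finally, a simulation example demonstrates the effectiveness and feasibility of the proposed design method.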

Keywords: policy iteration  Markov jump linear systems  non-zero sum  differential feedback Nash strategy
Received: 2019/7/23
Revised: 2020/1/20

Policy iteration-based non-zero sum differential feedback Nash control for continuous-time Markov jump linear systems
Zhu Guo-zheng, Zhang Mao-guang and He Shu-ping. Policy iteration-based non-zero sum differential feedback Nash control for continuous-time Markov jump linear systems[J]. Control Theory & Applications, 2020, 37(8): 1749-1756.
Authors:Zhu Guo-zheng  Zhang Mao-guang and He Shu-ping
Affiliation:Anhui University, China
Abstract:This paper proposes a new policy iteration algorithm for solving the non-zero-sum differential feedback Nash control problem for a class of continuous-time Markov jump linear systems. The Nash equilibrium of a two-layer non-zero-sum differential strategy with linear dynamics and an infinite-horizon quadratic cost is obtained by computing coupled numerical iterative solutions. At each policy layer, a policy iteration algorithm is used to calculate the minimum infinite-horizon value function associated with each given set of feedback control strategies. The Markov jump linear system is then decomposed into N parallel subsystems by subsystem decomposition, and the algorithm is applied to the jump system. The proposed policy iteration algorithm readily solves the coupled algebraic Riccati equations corresponding to the non-zero-sum differential strategy and is effective for high-dimensional systems. Finally, a simulation example illustrates the effectiveness and feasibility of the design method.
Keywords:policy iteration  Markov jump linear systems  non-zero sum  differential feedback Nash strategy
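
The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea it describes: policy iteration for a two-player non-zero-sum linear-quadratic game. It alternates between policy evaluation (one Lyapunov equation per player under the current closed loop) and policy improvement (updating each feedback gain). Several things here are assumptions rather than details from the paper: the Markov jumps are omitted (a single mode), the cross-weighting of the other player's control is set to zero, and the function names, the scalar test system, and the zero initial gains are all hypothetical.

```python
import numpy as np


def solve_lyapunov(Ac, M):
    """Solve Ac.T @ P + P @ Ac + M = 0 for P by Kronecker vectorization."""
    n = Ac.shape[0]
    I = np.eye(n)
    # Row-major vec: vec(Ac.T P) = kron(Ac.T, I) vec(P), vec(P Ac) = kron(I, Ac.T) vec(P)
    L = np.kron(Ac.T, I) + np.kron(I, Ac.T)
    P = np.linalg.solve(L, -M.reshape(-1)).reshape(n, n)
    return 0.5 * (P + P.T)  # symmetrize against round-off


def nash_policy_iteration(A, B, Q, R, K0, iters=200, tol=1e-12):
    """Two-player sketch: u_i = -K_i x, cost_i = integral of x'Q_i x + u_i'R_i u_i.

    K0 must stabilize A - B[0] K0[0] - B[1] K0[1] (assumed, not checked).
    Convergence is not guaranteed in general for non-zero-sum games.
    """
    K = [k.copy() for k in K0]
    P = [None, None]
    for _ in range(iters):
        Ac = A - B[0] @ K[0] - B[1] @ K[1]          # current closed loop
        # Policy evaluation: value matrix of each player under (K[0], K[1])
        P = [solve_lyapunov(Ac, Q[i] + K[i].T @ R[i] @ K[i]) for i in range(2)]
        # Policy improvement: K_i = R_i^{-1} B_i' P_i
        K_new = [np.linalg.solve(R[i], B[i].T @ P[i]) for i in range(2)]
        step = max(np.max(np.abs(K_new[i] - K[i])) for i in range(2))
        K = K_new
        if step < tol:
            break
    return P, K


# Hypothetical scalar example; A is stable, so zero gains are a valid start.
A = np.array([[-1.0]])
B = [np.array([[1.0]]), np.array([[1.0]])]
Q = [np.array([[1.0]]), np.array([[2.0]])]
R = [np.array([[1.0]]), np.array([[1.0]])]
K0 = [np.zeros((1, 1)), np.zeros((1, 1))]

P, K = nash_policy_iteration(A, B, Q, R, K0)
print("P1 =", P[0], "P2 =", P[1])
print("K1 =", K[0], "K2 =", K[1])
```

At a fixed point the pair (P[0], P[1]) satisfies the coupled algebraic Riccati equations of this simplified game, which can be checked by substituting the returned gains back into each Lyapunov equation. Extending the sketch to a Markov jump system would add, for each mode, a coupling term weighted by the transition rates; how the paper decomposes that coupling into N parallel subsystems is not specified in the abstract.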
This article is indexed in CNKI, Wanfang Data, and other databases.