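Event-triggered optimal tracking control for nonzero-sum differential game systems (in English)
Cite this article: SHI Yi-bo, WANG Chao-li. Event-triggered optimal tracking control for nonzero-sum differential game systems (in English)[J]. Control Theory & Applications, 2023, 40(2): 220-230.
Authors: SHI Yi-bo  WANG Chao-li
Affiliations: University of Shanghai for Science and Technology, University of Shanghai for Science and Technology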
Funding: Supported by the National Defense Basic Research Program (JCKY2019413D001); the Shanghai Natural Science Foundation (19ZR1436000).
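Abstract (translated from the Chinese): In recent years, the tracking problem for nonzero-sum differential game systems with unknown dynamics has been studied; however, the existing methods are time-triggered and are ill-suited to environments with limited transmission bandwidth and computing resources. For continuous-time nonlinear nonzero-sum differential game systems with unknown dynamics, this paper proposes an event-triggered adaptive dynamic programming method based on integral reinforcement learning. Inspired by the gradient descent method and the experience replay technique, the strategy updates the neural network weights using both historical and current data. The method speeds up the convergence of the neural network weights and removes the initial admissible control assumption commonly used in designs in the literature. The algorithm also introduces a persistence of excitation (PE) condition that is easy to check online, avoiding the traditional PE condition, which is difficult to verify. Based on Lyapunov theory, the tracking error and the critic neural network estimation error are proved to be uniformly ultimately bounded. Finally, a numerical simulation example verifies the feasibility of the method.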

Keywords: nonzero-sum games  integral reinforcement learning  optimal tracking control  neural network  event-triggered
Received: 2021/12/29
Revised: 2022/10/25

Event-triggered optimal tracking control for nonzero-sum differential game systems
SHI Yi-bo and WANG Chao-li. Event-triggered optimal tracking control for nonzero-sum differential game systems[J]. Control Theory & Applications, 2023, 40(2): 220-230.
Authors:SHI Yi-bo and WANG Chao-li
Affiliation:University of Shanghai for Science and Technology,University of Shanghai for Science and Technology
Abstract: The tracking problem of nonzero-sum differential game systems with unknown dynamics has recently been discussed; however, the existing methods are time-triggered, which is not ideal in an environment with limited transmission bandwidth and computing resources. In this paper, an integral reinforcement learning based event-triggered adaptive dynamic programming scheme is developed for continuous-time nonlinear nonzero-sum differential game systems with unknown dynamics. The strategy is inspired by the gradient descent method and the experience replay technique, and it uses both historical and current data to update the neural network weights. This method improves the convergence speed of the neural network weights and removes the assumption of an initial admissible control that is often used in designs in the literature. In addition, the algorithm introduces a persistent excitation (PE) condition that is easy to check online, which avoids the traditional PE condition that is difficult to check. Based on Lyapunov theory, the uniform ultimate boundedness (UUB) of the tracking error and of the critic neural network estimation error is proved. Finally, a numerical simulation example is given to verify the feasibility of the proposed method.
Keywords:nonzero-sum games  integral reinforcement learning  optimal tracking control  neural network  event triggered
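
The abstract describes updating the neural network weights by gradient descent on both historical and current data, together with an excitation condition that can be checked online. The sketch below illustrates that general idea only and is not the authors' algorithm. It assumes a linear-in-weights approximator, a normalized squared-residual loss, and a rank test on the stored regressors as a stand-in for an online-checkable richness condition (a test common in the experience replay literature; the paper's actual condition is not stated in the abstract). The names `replay_is_rich`, `replay_update`, `w_true`, and all numeric values are hypothetical.

```python
import numpy as np

def replay_is_rich(history_phi, n_weights):
    """Assumed richness test: the stored regressors span the weight space."""
    if len(history_phi) == 0:
        return False
    return bool(np.linalg.matrix_rank(np.stack(history_phi)) == n_weights)

def replay_update(w, phi_now, y_now, history, lr=0.5):
    """One normalized gradient step over the current sample plus stored samples."""
    grad = np.zeros_like(w)
    for phi, y in [(phi_now, y_now)] + list(history):
        residual = w @ phi - y                       # approximation error on this sample
        grad += phi * residual / (1.0 + phi @ phi) ** 2
    return w - lr * grad

# Hypothetical target weights, used only to generate illustrative data.
w_true = np.array([1.0, -2.0, 0.5])

# Stored (historical) samples: regressor and the value it should map to.
stored_phi = [np.array([1.0, 0.0, 0.0]),
              np.array([0.0, 1.0, 0.0]),
              np.array([0.0, 0.0, 1.0]),
              np.array([1.0, 1.0, 1.0])]
history = [(phi, w_true @ phi) for phi in stored_phi]

rich = replay_is_rich(stored_phi, n_weights=3)

rng = np.random.default_rng(0)
w = np.zeros(3)                                      # no special initial guess is used here
for _ in range(200):
    phi = rng.normal(size=3)                         # current sample
    w = replay_update(w, phi, w_true @ phi, history)
```

In this toy setting the stored samples alone keep the update contracting toward `w_true` even when the current sample is uninformative, which is the intuition behind replacing a hard-to-verify excitation requirement on the live signal with a check on recorded data.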