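Adaptive critic control for multi-player non-zero-sum games with asymmetric constraints
Cite this article: LI Meng-hua, WANG Ding, QIAO Jun-fei. Adaptive critic control for multi-player non-zero-sum games with asymmetric constraints[J]. Control Theory & Applications, 2023, 40(9): 1562-1568
Authors: 李梦花 (LI Meng-hua)  王鼎 (WANG Ding)  乔俊飞 (QIAO Jun-fei)
Affiliation: Faculty of Information Technology, Beijing University of Technology (all three authors)
Funding: Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (2021ZD0112302, 2021ZD0112301), National Key Research and Development Program of China (2018YFC1900800–5), Beijing Natural Science Foundation (JQ19013), National Natural Science Foundation of China (62222301, 61890930–5, 62021003)
Abstract: This paper develops a neural-network-based adaptive critic control method for multi-player non-zero-sum games with asymmetric constraints in continuous-time nonlinear systems. First, a novel nonquadratic function is proposed to handle the asymmetric constraints, and the optimal control laws and the coupled Hamilton-Jacobi equations are derived. Notably, the optimal control strategies are nonzero when the system state is zero, unlike in earlier work. Then, a single critic network is constructed to approximate each player's optimal cost function, from which the associated approximate optimal control strategies are obtained. Meanwhile, a new weight update rule is developed for critic learning. In addition, Lyapunov theory is used to prove the stability of the critic-network weight approximation errors and of the closed-loop system state. Finally, simulation results verify the effectiveness of the proposed method.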

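Keywords: neural networks   adaptive critic control   adaptive dynamic programming   nonlinear systems   asymmetric constraints   multi-player non-zero-sum games
Received: 2022-01-21
Revised: 2023-07-14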

Adaptive critic control for multi-player non-zero-sum games with asymmetric constraints
LI Meng-hua, WANG Ding, QIAO Jun-fei. Adaptive critic control for multi-player non-zero-sum games with asymmetric constraints[J]. Control Theory & Applications, 2023, 40(9): 1562-1568
Authors: LI Meng-hua  WANG Ding  QIAO Jun-fei
Affiliation: Faculty of Information Technology, Beijing University of Technology (all three authors)
Abstract: In this paper, a neural-network-based adaptive critic control method is established for multi-player non-zero-sum games with asymmetric constraints in continuous-time nonlinear systems. First, a novel nonquadratic function is proposed to deal with the asymmetric constraints, and the optimal control laws and the coupled Hamilton-Jacobi equations are then derived. Notably, the optimal control strategies are nonzero when the system state is zero, unlike in previous work. Next, a single critic network is constructed to approximate the optimal cost function of each player, from which the associated approximate optimal control strategies are obtained. Meanwhile, a new weight updating rule is developed for critic learning. In addition, the stability of the critic-network weight estimation errors and of the closed-loop system state is proved using the Lyapunov method. Finally, simulation results verify the effectiveness of the proposed method.
Keywords:neural networks   adaptive critic control   adaptive dynamic programming   nonlinear systems   asymmetric constraints   multi-player non-zero-sum games
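
The abstract notes that, under asymmetric constraints, the optimal control is nonzero at the zero state. The sketch below illustrates why this can happen, using a tanh-based saturation form that is common in the adaptive dynamic programming literature. It is an assumption for illustration only: the abstract does not give the form of the paper's nonquadratic function, and the function name, the scalar `grad_term`, and the bounds used here are hypothetical.

```python
import math


def constrained_control(grad_term: float, u_min: float, u_max: float) -> float:
    """Illustrative saturated control law (assumed form, not taken from the paper).

    grad_term is a hypothetical scalar standing in for
    (1/2) * R^{-1} * g(x)^T * dV/dx, which vanishes when dV/dx = 0.
    """
    alpha = (u_max - u_min) / 2.0  # half-width of the admissible interval
    beta = (u_max + u_min) / 2.0   # midpoint; nonzero when the bounds are asymmetric
    # tanh keeps the output inside [u_min, u_max]; beta shifts the interval's center.
    return beta - alpha * math.tanh(grad_term / alpha)


# Asymmetric bounds chosen for illustration: u in [-1, 2].
u_at_zero_gradient = constrained_control(0.0, -1.0, 2.0)
u_large_positive = constrained_control(1e3, -1.0, 2.0)
u_large_negative = constrained_control(-1e3, -1.0, 2.0)
```

With these illustrative bounds, a zero gradient term yields the interval midpoint rather than zero, while large gradient terms saturate at the bounds. With symmetric bounds the midpoint is zero, and the familiar zero-control-at-origin behavior returns.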
