Similar Literature
1.
霍煜  王鼎  乔俊飞 《控制与决策》2023,38(11):3066-3074
For a class of continuous-time nonlinear systems with uncertainties, a robust tracking control method based on single-network critic learning is proposed. First, an augmented system composed of the tracking error and the reference trajectory is constructed, which converts the robust tracking control problem into a stabilization design problem. By adopting a cost function with a discount factor and a special utility term, the robust stabilization problem is further transformed into an optimal control problem. Then, a critic neural network is constructed to estimate the optimal cost function, from which the optimal tracking control algorithm is obtained. To relax the initial admissible control condition of the algorithm, an additional term is introduced into the weight update law of the critic neural network. The stability of the closed-loop system and the robust tracking performance are proven via the Lyapunov method. Finally, simulation results verify the effectiveness and applicability of the proposed method.
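As a rough illustration of the construction described in this abstract (the symbols below are generic placeholders, not taken from the paper), the augmented system and the discounted cost with a special utility term typically take the form

$$
\dot z=\begin{bmatrix}\dot e\\ \dot r\end{bmatrix}
=\begin{bmatrix} f(e+r)+g(e+r)u-h(r)\\ h(r)\end{bmatrix},\qquad
J(z_0)=\int_0^{\infty} e^{-\lambda\tau}\bigl[\rho^2(z)+z^{\top}Qz+u^{\top}Ru\bigr]\,\mathrm{d}\tau,
$$

where $e$ is the tracking error, $r$ the reference trajectory generated by $\dot r=h(r)$, $\lambda>0$ the discount factor, and $\rho(z)$ an upper bound on the system uncertainty; the critic network then approximates the optimal value of $J$.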

2.
An event-based iterative adaptive critic algorithm is designed to solve the zero-sum-game optimal tracking control problem for a class of non-affine systems. The steady control corresponding to the reference trajectory is obtained by a numerical method, so that the zero-sum-game optimal tracking control problem of the unknown nonlinear system is converted into an optimal regulation problem for the error system. To raise resource utilization while preserving good control performance of the closed-loop system, a suitable event-triggering condition is introduced to obtain a tracking policy pair that is updated only at the triggering instants. Based on the designed triggering condition, the asymptotic stability of the error system is proven via the Lyapunov method. Four neural networks are then constructed to facilitate the implementation of the proposed algorithm. To improve the accuracy of the steady control associated with the desired trajectory, a model network approximates the unknown system function directly rather than the error dynamics. A critic network, an action network, and a disturbance network are built to approximate the iterative cost function and the iterative tracking policy pair. Finally, two simulation examples verify the feasibility and effectiveness of the control method.

3.
For a class of discrete-time nonlinear systems with unknown dynamics and actuator saturation, a new optimal tracking control scheme is proposed. The scheme is based on an iterative adaptive dynamic programming algorithm. To achieve optimal control, a data-based identifier of the unknown system dynamics is first established. By introducing an M-network, an exact expression of the steady-state control is obtained. To eliminate the effect of actuator saturation, a non-quadratic performance index function is proposed. An iterative adaptive dynamic programming algorithm is then developed to obtain the optimal tracking control solution, together with a convergence analysis. To implement the scheme, neural networks are used to construct the data identifier, compute the performance index function, approximate the optimal control policy, and solve for the steady-state control. Simulation results verify the effectiveness of the proposed optimal tracking control method.
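The non-quadratic performance index mentioned here is commonly chosen, in this line of ADP work, so that the resulting control law saturates naturally; a representative (not necessarily the paper's exact) form is

$$
W(u_k)=2\int_{0}^{u_k}\bigl(\bar U\tanh^{-1}(v/\bar U)\bigr)^{\top}R\,\mathrm{d}v,
\qquad
J=\sum_{k=0}^{\infty}\bigl[x_k^{\top}Qx_k+W(u_k)\bigr],
$$

where $\bar U$ is the actuator saturation bound, so the optimal control derived from $J$ stays within $[-\bar U,\bar U]$.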

4.
For injection-velocity control, a feedforward-feedback control method based on zero-phase-filtered iterative learning is designed. The feedforward part of the algorithm uses iterative learning control, exploiting information from previous batches to achieve accurate tracking; the feedback part uses proportional control to reject disturbances within the current batch. To eliminate the bad learning transients that appear when the learning algorithm is applied, the convergence condition of the proposed algorithm in the frequency domain is derived first, and a monotonically decreasing learning transient based on zero-phase filtering is then obtained. Experiments verify that the method yields a monotonically decreasing learning-transient curve and satisfies practical injection-control requirements well.
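A minimal sketch of the batch update described above, assuming a SciPy environment; the Q-filter order and cutoff, the learning gain L, and the proportional gain Kp are illustrative values, not those identified in the paper:

```python
# Hedged sketch of one batch of zero-phase-filtered ILC with P feedback.
import numpy as np
from scipy.signal import butter, filtfilt

def ilc_batch_update(u_ff, e_prev, L=0.8, cutoff=0.2):
    """Feedforward update u_{k+1} = Q(u_k + L*e_k), with Q applied by
    forward-backward (zero-phase) filtering so it adds no phase lag."""
    b, a = butter(2, cutoff)                  # low-pass Q-filter, normalized cutoff
    return filtfilt(b, a, u_ff + L * e_prev)  # zero-phase filtering over the batch

def control_input(u_ff_t, y_ref_t, y_t, Kp=1.5):
    """Total input at time t of the current batch: learned feedforward + P feedback."""
    return u_ff_t + Kp * (y_ref_t - y_t)
```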

5.
蔡霞  马社祥  孟鑫 《计算机应用研究》2012,29(11):4232-4234
To address the sharply increased computational complexity and degraded performance of traditional algorithms when handling large-scale signals in sensor networks, a heuristic synchronous adaptive iterative thresholding reconstruction algorithm is proposed. A heuristic error-control function selects the least-cost direction and shrinks row by row synchronously toward the optimal solution, and a nonlinear threshold function determined by an adaptively decreasing power-exponent parameter is combined with it to further check and correct the reconstructed signal. Simulation results show that the algorithm reconstructs the signal with fewer measurements and fewer iterations, and its signal-to-noise ratio is improved by 60 dB.
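For context, a plain iterative shrinkage-thresholding baseline for compressed-sensing reconstruction is sketched below; the heuristic error-control function and the adaptive power-exponent threshold of the proposed algorithm are not reproduced here, only the generic iteration they build on:

```python
# Generic iterative shrinkage-thresholding (IST) baseline; parameters are examples.
import numpy as np

def ist_reconstruct(Phi, y, n_iter=200, step=None, lam=0.1):
    """Recover sparse x from y = Phi @ x via x <- soft(x + step*Phi^T(y - Phi x), step*lam)."""
    n = Phi.shape[1]
    if step is None:
        step = 1.0 / np.linalg.norm(Phi, 2) ** 2     # 1/L, L = squared spectral norm
    x = np.zeros(n)
    for _ in range(n_iter):
        descent = Phi.T @ (y - Phi @ x)              # negative gradient of the data-fit term
        z = x + step * descent
        x = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft threshold
    return x
```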

6.
陶洪峰  李健  杨慧中 《控制与决策》2021,36(6):1435-1441
To handle the control problem in which, for repetitively operating industrial systems such as robotic manipulators, the output need not track the full trajectory over a finite time but only certain key points on the desired trajectory, a norm-optimal point-to-point iterative learning control algorithm is proposed for linear time-invariant discrete systems. A comprehensive multi-point performance index function is constructed through the lifted input-output time-series matrix model, and the optimized iterative learning control law is obtained by solving the quadratic optimization. Sufficient convergence conditions of the robust control algorithm, expressed in maximum-singular-value form, are given for both the nominal and the uncertain model cases, and the convergence results are further extended to the optimized control algorithm for input-constrained systems. Finally, the effectiveness of the algorithm is verified on a three-axis gantry robot model.
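In lifted form, a norm-optimal point-to-point ILC update of the kind described here can be written schematically as ($\Psi$ selects the key time points, $G$ is the lifted plant matrix, and $Q,R$ are weighting matrices)

$$
u_{k+1}=\arg\min_{u}\;\|\Psi(y_d-Gu)\|_Q^2+\|u-u_k\|_R^2
= u_k+\bigl(R+G^{\top}\Psi^{\top}Q\Psi G\bigr)^{-1}G^{\top}\Psi^{\top}Q\,e_k^{p},
$$

where $e_k^{p}=\Psi(y_d-Gu_k)$ is the point-to-point tracking error at iteration $k$.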

7.
林小峰  丁强 《控制与决策》2015,30(3):495-499
To solve finite-horizon optimal control problems, adaptive dynamic programming (ADP) algorithms typically require that the controlled system can be driven to zero in one step. For nonlinear systems that cannot be controlled to zero in one step, an improved ADP algorithm is proposed in which the initial cost function is constructed from an arbitrary finite-time admissible control sequence. The iterative procedure of the algorithm is derived and its convergence is proven. When the approximation error of the critic network is taken into account and the stated assumptions hold, the iterative cost function converges to a bounded neighborhood of the optimal cost function. A simulation example verifies the effectiveness of the proposed method.
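The iterative procedure referred to here follows the standard value-iteration pattern (written schematically; the paper's contribution lies in how $V_0$ is built from a finite-time admissible sequence):

$$
V_0(x_k)\ \text{constructed from an admissible control sequence},\qquad
V_{i+1}(x_k)=\min_{u}\bigl\{U(x_k,u)+V_i\bigl(F(x_k,u)\bigr)\bigr\},\qquad
v_i(x_k)=\arg\min_{u}\bigl\{U(x_k,u)+V_i\bigl(F(x_k,u)\bigr)\bigr\}.
$$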

8.
For the tracking problem of non-minimum-phase systems, a new basis-function iterative learning control algorithm is proposed. The algorithm approximates the inverse transfer function of the system with novel non-causal Laguerre expansion basis functions, and an optimal iterative learning law is designed so that the system input converges to the stable inverse of the system, guaranteeing control performance. The algorithm does not rely on a prior model of the system; it only requires model identification with basis-function signals applied as system inputs, which reduces the influence of model uncertainty. Simulations on a single-link flexible manipulator, a typical non-minimum-phase system, verify the good performance of the method.

9.
The discounted-cost performance optimization problem for a class of semi-Markov control processes (SMCPs) is discussed. By introducing a matrix that can serve as the infinitesimal generator of a Markov process, a discounted Poisson equation is defined for an SMCP, and the α-potential is defined from this equation. Based on the α-potential, the optimality equation satisfied by the optimal stationary policy is given. Finally, an iterative algorithm for computing the optimal stationary policy is presented, and a numerical example illustrates its application.

10.
For a class of discrete nonlinear repetitive systems with random input/state disturbances, output disturbances, and initial states that do not strictly match the given desired values, a P-type open-closed-loop robust iterative learning trajectory tracking control algorithm is proposed. Strict robust stability of the algorithm is proven based on λ-norm theory, and the gain-matrix parameters of the P-type open-closed-loop iterative learning control law are optimized through a multi-objective performance index, which guarantees monotone convergence of the output tracking error with respect to the desired trajectory under the optimized algorithm, thereby improving the convergence speed and tracking accuracy of the learning algorithm. Finally, simulation on a mobile robot moving in the plane verifies the feasibility and effectiveness of the algorithm.
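A P-type open-closed-loop law of the kind analyzed here typically combines the previous-iteration error (open-loop part) with the current-iteration error (closed-loop part); schematically,

$$
u_{k+1}(t)=u_k(t)+\Gamma_1\,e_k(t+1)+\Gamma_2\,e_{k+1}(t),
$$

where $e_k(t)=y_d(t)-y_k(t)$ and the gain matrices $\Gamma_1,\Gamma_2$ are the parameters tuned through the multi-objective performance index.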

11.
The core task of tracking control is to make the controlled plant track a desired trajectory. The traditional performance index used in previous studies cannot completely eliminate the tracking error as the number of time steps increases. In this paper, a new cost function is introduced to develop the value-iteration-based adaptive critic framework to solve the tracking control problem. Unlike the regulator problem, the iterative value function of the tracking control problem cannot be regarded as a Lyapunov function. A novel stability analysis method is developed to guarantee that the tracking error converges to zero. The discounted iterative scheme under the new cost function for the special case of linear systems is elaborated. Finally, the tracking performance of the present scheme is demonstrated by numerical results and compared with those of the traditional approaches.

12.
In this paper, a novel value iteration adaptive dynamic programming (ADP) algorithm, called “generalized value iteration ADP” algorithm, is developed to solve infinite horizon optimal tracking control problems for a class of discrete-time nonlinear systems. The developed generalized value iteration ADP algorithm permits an arbitrary positive semi-definite function to initialize it, which overcomes the disadvantage of traditional value iteration algorithms. Convergence property is developed to guarantee that the iterative performance index function will converge to the optimum. Neural networks are used to approximate the iterative performance index function and compute the iterative control policy, respectively, to implement the iterative ADP algorithm. Finally, a simulation example is given to illustrate the performance of the developed algorithm.
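As a toy illustration of the generalized value iteration idea (an arbitrary positive semi-definite initialization followed by the usual VI recursion), the sketch below uses a 1-D state grid instead of the neural networks employed in the paper; the dynamics, utility, and initialization are made-up examples:

```python
# Sketch of generalized value iteration on a 1-D discrete-time nonlinear system.
import numpy as np

f   = lambda x, u: 0.8 * np.sin(x) + u        # example dynamics x_{k+1} = f(x_k, u_k)
U   = lambda x, u: x**2 + u**2                # quadratic utility
psi = lambda x: 5.0 * x**2                    # arbitrary positive semi-definite V_0

xs = np.linspace(-2, 2, 201)                  # state grid
us = np.linspace(-2, 2, 81)                   # control grid
V = psi(xs)

for i in range(100):                          # V_{i+1}(x) = min_u [ U(x,u) + V_i(f(x,u)) ]
    Xn = f(xs[:, None], us[None, :])          # next state for every (x, u) pair
    Q = U(xs[:, None], us[None, :]) + np.interp(Xn, xs, V)
    V_new = Q.min(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-6:      # stop once the value function settles
        break
    V = V_new

u_opt = us[Q.argmin(axis=1)]                  # greedy control on the grid
```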

13.
In this paper we discuss an online algorithm based on policy iteration for learning the continuous-time (CT) optimal control solution with infinite horizon cost for nonlinear systems with known dynamics. That is, the algorithm learns online in real-time the solution to the optimal control design HJ equation. This method finds in real-time suitable approximations of both the optimal cost and the optimal control policy, while also guaranteeing closed-loop stability. We present an online adaptive algorithm implemented as an actor/critic structure which involves simultaneous continuous-time adaptation of both actor and critic neural networks. We call this ‘synchronous’ policy iteration. A persistence of excitation condition is shown to guarantee convergence of the critic to the actual optimal value function. Novel tuning algorithms are given for both critic and actor networks, with extra nonstandard terms in the actor tuning law being required to guarantee closed-loop dynamical stability. The convergence to the optimal controller is proven, and the stability of the system is also guaranteed. Simulation examples show the effectiveness of the new algorithm.

14.
15.
Based on adaptive dynamic programming (ADP), the fixed-point tracking control problem is solved by a value iteration (VI) algorithm. First, a class of discrete-time (DT) nonlinear systems with disturbance is considered. Second, the convergence of the VI algorithm is given. It is proven that the iterative cost function converges precisely to the optimal value, and that the control input and disturbance input also converge to their optimal values. Third, a novel analysis pertaining to the range of the discount factor is presented, where the cost function serves as a Lyapunov function. Finally, neural networks (NNs) are employed to approximate the cost function, the control law, and the disturbance law. Simulation examples are given to illustrate the effective performance of the proposed method.

16.
In this paper, a novel iterative adaptive dynamic programming (ADP) algorithm, called generalised policy iteration ADP algorithm, is developed to solve optimal tracking control problems for discrete-time nonlinear systems. The idea is to use two iteration procedures, including an i-iteration and a j-iteration, to obtain the iterative tracking control laws and the iterative value functions. By system transformation, we first convert the optimal tracking control problem into an optimal regulation problem. Then the generalised policy iteration ADP algorithm, which is a general idea of interacting policy and value iteration algorithms, is introduced to deal with the optimal regulation problem. The convergence and optimality properties of the generalised policy iteration algorithm are analysed. Three neural networks are used to implement the developed algorithm. Finally, simulation examples are given to illustrate the performance of the present algorithm.
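Written schematically, the two iteration loops mentioned in this abstract interleave a finite number of policy-evaluation sweeps (the j-iteration) with a policy-improvement step (the i-iteration):

$$
V_i^{(j+1)}(x_k)=U\bigl(x_k,v_i(x_k)\bigr)+V_i^{(j)}\bigl(F(x_k,v_i(x_k))\bigr),\qquad j=0,\dots,N_i-1,
$$
$$
V_{i+1}=V_i^{(N_i)},\qquad
v_{i+1}(x_k)=\arg\min_{u}\bigl\{U(x_k,u)+V_{i+1}\bigl(F(x_k,u)\bigr)\bigr\}.
$$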

17.
Using the idea of data-driven control, an iterative neural dynamic programming method is established for designing near-optimal regulators of discrete-time nonlinear systems. An iterative adaptive dynamic programming algorithm for general discrete-time nonlinear systems is proposed, and its convergence and optimality are proven. By constructing three kinds of neural networks, the globalized dual heuristic dynamic programming technique and its detailed implementation are presented, in which the action network is trained within the framework of neural dynamic programming. This novel structure can approximate the cost function and its derivative while adaptively learning the near-optimal control law without relying on the system dynamics. Notably, this clearly improves upon existing results on iterative adaptive dynamic programming by lowering the requirements on the control matrix or its neural-network representation, which can promote data-based optimization and control design for complex nonlinear systems. Two simulation experiments verify the effectiveness of the proposed data-driven optimal regulation method.

18.
For the tracking problem of linear time-invariant discrete systems, a high-order parameter-optimal iterative learning control algorithm is proposed. The algorithm builds a parameter-optimization objective function that accounts for the errors of multiple past iterations, and the optimized time-varying learning-gain parameters are obtained by solving it. It is proven theoretically that, for linear discrete time-invariant systems, the algorithm still guarantees monotone convergence of the tracking error to zero under the relaxed condition that the controlled plant need not be positive definite. Moreover, the high-order algorithm that uses information from several previous iterations exhibits better convergence and robustness. Finally, a simulation example verifies the effectiveness of the algorithm.

19.
The convergence condition of conventional iterative learning control (ILC) for the output probability density function (PDF) is introduced, and the corresponding iterative learning law is designed using this condition. The main focus is on how to handle over-iteration and convergence-speed issues in output-PDF ILC. Based on a discrete output-PDF control model, the necessary condition for convergence of the direct iterative learning control algorithm is introduced, and an adaptive tuning method for the iterative learning parameters together with a stopping condition that avoids over-iteration is proposed; these measures guarantee that the iterative control of the output PDF converges with a fast convergence rate. Simulation results show that the adaptive ILC of the output PDF converges quickly, while the learning termination condition effectively avoids over-iteration.

20.
In this paper, a finite-horizon neuro-optimal tracking control strategy for a class of discrete-time nonlinear systems is proposed. Through system transformation, the optimal tracking problem is converted into designing a finite-horizon optimal regulator for the tracking error dynamics. Then, with convergence analysis in terms of cost function and control law, the iterative adaptive dynamic programming (ADP) algorithm via heuristic dynamic programming (HDP) technique is introduced to obtain the finite-horizon optimal tracking controller which makes the cost function close to its optimal value within an ε-error bound. Three neural networks are used as parametric structures to implement the algorithm, aiming to approximate the cost function, the control law, and the error dynamics, respectively. Two simulation examples are included to complement the theoretical discussions.
