
1.

Stable iterative adaptive dynamic programming algorithm with approximation errors for discrete-time nonlinear systems

Qinglai Wei Derong Liu 《Neural computing & applications》2014,24(6):1355-1367

In this paper, a novel iterative adaptive dynamic programming (ADP) algorithm is developed to solve infinite horizon optimal control problems for discrete-time nonlinear systems. When the iterative control law and the iterative performance index function cannot be obtained accurately in each iteration, it is shown that the iterative controls make the performance index function converge to within a finite error bound of the optimal performance index function. Stability properties are presented to show that the system can be stabilized under the iterative control law, which makes the present iterative ADP algorithm feasible for both on-line and off-line implementation. Neural networks are used to approximate the iterative performance index function and to compute the iterative control policy, respectively, in order to implement the iterative ADP algorithm. Finally, two simulation examples are given to illustrate the performance of the present method.
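As a toy illustration of the idea that inexact iterations can still settle within a finite band around the optimum, the sketch below runs value iteration on a scalar linear-quadratic problem and perturbs every update by a bounded relative error. The system x_{k+1} = a*x_k + b*u_k, the utility q*x^2 + r*u^2, the relative error model, and all function names are assumptions chosen for illustration. They are not taken from the paper, whose error model and bound may differ.

```python
import random

def bellman_update(p, a=1.2, b=1.0, q=1.0, r=1.0):
    """One exact value-iteration step for V(x) = p*x^2 on the assumed
    scalar system x_{k+1} = a*x_k + b*u_k with utility q*x^2 + r*u^2."""
    return q + a * a * r * p / (r + b * b * p)

def run(eps=0.05, iters=200, seed=0):
    """Value iteration where each update carries a relative error in [-eps, eps].

    lo and hi are the worst-case sequences (error always -eps or +eps).
    Because bellman_update is increasing and positive for p >= 0, every
    noisy iterate is sandwiched between them.
    """
    rng = random.Random(seed)
    p = lo = hi = 0.0  # zero initial value function (an assumption)
    history = []
    for _ in range(iters):
        e = rng.uniform(-eps, eps)          # hypothetical approximation error
        p = (1.0 + e) * bellman_update(p)
        lo = (1.0 - eps) * bellman_update(lo)
        hi = (1.0 + eps) * bellman_update(hi)
        history.append((lo, p, hi))
    return history

history = run()
lo, p, hi = history[-1]
print(f"noisy iterate {p:.4f} stays inside [{lo:.4f}, {hi:.4f}]")
```

For these assumed parameters the exact optimum is p* ≈ 1.9522, and the band [lo, hi] brackets it; shrinking eps narrows the band.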

2.

Dual iterative adaptive dynamic programming for a class of discrete-time nonlinear systems with time-delays

Qinglai Wei Ding Wang Dehua Zhang 《Neural computing & applications》2013,23(7-8):1851-1863

In this paper, a new dual iterative adaptive dynamic programming (ADP) algorithm is developed to solve optimal control problems for a class of nonlinear systems with time-delays in state and control variables. The idea is to use dynamic programming theory to derive expressions for the optimal performance index function and the optimal control. Then, the dual iterative ADP algorithm is introduced to obtain the optimal solutions iteratively, where in each iteration the performance index function and the system states are both updated. A convergence analysis is presented to prove that the performance index function reaches the optimum under the proposed method. Neural networks are used to approximate the performance index function and to compute the optimal control policy, respectively, to facilitate the implementation of the dual iterative ADP algorithm. Simulation examples are given to demonstrate the validity of the proposed optimal control scheme.

3.

Policy Iteration for Optimal Control of Discrete-Time Time-Varying Nonlinear Systems

Guangyu Zhu Xiaolu Li Ranran Sun Yiyuan Yang Peng Zhang 《IEEE/CAA Journal of Automatica Sinica》2023,10(3):781-791

In this paper, a new iterative adaptive dynamic programming algorithm, the discrete-time time-varying policy iteration (DTTV) algorithm, is developed for infinite horizon optimal control problems of discrete time-varying nonlinear systems. The iterative control law is designed to update the iterative value function, which approximates the optimal performance index function. The admissibility of the iterative control law is analyzed. The results show that the iterative value function converges monotonically non-increasingly to the optimal solution of the Bellman equation. To implement the algorithm, neural networks are employed and a new implementation structure is established, which avoids solving the generalized Bellman equation in each iteration. Finally, the optimal control laws for torsional pendulum and inverted pendulum systems are obtained by using the DTTV policy iteration algorithm, where the mass and the pendulum bar length are permitted to be time-varying parameters. The effectiveness of the developed method is illustrated by numerical results and comparisons.

4.

A novel optimal tracking control scheme for a class of discrete-time nonlinear systems using generalised policy iteration adaptive dynamic programming algorithm

Qiao Lin Derong Liu 《International journal of systems science》2017,48(3):525-534

In this paper, a novel iterative adaptive dynamic programming (ADP) algorithm, called the generalised policy iteration ADP algorithm, is developed to solve optimal tracking control problems for discrete-time nonlinear systems. The idea is to use two iteration procedures, an i-iteration and a j-iteration, to obtain the iterative tracking control laws and the iterative value functions. By system transformation, the optimal tracking control problem is first converted into an optimal regulation problem. Then the generalised policy iteration ADP algorithm, which combines the ideas of policy iteration and value iteration, is introduced to deal with the optimal regulation problem. The convergence and optimality properties of the generalised policy iteration algorithm are analysed. Three neural networks are used to implement the developed algorithm. Finally, simulation examples are given to illustrate the performance of the present algorithm.
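The interplay of the two loops can be sketched on a scalar linear-quadratic regulation problem, where the value function is V(x) = p*x^2 and the control is u = -k*x. The outer loop plays the role of the i-iteration (policy improvement) and the inner loop the role of the j-iteration (a fixed number of policy-evaluation sweeps). The system, the zero initial value function, and all names below are assumptions for illustration only; the paper treats nonlinear tracking problems with neural-network approximators and its own initial conditions.

```python
import math

def generalized_policy_iteration(a, b, q, r, n_eval, n_outer=60):
    """Scalar sketch for x_{k+1} = a*x_k + b*u_k with utility q*x^2 + r*u^2."""
    p = 0.0  # assumed initial value function V(x) = 0
    for _ in range(n_outer):
        # i-iteration: control law that is greedy with respect to the current p
        k = a * b * p / (r + b * b * p)
        # j-iteration: n_eval partial evaluation sweeps of that control law
        for _ in range(n_eval):
            p = q + r * k * k + (a - b * k) ** 2 * p
    return p

def riccati_solution(a, b, q, r):
    """Positive root of b^2*p^2 + (r - a^2*r - q*b^2)*p - q*r = 0."""
    c1 = r - a * a * r - q * b * b
    return (-c1 + math.sqrt(c1 * c1 + 4.0 * b * b * q * r)) / (2.0 * b * b)

p_star = riccati_solution(1.2, 1.0, 1.0, 1.0)
for n_eval in (1, 3, 10):
    p = generalized_policy_iteration(1.2, 1.0, 1.0, 1.0, n_eval)
    print(n_eval, p, abs(p - p_star))
```

With n_eval = 1 the scheme reduces to value iteration, while a large n_eval approaches policy iteration; all three settings reach the same Riccati solution in this toy case.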

5.

A neural-network-based iterative GDHP approach for solving a class of nonlinear optimal control problems with control constraints (total citations: 2; self-citations: 1; citations by others: 1)

Ding Wang Derong Liu Dongbin Zhao Yuzhu Huang Dehua Zhang 《Neural computing & applications》2013,22(2):219-227

In this paper, a novel neural-network-based iterative adaptive dynamic programming (ADP) algorithm is proposed. It aims at solving the optimal control problem of a class of nonlinear discrete-time systems with control constraints. By introducing a generalized nonquadratic functional, the iterative ADP algorithm is developed through the globalized dual heuristic programming technique to design the optimal controller, with convergence analysis. Three neural networks are constructed as parametric structures to facilitate the implementation of the iterative algorithm. At each iteration they approximate the cost function, the optimal control law, and the controlled nonlinear discrete-time system, respectively. A simulation example is also provided to verify the effectiveness of the control scheme in solving the constrained optimal control problem.
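The abstract does not spell out the nonquadratic functional. A form frequently used in the constrained-control ADP literature, assumed here purely for illustration, is W(u) = 2*r*u_max * ∫_0^u atanh(s/u_max) ds for a scalar control with bound |u| < u_max. The sketch below evaluates its closed form and cross-checks it against a midpoint-rule integral; the function names and parameter values are hypothetical.

```python
import math

def nonquadratic_cost(u, u_max, r=1.0):
    """Closed form of the assumed functional
    W(u) = 2*r*u_max * integral_0^u atanh(s/u_max) ds, valid for |u| < u_max."""
    z = u / u_max
    return 2.0 * r * u_max * (u * math.atanh(z) + 0.5 * u_max * math.log(1.0 - z * z))

def nonquadratic_cost_numeric(u, u_max, r=1.0, n=2000):
    """Midpoint-rule approximation of the same integral, used as a cross-check."""
    h = u / n
    total = sum(math.atanh((i + 0.5) * h / u_max) for i in range(n))
    return 2.0 * r * u_max * h * total

for u in (0.1, 0.5, 0.9):
    closed = nonquadratic_cost(u, 1.0)
    numeric = nonquadratic_cost_numeric(u, 1.0)
    # For small u the functional is close to the quadratic penalty r*u^2.
    print(u, closed, numeric, u * u)
```

A functional of this shape is commonly paired with control laws of the form u = u_max * tanh(...), which stay inside the bound by construction. Whether the paper uses exactly this form is not stated in the abstract.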

6.

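Intelligent optimal tracking based on discounted generalized value iteration, with application verification

Ding Wang Mingming Zhao Mingming Ha Junfei Qiao 《Acta Automatica Sinica》2022,48(1):182-193

An intelligent algorithm based on discounted generalized value iteration is designed to solve the optimal tracking control problem for a class of complex nonlinear systems. By choosing an appropriate initial value, the cost function in the value iteration process converges to the optimal cost function in a monotonically decreasing manner. Based on this monotonically decreasing value iteration algorithm, the admissibility of the iterative tracking control law and the asymptotic stability of the error system are discussed under different discount factors. To facilitate the implementation of the algorithm, a data-driven model network is established to...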

7.

Neural-network-based approach to finite-time optimal control for a class of unknown nonlinear systems

Ruizhuo Song Wendong Xiao Qinglai Wei Changyin Sun 《Soft Computing - A Fusion of Foundations, Methodologies and Applications》2014,18(8):1645-1653

This paper proposes a novel finite-time optimal control method based on input–output data for unknown nonlinear systems using the adaptive dynamic programming (ADP) algorithm. In this method, a single-hidden layer feed-forward network (SLFN) with extreme learning machine (ELM) is used to construct the data-based identifier of the unknown system dynamics. Based on the data-based identifier, the finite-time optimal control method is established by the ADP algorithm. Two other SLFNs with ELM are used in the ADP method to facilitate the implementation of the iterative algorithm; they approximate the performance index function and the optimal control law at each iteration, respectively. A simulation example is provided to demonstrate the effectiveness of the proposed control scheme.

8.

Optimal Constrained Self-learning Battery Sequential Management in Microgrid Via Adaptive Dynamic Programming

Qinglai Wei Derong Liu Yu Liu Ruizhuo Song 《IEEE/CAA Journal of Automatica Sinica》2017,4(2):168-176

This paper concerns a novel optimal self-learning battery sequential control scheme for smart home energy systems. The main idea is to use the adaptive dynamic programming (ADP) technique to obtain the optimal battery sequential control iteratively. First, the battery energy management system model is established, where the power efficiency of the battery is considered. Next, considering the power constraints of the battery, a new non-quadratic performance index function is established, which guarantees that the value of the iterative control law cannot exceed the maximum charging/discharging power of the battery, so as to extend the service life of the battery. Then, the convergence properties of the iterative ADP algorithm are analyzed, which guarantee that the iterative value function and the iterative control law both reach their optimums. Finally, simulation and comparison results are given to illustrate the performance of the presented method.

9.

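An optimal control scheme for a class of discrete-time nonlinear systems with time delays based on adaptive dynamic programming (total citations: 4; self-citations: 3; citations by others: 1)

Qinglai Wei Huaguang Zhang Derong Liu Yan Zhao 《Acta Automatica Sinica》2010,36(1):121-129

For the optimal control problem with a quadratic performance index function for a class of nonlinear systems with time delays in both state and control variables, this paper proposes an optimal control scheme based on a new iterative adaptive dynamic programming algorithm. By introducing a delay matrix function and applying dynamic programming theory, an explicit expression of the optimal control is obtained, and the optimal control is then computed through the adaptive critic technique. A convergence proof is given to guarantee that the performance index function converges to the optimum. To implement the proposed algorithm, neural networks are used to approximate the performance index function, compute the optimal control policy, solve for the delay matrix function, and model the nonlinear system. Finally, two simulation examples illustrate the effectiveness of the proposed optimal control scheme.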

10.

Optimal Fixed-Point Tracking Control for Discrete-Time Nonlinear Systems via ADP

Ruizhuo Song Liao Zhu 《IEEE/CAA Journal of Automatica Sinica》2019,6(3):657-666

Based on adaptive dynamic programming (ADP), the fixed-point tracking control problem is solved by a value iteration (VI) algorithm. First, a class of discrete-time (DT) nonlinear systems with disturbance is considered. Second, the convergence of the VI algorithm is given. It is proven that the iterative cost function converges to the optimal value, and that the control input and the disturbance input also converge to their optimal values. Third, a novel analysis pertaining to the range of the discount factor is presented, where the cost function serves as a Lyapunov function. Finally, neural networks (NNs) are employed to approximate the cost function, the control law, and the disturbance law. Simulation examples are given to illustrate the effective performance of the proposed method.

11.

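Optimal control of nonlinear systems with time delay and saturation based on the ADP algorithm

Xiaofeng Lin Yuanjun Huang Chunning Song 《Information and Control》2012,41(2):185-192

For the optimal control problem of a class of discrete-time nonlinear systems with control time delay and saturation, the control coupling terms in the performance index function are handled by reconstructing the performance index function and applying a corresponding system transformation. A suitable functional is then introduced to deal with the control saturation. A new performance index function is given, and the optimal control is obtained by an iterative adaptive dynamic programming (ADP) algorithm. To implement the algorithm, neural networks are used as function approximators to solve the optimal control problem. Simulation results verify the effectiveness of the method.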

12.

Finite horizon optimal control of discrete-time nonlinear systems with unfixed initial state using adaptive dynamic programming

Qinglai WEI Derong LIU 《Control Theory and Applications (English Edition)》2011,9(3):381-390

In this paper, we aim to solve the finite horizon optimal control problem for a class of discrete-time nonlinear systems with unfixed initial state using the adaptive dynamic programming (ADP) approach. A new ε-optimal control algorithm based on the iterative ADP approach is proposed, which makes the performance index function converge iteratively, in finite time, to the greatest lower bound of all performance indices within an error determined by ε. The optimal number of control steps can also be obtained by the proposed ε-optimal control algorithm for the situation where the initial state of the system is unfixed. Neural networks are used to approximate the performance index function and to compute the optimal control policy, respectively, to facilitate the implementation of the ε-optimal control algorithm. Finally, a simulation example is given to show the results of the proposed method.

13.

Data-Based Optimal Tracking of Autonomous Nonlinear Switching Systems

Xiaofeng Li Lu Dong Changyin Sun 《IEEE/CAA Journal of Automatica Sinica》2021,8(1):227-238

In this paper, a data-based scheme is proposed to solve the optimal tracking problem of autonomous nonlinear switching systems. The system state is forced to track the reference signal by minimizing the performance function. First, the problem is transformed into solving the corresponding Bellman optimality equation in terms of the Q-function (also called the action value function). Then, an iterative algorithm based on adaptive dynamic programming (ADP) is developed to find the optimal solution, relying entirely on sampled data. A linear-in-parameter (LIP) neural network is taken as the value function approximator. Considering the presence of approximation error at each iteration step, the generated sequence of approximated value functions is proved to be bounded around the exact optimal solution under some verifiable assumptions. Moreover, the effect of terminating the learning process after a finite number of iterations is investigated. A sufficient condition for asymptotic stability of the tracking error is derived. Finally, the effectiveness of the algorithm is demonstrated with three simulation examples.

14.

Finite horizon optimal control of non-linear discrete-time switched systems using adaptive dynamic programming with ε-error bound

Chunbin Qin Yanhong Luo Binrui Wang 《International journal of systems science》2014,45(8):1683-1693

In this paper, we aim to solve the finite-horizon optimal control problem for a class of non-linear discrete-time switched systems using the adaptive dynamic programming (ADP) algorithm. A new ε-optimal control scheme based on the iterative ADP algorithm is presented, which makes the value function converge iteratively, in finite time, to the greatest lower bound of all value function indices within an error determined by ε. Two neural networks are used as parametric structures to implement the iterative ADP algorithm with ε-error bound; they approximate the value function and the control policy, respectively. The optimal control policy is then obtained. Finally, a simulation example is included to illustrate the applicability of the proposed method.

15.

Neural-Network-Based Control for Discrete-Time Nonlinear Systems with Input Saturation Under Stochastic Communication Protocol

Xueli Wang Derui Ding Hongli Dong Xian-Ming Zhang 《IEEE/CAA Journal of Automatica Sinica》2021,8(4):766-778

In this paper, an adaptive dynamic programming (ADP) strategy is investigated for discrete-time nonlinear systems with unknown nonlinear dynamics subject to input saturation. To save communication resources between the controller and the actuators, stochastic communication protocols (SCPs) are adopted to schedule the control signal, and therefore the closed-loop system is essentially a protocol-induced switching system. A neural network (NN)-based identifier with a robust term is exploited for approximating the unknown nonlinear system, and a set of switch-based updating rules for the NN weights, with an additional tunable parameter, is developed with the help of gradient descent. By virtue of a novel Lyapunov function, a sufficient condition is proposed to achieve the stability of both the system identification errors and the update dynamics of the NN weights. Then, an offline value iteration ADP algorithm is proposed to solve the optimal control of protocol-induced switching systems with saturation constraints, and its convergence is discussed by mathematical induction. Furthermore, an actor-critic NN scheme is developed to approximate the control law and the proposed performance index function in the framework of ADP, and the stability of the closed-loop system is analyzed by Lyapunov theory. Finally, numerical simulation results are presented to demonstrate the effectiveness of the proposed control scheme.

16.

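Design of guaranteed-cost iterative learning algorithms based on the LMI method (total citations: 4; self-citations: 0; citations by others: 4)

Shengyue Yang Xiaoping Fan Xiaohong Nian Zhihua Qu An Luo Shenxi Huang 《Acta Automatica Sinica》2006,32(4):578-585

This paper studies the design and optimization of performance-based iterative learning algorithms. A quadratic performance function in the iteration domain is first defined, and an iteration-domain optimal iterative learning algorithm is then given for linear discrete-time systems. Based on the linear matrix inequality (LMI) method, a guaranteed-cost iterative learning algorithm and its optimization method are given for uncertain linear discrete-time systems. For both classes of algorithms, the convergence speed of the iterative learning can be tuned by adjusting the weighting matrices in the performance function. In addition, the design and optimization of the guaranteed-cost iterative learning algorithm can be carried out conveniently with the MATLAB toolbox.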
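For readers unfamiliar with iterative learning, the sketch below shows the basic mechanism on a scalar plant using a plain P-type update, u_{j+1}(t) = u_j(t) + gain * e_j(t+1). This is a deliberately simpler technique than the LMI-based guaranteed-cost design described in the abstract, and the plant, reference, gains, and function names are all assumptions for illustration. It only demonstrates how a tuning parameter changes the trial-to-trial convergence speed.

```python
import math

def simulate(u, a=0.5, b=1.0):
    """Outputs y(1..T) of the assumed plant x(t+1) = a*x(t) + b*u(t), y = x, x(0) = 0."""
    x, ys = 0.0, []
    for ut in u:
        x = a * x + b * ut
        ys.append(x)
    return ys

def p_type_ilc(reference, gain, trials):
    """Repeat the same task `trials` times, correcting the input after each trial.

    reference[i] is the desired y(i+1), so e[i] is the error caused by u[i].
    Returns the maximum absolute tracking error recorded in each trial.
    """
    u = [0.0] * len(reference)
    errors = []
    for _ in range(trials):
        y = simulate(u)
        e = [rt - yt for rt, yt in zip(reference, y)]
        errors.append(max(abs(v) for v in e))
        u = [ut + gain * et for ut, et in zip(u, e)]  # P-type learning update
    return errors

reference = [math.sin(0.3 * t) for t in range(1, 11)]
slow = p_type_ilc(reference, gain=0.5, trials=100)
fast = p_type_ilc(reference, gain=1.0, trials=100)
print(slow[5], fast[5])
```

In this toy setting the tracking error contracts from trial to trial whenever |1 - gain*b| < 1, and gain = 1/b removes the error within as many trials as there are time steps.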

17.

An iterative adaptive dynamic programming method for solving a class of nonlinear zero-sum differential games (total citations: 7; self-citations: 0; citations by others: 7)

Huaguang Zhang Qinglai Wei Derong Liu 《Automatica》2011,(1):207-214

In this paper, a new iterative adaptive dynamic programming (ADP) method is proposed to solve a class of continuous-time nonlinear two-person zero-sum differential games. The idea is to use the ADP technique to obtain iteratively the optimal control pair, which makes the performance index function reach the saddle point of the zero-sum differential game. If the saddle point does not exist, the mixed optimal control pair is obtained to make the performance index function reach the mixed optimum. Stability analysis of the nonlinear systems is presented and the convergence property of the performance index function is also proved. Two simulation examples are given to illustrate the performance of the proposed method.

18.

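Data-based ADP optimal tracking control for a class of unknown discrete-time systems with actuator saturation

Ruizhuo Song Wendong Xiao Changyin Sun 《Acta Automatica Sinica》2013,39(9):1413-1420

A new optimal tracking control scheme is proposed for a class of discrete-time nonlinear systems with unknown dynamics and actuator saturation. The scheme is based on an iterative adaptive dynamic programming algorithm. To realize the optimal control, a data-based identifier of the unknown system dynamics is first established. By introducing an M network, an exact expression of the steady-state control is obtained. To eliminate the effect of actuator saturation, a non-quadratic performance index function is proposed. An iterative adaptive dynamic programming algorithm is then proposed to obtain the solution of the optimal tracking control, and a convergence analysis is given. To implement the optimal control scheme, neural networks are used to construct the data-based identifier, compute the performance index function, approximate the optimal control policy, and solve for the steady-state control. Simulation results verify the effectiveness of the proposed optimal tracking control method.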

19.

The finite-horizon optimal control for a class of time-delay affine nonlinear system (total citations: 1; self-citations: 1; citations by others: 0)

Ruizhuo Song Huaguang Zhang 《Neural computing & applications》2013,22(2):229-235

In this paper, a new iteration algorithm is proposed to solve the finite-horizon optimal control problem for a class of time-delay affine nonlinear systems with known system dynamics. First, we prove that the algorithm converges as the iteration step increases. Then, a theorem is presented to demonstrate that the limit of the iteration performance index function satisfies the discrete-time Hamilton–Jacobi–Bellman (DTHJB) equation, and the finite-horizon iteration algorithm is presented with a satisfactory accuracy error. Finally, two neural networks are used to approximate the iteration performance index function and the corresponding control policy. In the simulation part, an example is given to demonstrate the effectiveness of the proposed iteration algorithm.

20.

Optimal H∞ control of Markov jump systems based on parallel Kleinman iterative algorithms

Jun Song Shuping He 《Control and Decision》2016,31(3):559-563

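Within the framework of the Kleinman iterative algorithm, two numerical iterative algorithms are proposed for the optimal H∞ controller design problem of continuous-time Markov jump systems. First, a "direct parallel Kleinman iterative algorithm" is given, and its convergence is proved from the convergence of positive real operators. Then, based on the direct parallel Kleinman iterative algorithm, a more general iterative structure, the "generalized parallel Kleinman iterative algorithm", is proposed, and the four cases it contains are discussed. Finally, numerical examples verify the effectiveness of the proposed algorithms.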
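As background, the classical Kleinman iteration that these algorithms build on can be sketched for a single-mode scalar LQR problem, where it alternates between a Lyapunov equation for the current gain and a gain update. This is not the parallel Markov-jump H∞ version from the paper; the plant dx/dt = a*x + b*u, the cost weights, the initial gain, and the function name are assumptions for illustration.

```python
import math

def kleinman(a, b, q, r, k0, iters=10):
    """Scalar Kleinman iteration for the Riccati equation
    2*a*p - (b^2/r)*p^2 + q = 0, with control u = -k*x.

    k0 must be stabilizing, i.e. a - b*k0 < 0 (an assumption of the method).
    Returns the sequence of Lyapunov solutions p_0, p_1, ...
    """
    k, ps = k0, []
    for _ in range(iters):
        acl = a - b * k                      # closed-loop pole
        assert acl < 0.0, "gain must keep the closed loop stable"
        p = (q + r * k * k) / (-2.0 * acl)   # scalar Lyapunov equation
        ps.append(p)
        k = b * p / r                        # gain update
    return ps

ps = kleinman(a=1.0, b=1.0, q=1.0, r=1.0, k0=2.0)
p_star = 1.0 + math.sqrt(2.0)  # exact Riccati solution for these values
print(ps[:3], p_star)
```

Starting from a stabilizing gain, the iterates decrease monotonically toward the Riccati solution, which is the behaviour the parallel variants aim to preserve across the coupled modes of a jump system.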
