Similar literature
 20 similar documents found (search time: 15 ms)
1.
This article proposes three novel time-varying policy iteration algorithms for the finite-horizon optimal control problem of continuous-time affine nonlinear systems. We first propose a model-based time-varying policy iteration algorithm. The method considers time-varying solutions to the Hamilton–Jacobi–Bellman equation for finite-horizon optimal control. Based on this algorithm, value function approximation is applied to the Bellman equation by establishing neural networks with time-varying weights. A novel update law for the time-varying weights is put forward based on the idea of iterative learning control, which obtains optimal solutions more efficiently than previous works. Considering that system models may be unknown in real applications, we propose a partially model-free time-varying policy iteration algorithm that applies integral reinforcement learning to acquire the time-varying value function. Moreover, analysis of convergence, stability, and optimality is provided for every algorithm. Finally, simulations for different cases are given to verify the convenience and effectiveness of the proposed algorithms.

2.
In this paper, a new dual iterative adaptive dynamic programming (ADP) algorithm is developed to solve optimal control problems for a class of nonlinear systems with time delays in the state and control variables. The idea is to use dynamic programming theory to derive expressions for the optimal performance index function and the optimal control. The dual iterative ADP algorithm is then introduced to obtain the optimal solutions iteratively, where in each iteration both the performance index function and the system states are updated. A convergence analysis is presented to prove that the performance index function reaches the optimum under the proposed method. Neural networks are used to approximate the performance index function and compute the optimal control policy, respectively, to facilitate the implementation of the dual iterative ADP algorithm. Simulation examples are given to demonstrate the validity of the proposed optimal control scheme.

3.
The orthogonal function approach (OFA) and the hybrid Taguchi-genetic algorithm (HTGA) are used to solve quadratic finite-horizon optimal controller design problems for both a fuzzy parallel distributed compensation (PDC) controller and a non-PDC controller (linear state feedback controller) in Takagi–Sugeno (TS) fuzzy-model-based control systems for dynamic ship positioning systems (TS-DSPS). Based on the OFA, an algorithm requiring only algebraic computation is used to solve the dynamic equations for TS-fuzzy-model-based feedback and is then integrated with the HTGA to design quadratic finite-horizon optimal controllers for the TS-DSPS under the criterion of minimizing a quadratic finite-horizon integral performance index, which is also converted to algebraic form by the OFA. Integrating the OFA and the HTGA enables the use of simple algebraic computation and is well suited to computer implementation. It therefore facilitates the design of quadratic finite-horizon optimal controllers for the TS-DSPS. The applicability of the proposed approach is demonstrated on the example of a moored tanker designed using quadratic finite-horizon optimal controllers.

4.
This paper proposes a novel finite-time optimal control method based on input–output data for unknown nonlinear systems using an adaptive dynamic programming (ADP) algorithm. In this method, a single-hidden-layer feed-forward network (SLFN) with an extreme learning machine (ELM) is used to construct the data-based identifier of the unknown system dynamics. Based on the data-based identifier, the finite-time optimal control method is established by the ADP algorithm. Two other SLFNs with ELM are used in the ADP method to facilitate the implementation of the iterative algorithm; they approximate the performance index function and the optimal control law at each iteration, respectively. A simulation example is provided to demonstrate the effectiveness of the proposed control scheme.

5.
In this paper, a novel iterative adaptive dynamic programming (ADP) algorithm is developed to solve infinite-horizon optimal control problems for discrete-time nonlinear systems. When the iterative control law and the iterative performance index function cannot be obtained accurately in each iteration, it is shown that the iterative controls can make the performance index function converge to within a finite error bound of the optimal performance index function. Stability properties are presented to show that the system can be stabilized under the iterative control law, which makes the present iterative ADP algorithm feasible for implementation both on-line and off-line. Neural networks are used to approximate the iterative performance index function and compute the iterative control policy, respectively, to implement the iterative ADP algorithm. Finally, two simulation examples are given to illustrate the performance of the present method.

6.
In this paper, a novel value iteration adaptive dynamic programming (ADP) algorithm, called the “generalized value iteration ADP” algorithm, is developed to solve infinite-horizon optimal tracking control problems for a class of discrete-time nonlinear systems. The developed generalized value iteration ADP algorithm can be initialized by an arbitrary positive semi-definite function, which overcomes the disadvantage of traditional value iteration algorithms. A convergence property is established to guarantee that the iterative performance index function converges to the optimum. Neural networks are used to approximate the iterative performance index function and compute the iterative control policy, respectively, to implement the iterative ADP algorithm. Finally, a simulation example is given to illustrate the performance of the developed algorithm.
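
The idea that value iteration may start from an arbitrary positive semi-definite function can be illustrated on a much simpler problem than the one treated in the paper. The sketch below is not the paper's algorithm: it assumes a scalar linear system x' = a*x + b*u with stage cost q*x^2 + r*u^2 (a regulation problem, not tracking, and with no neural networks), so that each iterate has the form V_i(x) = p_i * x^2 and the initial function is fixed by any p0 >= 0. The function name and all parameters are hypothetical.

```python
def value_iteration_lq(a, b, q, r, p0, n_iter=50):
    """Scalar value iteration with V_i(x) = p_i * x**2.

    Assumed setting (illustrative only): x' = a*x + b*u,
    stage cost q*x**2 + r*u**2, initial function V_0(x) = p0 * x**2 with p0 >= 0.
    """
    p = p0
    history = [p]
    for _ in range(n_iter):
        # Minimising q*x^2 + r*u^2 + p*(a*x + b*u)^2 over u gives
        # u = -a*b*p*x / (r + b*b*p); substituting back yields the update below.
        p = q + (a * a * r * p) / (r + b * b * p)
        history.append(p)
    gain = a * b * p / (r + b * b * p)  # resulting feedback law: u = -gain * x
    return p, gain, history


# Two different positive semi-definite initialisations of the same problem.
p_from_zero, gain_from_zero, hist_up = value_iteration_lq(1.0, 1.0, 1.0, 1.0, p0=0.0)
p_from_large, gain_from_large, hist_down = value_iteration_lq(1.0, 1.0, 1.0, 1.0, p0=10.0)
```

In this toy case both runs approach the same fixed point of the update, one from below and one from above, which is the behaviour the abstract describes for a general initial function.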

7.
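To address the tendency of the particle swarm optimization (PSO) algorithm to become trapped in local optima in the later stage of the search, a PSO algorithm with a cosine-adjusted inertia weight (IWCPSO) is proposed. A cosine variation is introduced into the inertia weight during iteration, which remedies the weakness in the later iterations and improves the accuracy of the algorithm. In the MATLAB 2016 simulation environment, the algorithm is compared with the Ziegler-Nichols (ZN) formula method and with a PSO algorithm with a sine-adjusted inertia weight (SIPSO) on the optimization of PID control parameters. The comparison shows that the proposed algorithm gives better performance indices for the PID control system response and more accurate tuning results.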
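
A cosine-shaped inertia weight can be sketched as follows. The abstract does not give the exact formula used in IWCPSO, so the schedule below is an assumption: the weight falls from `w_max` to `w_min` along half a cosine period. The function names, the default parameter values, and the sphere test function are likewise hypothetical, and the PID tuning application from the abstract is not reproduced.

```python
import math
import random


def cosine_inertia(t, t_max, w_max=0.9, w_min=0.4):
    """Assumed schedule: w_max at t = 0, w_min at t = t_max, half-cosine in between."""
    return w_min + (w_max - w_min) * (1.0 + math.cos(math.pi * t / t_max)) / 2.0


def pso(cost, dim, bounds, n_particles=20, t_max=100, c1=1.5, c2=1.5, seed=0):
    """Basic global-best PSO that uses the cosine inertia schedule above."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_cost[i])
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]
    for t in range(t_max):
        w = cosine_inertia(t, t_max)  # large early (exploration), small late (refinement)
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp positions to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            c = cost(pos[i])
            if c < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], c
                if c < gbest_cost:
                    gbest, gbest_cost = pos[i][:], c
    return gbest, gbest_cost


def sphere(x):
    """Simple convex test function with its minimum at the origin."""
    return sum(v * v for v in x)


best, best_cost = pso(sphere, dim=2, bounds=(-5.0, 5.0), n_particles=30, t_max=200)
```

For PID tuning, `cost` would instead evaluate a closed-loop performance index for a candidate gain vector; that part depends on the plant model and is left out here.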

8.
In this paper, we present a novel parametric iterative learning control (ILC) algorithm to deal with trajectory tracking problems for a class of nonlinear autonomous agents that are subject to actuator faults. Unlike most of the ILC literature, the desired trajectories in this work can be iteration dependent, and the initial position of the agent in each iteration can be random. Both parametric and nonparametric system unknowns and uncertainties, in particular control input gain functions that are not fully known, are considered. A new type of universal barrier function is proposed to guarantee the satisfaction of asymmetric constraint requirements, feasibility of the controller, and prescribed tracking performance. We show that under the proposed algorithm, the distance and angle tracking errors converge uniformly to an arbitrarily small positive number and to zero, respectively, over the iteration domain, beyond a small user-prescribed initial time interval in each iteration. A numerical simulation is presented at the end to demonstrate the efficacy of the proposed algorithm.

9.
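For the optimal control problem with a quadratic performance index function for a class of nonlinear systems with time delays in both the state and control variables, this paper proposes an optimal control scheme based on a new iterative adaptive dynamic programming algorithm. By introducing a delay matrix function and applying dynamic programming theory, an explicit expression for the optimal control is obtained, and the optimal control is then computed using adaptive critic techniques. A convergence proof is given to guarantee that the performance index function converges to the optimum. To implement the proposed algorithm, neural networks are used to approximate the performance index function, compute the optimal control policy, solve for the delay matrix function, and model the nonlinear system. Finally, two simulation examples are given to demonstrate the effectiveness of the proposed optimal policy.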

10.
Cai Yuliang, Zhang Huaguang, Zhang Kun, Liu Chong. Neural Computing & Applications, 2020, 32(13): 8763-8781

In this paper, a novel online iterative scheme based on fuzzy adaptive dynamic programming is proposed for distributed optimal leader-following consensus of heterogeneous nonlinear multi-agent systems under a directed communication graph. The scheme combines game theory and adaptive dynamic programming with the generalized fuzzy hyperbolic model (GFHM). First, based on a precompensation technique, an appropriate model transformation is proposed to convert the error system into an augmented error system, and a carefully constructed performance index function is defined for this system. Second, on the basis of the Hamilton–Jacobi–Bellman (HJB) equation, the optimal consensus control is designed and a novel policy iteration (PI) algorithm is put forward to learn the solutions of the HJB equation online. Here, the proposed PI algorithm is implemented using GFHMs. Compared with the dual-network model comprising a critic network and an action network, the proposed scheme requires only a critic network. Third, the augmented consensus error of each agent and the weight estimation error of each GFHM are proved to be uniformly ultimately bounded, and the stability of the method is verified. Finally, numerical examples and application examples are presented to demonstrate the effectiveness of the theoretical results.


11.
In this paper, a fuzzy c-means clustering algorithm based on interval-valued weights is proposed to improve clustering performance. In the proposed algorithm, the interval-valued weights are first constructed by combining the ReliefF algorithm with the analytic hierarchy process (AHP) method, and they are then transformed into a constraint condition associated with each weight variable in the weighted clustering objective function. Subsequently, the weighted clustering objective function is solved by combining the Lagrange multiplier method with gradient-based iterative computation. Throughout the iteration, a compulsion strategy with human–computer cooperation is adopted to ensure that each weight variable satisfies its interval constraint. Three well-known data sets are used for extensive experiments. The experimental results clearly show that the proposed algorithm has better clustering performance than other weighted fuzzy c-means clustering algorithms.

12.
Model predictive control (MPC) is representative of control methods that can handle inequality constraints. In the presence of constraints of this type, closed-loop stability can in general be ensured only locally. However, if the system is neutrally stable and the constraints are imposed only on the input, global asymptotic stability can be obtained; until recently, the use of infinite horizons was thought to be inevitable in this case. A globally stabilizing finite-horizon MPC has lately been suggested for neutrally stable continuous-time systems using a non-quadratic terminal cost that consists of cubic as well as quadratic functions of the state. The idea originates from so-called small gain control, where global stability is proven using a non-quadratic Lyapunov function. The newly developed finite-horizon MPC employs the same form of Lyapunov function as the terminal cost, thereby leading to global asymptotic stability. A discrete-time version of this finite-horizon MPC is presented here. Furthermore, it is proved that the closed-loop system resulting from the proposed MPC is input-to-state stable (ISS), provided that the external disturbance is sufficiently small. The proposed MPC algorithm is also coded using a sequential quadratic programming (SQP) algorithm, and simulation results are given to show the effectiveness of the method.

13.
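This paper proposes a new variable step-size blind equalization algorithm controlled by a fuzzy neural network (FNN), in which the FNN controls the iteration step size of the blind equalization algorithm to achieve better equalization performance. The structure of the FNN controller is designed and its state equations are given, a new cost function is proposed, and the iterative formulas for the controller parameters are derived. Computer simulations show that, compared with the traditional constant modulus algorithm (CMA) for blind equalization, the proposed algorithm has better stability.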

14.
In this paper, we aim to solve the finite-horizon optimal control problem for a class of nonlinear discrete-time switched systems using an adaptive dynamic programming (ADP) algorithm. A new ε-optimal control scheme based on the iterative ADP algorithm is presented, which makes the value function converge iteratively, in finite time, to the greatest lower bound of all value function indices within an error bound determined by ε. Two neural networks are used as parametric structures to implement the iterative ADP algorithm with the ε-error bound; they approximate the value function and the control policy, respectively. The optimal control policy is then obtained. Finally, a simulation example is included to illustrate the applicability of the proposed method.

15.
This paper gives a self-contained presentation of minimax control for discrete-time time-varying stochastic systems under finite- and infinite-horizon expected total cost performance criteria. Suitable conditions for the existence of minimax strategies are proposed. We also prove that the values of the finite-horizon problem converge to the values of the infinite-horizon problem. Moreover, for finite-horizon problems, an algorithm for computing minimax strategies is developed and tested on time-varying stochastic systems.

16.
In this paper, a finite-horizon neuro-optimal tracking control strategy for a class of discrete-time nonlinear systems is proposed. Through system transformation, the optimal tracking problem is converted into designing a finite-horizon optimal regulator for the tracking error dynamics. Then, with convergence analysis in terms of the cost function and the control law, the iterative adaptive dynamic programming (ADP) algorithm via the heuristic dynamic programming (HDP) technique is introduced to obtain the finite-horizon optimal tracking controller, which brings the cost function close to its optimal value within an ε-error bound. Three neural networks are used as parametric structures to implement the algorithm; they approximate the cost function, the control law, and the error dynamics, respectively. Two simulation examples are included to complement the theoretical discussions.

17.
In this paper, the dissipative control problem is investigated for a class of discrete time-varying systems with the simultaneous presence of state saturations, randomly occurring nonlinearities, and multiple missing measurements. To make the system model more practically meaningful, Bernoulli distributed white sequences with known conditional probabilities are adopted to describe the randomly occurring nonlinearities and the multiple missing measurements. The purpose of the addressed problem is to design a time-varying output-feedback controller such that the dissipativity performance index is guaranteed over a given finite horizon. By introducing a free matrix with infinity norm less than or equal to 1, the system state is bounded by a convex hull so that sufficient conditions can be obtained in the form of recursive nonlinear matrix inequalities. A novel controller design algorithm is then developed to deal with the recursive nonlinear matrix inequalities. Furthermore, the obtained results are extended to the case where the state saturation is partial. Two numerical simulation examples are provided to demonstrate the effectiveness and applicability of the proposed controller design approach.

18.
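Data-based adaptive critic optimal control for zero-sum games of discrete 2-D systems   Total citations: 1 (self-citations: 1, citations by others: 0)
An iterative adaptive critic design (ACD) algorithm is proposed to solve the two-player zero-sum game problem for a class of discrete-time Roesser-type 2-D systems. The main idea is to use adaptive critic techniques to iteratively obtain the optimal control pair that makes the performance index function reach the saddle point of the zero-sum game. The proposed ACD can be implemented from input-output data without requiring a model of the system. To implement the iterative ACD algorithm, neural networks are used to approximate the performance index function and to compute the optimal control law, respectively. Finally, the optimal control policy is applied to the control of an air drying process to demonstrate its effectiveness.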

19.
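An intelligent algorithm based on discounted generalized value iteration is designed to solve the optimal tracking control problem for a class of complex nonlinear systems. By choosing a suitable initial value, the cost function in the value iteration process converges to the optimal cost function in a monotonically decreasing manner. Based on the monotonically decreasing value iteration algorithm, the admissibility of the iterative tracking control law and the asymptotic stability of the error system are discussed under different discount factors. To facilitate implementation of the algorithm, a data-driven model network is built to...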

20.
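In optimal control problems with unknown system model parameters, whether policy iteration converges quickly to the optimal control policy depends critically on the estimation of the value function. To improve the estimation accuracy and convergence speed of the value function, this paper proposes a policy iteration optimal control algorithm with an adaptively adjusted window length. Historical sample data over a period of time are fully exploited, an influence function is used to establish a quantitative relationship between the window length and the value function estimation performance, and the window length is adjusted adaptively according to how strongly different data window lengths influence the estimation performance. Finally, the proposed method is applied to a continuous fermentation process. The results show that it accelerates convergence to the optimal control policy and overcomes the effect of parameter variations or external disturbances on control performance, thereby improving control accuracy.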
