Similar Articles
20 similar articles retrieved.
1.
A second-order algorithm is presented for the solution of continuous-time nonlinear optimal control problems. The algorithm is an adaptation of the trust region modifications of Newton's method and solves at each iteration a linear-quadratic control problem with an additional constraint. Under some assumptions, the proposed algorithm is shown to possess a global convergence property. A numerical example is presented to illustrate the method.

2.
In this paper, we formulate a numerical method to approximate the solution of a two-dimensional optimal control problem constrained by a fractional parabolic partial differential equation (PDE) of Caputo type. First, the optimality conditions of the optimal control problem are derived. Then, the spatial and time derivative terms in the optimality conditions are discretized using shifted discrete Legendre polynomials and the collocation method. The main idea is to reduce the optimality conditions to a system of algebraic equations. The chief advantage of this type of discretization is that the numerical solution is obtained directly and globally by solving one algebraic system rather than through a step-by-step process, which avoids the accumulation and propagation of error. Several examples are tested, and the numerical results show good agreement between exact and approximate solutions.

3.
《Automatica》2014,50(12):3281-3290
This paper addresses the model-free nonlinear optimal control problem based on data by introducing the reinforcement learning (RL) technique. It is known that the nonlinear optimal control problem relies on the solution of the Hamilton–Jacobi–Bellman (HJB) equation, which is a nonlinear partial differential equation that is generally impossible to solve analytically. Even worse, most practical systems are too complicated for an accurate mathematical model to be established. To overcome these difficulties, we propose a data-based approximate policy iteration (API) method that uses real system data rather than a system model. First, a model-free policy iteration algorithm is derived and its convergence is proved. The implementation of the algorithm is based on the actor–critic structure, where actor and critic neural networks (NNs) are employed to approximate the control policy and cost function, respectively. To update the weights of the actor and critic NNs, a least-squares approach is developed based on the method of weighted residuals. The data-based API is an off-policy RL method, where the “exploration” is improved by arbitrarily sampling data on the state and input domain. Finally, we test the data-based API control design method on a simple nonlinear system, and further apply it to a rotational/translational actuator system. The simulation results demonstrate the effectiveness of the proposed method.
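As a rough illustration of the policy-evaluation step that data-based methods of this kind share, the following is a minimal sketch, not the cited paper's algorithm: critic weights of a polynomial value-function approximator are fitted by least squares from sampled transitions of a toy scalar system. The dynamics, basis functions, and cost below are illustrative assumptions; the actor network, exploration scheme, and convergence machinery of the paper are omitted.

```python
# Minimal sketch of a data-based critic update in approximate policy iteration.
# Toy system x_dot = -x^3 + u, running cost r(x,u) = x^2 + u^2, sampled at step dt.
# This only illustrates the least-squares fit of V(x) ~ w^T phi(x) from data.
import numpy as np

dt = 0.01
phi = lambda x: np.array([x**2, x**4])        # critic basis functions (assumed)
r = lambda x, u: x**2 + u**2                  # running cost (assumed)

def collect_data(policy, n=2000, seed=0):
    rng = np.random.default_rng(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(-2, 2)                # arbitrary sampling of the state domain
        u = policy(x)
        x_next = x + (-x**3 + u) * dt         # one Euler step of the toy dynamics
        data.append((x, u, x_next))
    return data

def evaluate_policy(data):
    """Least-squares fit of critic weights from V(x) - V(x') ~ r(x, u) * dt."""
    A = np.array([phi(x) - phi(xn) for x, _, xn in data])
    b = np.array([r(x, u) * dt for x, u, _ in data])
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    return w

w = evaluate_policy(collect_data(lambda x: -x))   # evaluate the policy u = -x
print(w)                                          # approximate critic weights
```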

4.
This paper extends the application of shifted Legendre polynomial expansion to time-varying systems. The extension is achieved by representing the product of two shifted Legendre series as a new shifted Legendre series. With this treatment of the product of two time functions, the operational properties of the shifted Legendre polynomials can be fully applied to the analysis and optimal control of time-varying linear systems with a quadratic performance index.
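A small sketch of the core operation, assuming the usual convention P*_n(t) = P_n(2t - 1) on [0, 1]: because the shifted polynomials differ from the standard ones only by a change of variable, the product coefficients coincide in both bases, so numpy's Legendre arithmetic can be reused directly. The coefficient vectors below are illustrative.

```python
# Expressing the product of two shifted Legendre series as a new shifted Legendre series.
import numpy as np
from numpy.polynomial import legendre as L

a = np.array([1.0, 0.5, 0.2])       # f(t) = 1*P0* + 0.5*P1* + 0.2*P2*
b = np.array([0.3, -0.1])           # g(t) = 0.3*P0* - 0.1*P1*

c = L.legmul(a, b)                  # coefficients of f(t)*g(t) in the same basis

# spot check at t = 0.37 by mapping to x = 2t - 1 on [-1, 1]
t = 0.37
x = 2 * t - 1
assert np.isclose(L.legval(x, a) * L.legval(x, b), L.legval(x, c))
print(c)
```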

5.
An online adaptive optimal control scheme is proposed for continuous-time nonlinear systems with completely unknown dynamics, achieved by developing a novel identifier-critic-based approximate dynamic programming algorithm with a dual neural network (NN) approximation structure. First, an adaptive NN identifier is designed to obviate the requirement of complete knowledge of the system dynamics, and a critic NN is employed to approximate the optimal value function. Then, the optimal control law is computed from the information of the identifier NN and the critic NN, so that an actor NN is not needed. In particular, a novel adaptive law design method using the parameter estimation error is proposed to update the weights of both the identifier NN and the critic NN online and simultaneously; these weights converge to small neighbourhoods around their ideal values. The closed-loop system stability and the convergence to a small vicinity around the optimal solution are both proved by means of Lyapunov theory. The proposed adaptation algorithm is further improved to achieve finite-time convergence of the NN weights. Finally, simulation results are provided to exemplify the efficacy of the proposed methods.

6.
Legendre Pseudospectral Method for Solving Optimal Control Problems and Its Applications
Pseudospectral methods parameterize the state and control variables with global interpolating polynomials and thereby transcribe the optimal control problem (OCP) into a nonlinear programming problem (NLP); they form a class of direct methods with comparatively high solution efficiency. This paper summarizes the basic framework by which the Legendre pseudospectral method transcribes Bolza-type optimal control problems, derives the mapping between the costate variables of the OCP and the KKT multipliers of the NLP, and establishes a numerical scheme based on Legendre–Gauss–Lobatto (LGL) collocation points solved with a quasi-Newton method. For non-smooth systems, a piecewise pseudospectral approximation strategy is further investigated. A general-purpose OCP solver is developed on this basis and applied to three typical optimal control problems; the results demonstrate the effectiveness of the proposed method and solver.
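For reference, a generic sketch of two ingredients an LGL-based pseudospectral transcription needs, the Legendre–Gauss–Lobatto nodes and the associated differentiation matrix D, with which the state derivative at node k is approximated by sum_j D[k, j] x(t_j). This is a standard construction, not the solver described above; the degree N below is illustrative.

```python
# Legendre-Gauss-Lobatto nodes and differentiation matrix on [-1, 1].
import numpy as np
from numpy.polynomial import legendre as L

def lgl_nodes_and_diff(N):
    """Return the N+1 LGL nodes and the (N+1)x(N+1) differentiation matrix."""
    cN = np.zeros(N + 1); cN[N] = 1.0             # coefficients of P_N
    interior = np.sort(L.legroots(L.legder(cN)))  # roots of P_N'
    x = np.concatenate(([-1.0], interior, [1.0]))
    LN = L.legval(x, cN)                          # P_N evaluated at the nodes
    D = np.zeros((N + 1, N + 1))
    for k in range(N + 1):
        for j in range(N + 1):
            if k != j:
                D[k, j] = LN[k] / (LN[j] * (x[k] - x[j]))
    D[0, 0] = -N * (N + 1) / 4.0
    D[N, N] = N * (N + 1) / 4.0
    return x, D

# quick check: D applied to samples of t^2 reproduces 2t at the nodes
x, D = lgl_nodes_and_diff(8)
assert np.allclose(D @ x**2, 2 * x)
print(x)
```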

7.
This paper develops an online algorithm based on policy iteration for optimal control with infinite horizon cost for continuous-time nonlinear systems. In the present method, a discounted value function is employed, which is considered to be a more general case for optimal control problems. Meanwhile, without knowledge of the internal system dynamics, the algorithm can converge uniformly online to the optimal control, which is the solution of the modified Hamilton–Jacobi–Bellman equation. By means of two neural networks, the algorithm is able to find suitable approximations of both the optimal control and the optimal cost. The uniform convergence to the optimal control is shown, guaranteeing the stability of the nonlinear system. A simulation example is provided to illustrate the effectiveness and applicability of the present approach.

8.
A method is proposed to determine the optimal feedback control law for a class of nonlinear optimal control problems. The method consists of two steps. The first step is to determine the open-loop optimal control and trajectories by using quasilinearization and state-variable parametrization via Chebyshev polynomials of the first kind. The nonlinear optimal control problem is thereby replaced by a sequence of small quadratic programming problems that can easily be solved. The second step is to use the results of the last quasilinearization iteration, once an acceptable convergence error is achieved, to obtain the optimal feedback control law. To this end, the matrix Riccati equation and a further set of n linear differential equations are solved using Chebyshev polynomials of the first kind. Moreover, the differentiation operational matrix of Chebyshev polynomials is introduced. To show the effectiveness of the proposed method, simulation results for a nonlinear optimal control problem are presented.
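A short sketch of one building block mentioned above, a differentiation operational matrix for Chebyshev polynomials of the first kind (if f = sum_n c_n T_n, then f' = sum_n (D c)_n T_n), built column by column with numpy's chebder; the quasilinearization and Riccati machinery of the method is not reproduced here, and the test polynomial is illustrative.

```python
# Differentiation operational matrix for Chebyshev polynomials of the first kind.
import numpy as np
from numpy.polynomial import chebyshev as C

def cheb_diff_matrix(N):
    """(N+1)x(N+1) matrix mapping Chebyshev coefficients to those of the derivative."""
    D = np.zeros((N + 1, N + 1))
    for j in range(N + 1):
        e = np.zeros(N + 1); e[j] = 1.0          # coefficients of T_j
        d = C.chebder(e)                          # coefficients of T_j'
        D[:len(d), j] = d
    return D

# check: derivative of f(t) = T_0 + 2*T_2 is 8*T_1, i.e. f'(t) = 8t
D = cheb_diff_matrix(3)
c = np.array([1.0, 0.0, 2.0, 0.0])
assert np.isclose(C.chebval(0.3, D @ c), 8 * 0.3)
print(D)
```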

9.
This article proposes three novel time-varying policy iteration algorithms for the finite-horizon optimal control problem of continuous-time affine nonlinear systems. We first propose a model-based time-varying policy iteration algorithm. The method considers time-varying solutions of the Hamilton–Jacobi–Bellman equation for finite-horizon optimal control. Based on this algorithm, value function approximation is applied to the Bellman equation by establishing neural networks with time-varying weights. A novel update law for the time-varying weights is put forward based on the idea of iterative learning control, which obtains optimal solutions more efficiently than previous works. Considering that system models may be unknown in real applications, we propose a partially model-free time-varying policy iteration algorithm that applies integral reinforcement learning to acquire the time-varying value function. Moreover, analysis of convergence, stability, and optimality is provided for every algorithm. Finally, simulations for different cases are given to verify the convenience and effectiveness of the proposed algorithms.

10.
This contribution presents a new approach for the numeric computation of the input-output linearizing feedback law, which is obtained exactly in an analytical form. By using a state space embedding technique the nonlinear system to be controlled is described by a higher order system with solely polynomial nonlinearities. Consequently, the nonlinearities of this system can be represented by multivariable Legendre polynomials, so that the derivation of the input-output linearizing feedback controller can be accomplished using the operational matrices of multiplication and of differentiation for Legendre polynomials.

11.
The transformation into discrete-time equivalents of digital optimal control problems, involving continuous-time linear systems with white stochastic parameters, and quadratic integral criteria, is considered. The system parameters have time-varying statistics. The observations available at the sampling instants are in general nonlinear and corrupted by discrete-time noise. The equivalent discrete-time system has white stochastic parameters. Expressions are derived for the first and second moment of these parameters and for the parameters of the equivalent discrete-time sum criterion, which are explicit in the parameters and statistics of the original digital optimal control problem. A numerical algorithm to compute these expressions is presented. For each sampling interval, the algorithm computes the expressions recursively, forward in time, using successive equidistant evaluations of the matrices which determine the original digital optimal control problem. The algorithm is illustrated with three examples. If the observations at the sampling instants are linear and corrupted by multiplicative and/or additive discrete-time white noise, then, using recent results, full and reduced-order controllers that solve the equivalent discrete-time optimal control problem can be computed.
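A compact sketch of the deterministic, time-invariant special case of this kind of transformation: a continuous-time LTI system with zero-order-hold input and the state part of an integral quadratic criterion are converted into discrete-time equivalents over one sampling interval using block matrix exponentials. The white stochastic parameters, time-varying statistics, and measurement model of the paper are not modelled; the toy system is illustrative.

```python
# Discrete-time equivalents of a continuous-time LTI system and state-cost weight.
import numpy as np
from scipy.linalg import expm

def discrete_equivalents(A, B, Q, h):
    n, m = A.shape[0], B.shape[1]
    # zero-order-hold system matrices: expm([[A, B],[0, 0]]*h) = [[Ad, Bd],[0, I]]
    M = np.zeros((n + m, n + m))
    M[:n, :n], M[:n, n:] = A, B
    E = expm(M * h)
    Ad, Bd = E[:n, :n], E[:n, n:]
    # equivalent state-cost weight: Qd = int_0^h expm(A' s) @ Q @ expm(A s) ds
    H = np.zeros((2 * n, 2 * n))
    H[:n, :n], H[:n, n:], H[n:, n:] = -A.T, Q, A
    F = expm(H * h)
    Qd = F[n:, n:].T @ F[:n, n:]
    return Ad, Bd, Qd

A = np.array([[0.0, 1.0], [0.0, 0.0]])   # double integrator as a toy example
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
print(discrete_equivalents(A, B, Q, 0.1))
```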

12.
In this paper we discuss an online algorithm based on policy iteration for learning the continuous-time (CT) optimal control solution with infinite-horizon cost for nonlinear systems with known dynamics. That is, the algorithm learns online, in real time, the solution of the Hamilton–Jacobi (HJ) equation underlying the optimal control design. The method finds in real time suitable approximations of both the optimal cost and the optimal control policy, while also guaranteeing closed-loop stability. We present an online adaptive algorithm implemented as an actor/critic structure that involves simultaneous continuous-time adaptation of both actor and critic neural networks. We call this ‘synchronous’ policy iteration. A persistence of excitation condition is shown to guarantee convergence of the critic to the actual optimal value function. Novel tuning algorithms are given for both critic and actor networks, with extra nonstandard terms in the actor tuning law being required to guarantee closed-loop dynamical stability. The convergence to the optimal controller is proven, and the stability of the system is also guaranteed. Simulation examples show the effectiveness of the new algorithm.

13.
Mazen Alamir 《Automatica》2006,42(9):1593-1598
In this paper, a benchmark problem is proposed in order to assess comparisons between different optimal control problem solvers for hybrid nonlinear systems. The model is nonlinear with 20 states, 4 continuous controls, 1 discrete binary control and 4 configurations. Transitions between configurations lead to state jumps. The system is inspired by the simulated moving bed, a counter-current separation process.

14.
《Automatica》2014,50(12):2987-2997
This paper focuses on a non-standard constrained nonlinear optimal control problem in which the objective functional involves an integration over a space of stochastic parameters as well as an integration over the time domain. The research is inspired by the problem of optimizing the trajectories of multiple searchers attempting to detect non-evading moving targets. In this paper, we propose a framework based on the approximation of the integral in the parameter space for the considered uncertain optimal control problem. The framework is proved to produce a zeroth-order consistent approximation in the sense that accumulation points of a sequence of optimal solutions to the approximate problem are optimal solutions of the original problem. In addition, we demonstrate the convergence of the corresponding adjoint variables. The accumulation points of a sequence of optimal state-adjoint pairs for the approximate problem satisfy a necessary condition of Pontryagin Minimum Principle type, which facilitates assessment of the optimality of numerical solutions.
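A toy sketch of the basic approximation idea, not the paper's framework or convergence theory: the integral over the stochastic-parameter space in the objective is replaced by a finite sum over parameter samples, turning the uncertain optimal control problem into an ordinary nonlinear program. The dynamics, cost, parameter distribution, and sample size below are illustrative assumptions.

```python
# Sample-average approximation of an objective averaged over an uncertain parameter.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
omegas = rng.normal(1.0, 0.2, size=32)       # samples of the uncertain parameter

def objective(u, omega, dt=0.1):
    """Cost of an open-loop control sequence u for x_dot = -omega*x + u, x0 = 1."""
    x, J = 1.0, 0.0
    for uk in u:
        J += (x**2 + 0.1 * uk**2) * dt
        x += (-omega * x + uk) * dt          # explicit Euler step
    return J

def averaged_objective(u):
    return np.mean([objective(u, w) for w in omegas])

u0 = np.zeros(20)
res = minimize(averaged_objective, u0, method="L-BFGS-B")
print(res.fun)
```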

15.
This paper deals with a numerical method for solving a variable-order fractional optimal control problem with a fractional Bolza cost composed of a standard Mayer cost and a fractional Lagrange cost given by a variable-order Riemann–Liouville fractional integral. Using the integration-by-parts formula and the calculus of variations, the necessary optimality conditions are derived in the form of a two-point variable-order boundary value problem. Operational matrices of variable-order right and left Riemann–Liouville integration are derived and used to reduce the two-point boundary value problem to a system of algebraic equations. Additionally, the convergence analysis of the proposed method is considered. Moreover, illustrative examples are given to demonstrate the applicability of the proposed method.

16.
A Rotational Surface Transformation PSO Algorithm for Nonlinear Optimal Control Problems
When particle swarm optimization is used to optimize functions with multiple extrema, it tends to become trapped in local minima and to search inefficiently. To address this, a rotational surface transformation is proposed that maps the function being optimized onto a homeomorphic surface: the current local minimum is transformed into the global maximum while the shape of the function below the value of the current local minimum is kept unchanged, so that the algorithm escapes the local minimum. Finally, the method is applied to the optimal control problem of a nonlinear system; experimental results demonstrate its feasibility and effectiveness.

17.
A symplecticity-preserving numerical method for solving optimal control problems of nonlinear dynamical systems is proposed on the basis of the variational principle in dual (costate) variables. Taking the state at one end of each time interval and the costate at the other end as mixed independent variables, the state and costate are approximated within the interval by Lagrange interpolation; the dual-variable variational principle then converts the nonlinear optimal control problem into the solution of a system of nonlinear algebraic equations, yielding a symplecticity-preserving numerical method for optimal control of nonlinear dynamical systems. Numerical experiments verify the accuracy and efficiency of the proposed algorithm.

18.
The theory of nonlinear H∞ optimal control for affine nonlinear systems is extended to the more general context of singular H∞ optimal control of nonlinear systems using ideas from the linear H∞ theory. Our approach yields, under certain assumptions, a necessary and sufficient condition for solvability of the state feedback singular H∞ control problem. The resulting state feedback is then used to construct a dynamic compensator solving the nonlinear output feedback H∞ control problem by applying the certainty equivalence principle.

19.
The synthesis of an optimal control function for deterministic systems described by integrodifferential equations is investigated. By using the elegant operational properties of shifted Legendre polynomials, a direct computational algorithm for evaluating the optimal control and trajectory of deterministic systems is developed. An example is given to illustrate the utility of this method.
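A brief sketch, under the usual convention P*_n(t) = P_n(2t - 1) on [0, 1], of the operational property such methods exploit: integration of a truncated shifted Legendre series is a matrix operation on its coefficient vector. This is the standard construction rather than the cited algorithm, and the truncation order and test function are illustrative.

```python
# Integration operational matrix for shifted Legendre polynomials on [0, 1].
import numpy as np
from numpy.polynomial import legendre as L

def shifted_legendre_integration_matrix(N):
    """P such that int_0^t f(s) ds has coefficients P @ c when f has coefficients c."""
    P = np.zeros((N + 1, N + 1))
    P[0, 0] = P[1, 0] = 0.5                      # int_0^t P*_0 = (P*_0 + P*_1)/2
    for n in range(1, N + 1):
        # int_0^t P*_n = (P*_{n+1} - P*_{n-1}) / (2(2n+1)); the P*_{N+1} term is truncated
        P[n - 1, n] = -1.0 / (2 * (2 * n + 1))
        if n + 1 <= N:
            P[n + 1, n] = 1.0 / (2 * (2 * n + 1))
    return P

# spot check: f(t) = P*_1(t) = 2t - 1 has exact integral t^2 - t
N = 5
Pmat = shifted_legendre_integration_matrix(N)
c = np.zeros(N + 1); c[1] = 1.0
t = 0.6
assert np.isclose(L.legval(2 * t - 1, Pmat @ c), t**2 - t)
print(Pmat)
```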

20.
In this study, to solve fractional problems with non-smooth solutions (which include terms in the form of piecewise or fractional powers), a new class of basis functions called the orthonormal piecewise fractional Legendre functions is introduced. An upper bound on the error of the series expansion in these functions is obtained. Two explicit formulas for computing the Riemann–Liouville and Atangana–Baleanu fractional integrals of these functions are derived. A direct method based on these functions and their fractional integrals is proposed to solve a family of optimal control problems involving ABC fractional differentiation whose solutions are non-smooth in the forms expressed above. With the proposed technique, solving the original fractional problem reduces to solving an equivalent system of algebraic equations. The accuracy of the established method is studied by solving some examples.
