20 similar documents found; search took 15 ms.
1.
A second-order algorithm is presented for the solution of continuous-time nonlinear optimal control problems. The algorithm is an adaptation of the trust region modifications of Newton's method and solves at each iteration a linear-quadratic control problem with an additional constraint. Under some assumptions, the proposed algorithm is shown to possess a global convergence property. A numerical example is presented to illustrate the method.
2.
《Automatica》2014,50(12):3281-3290
This paper addresses the model-free nonlinear optimal control problem based on data by introducing the reinforcement learning (RL) technique. It is known that the nonlinear optimal control problem relies on the solution of the Hamilton–Jacobi–Bellman (HJB) equation, a nonlinear partial differential equation that is generally impossible to solve analytically. Even worse, most practical systems are too complicated to establish an accurate mathematical model. To overcome these difficulties, we propose a data-based approximate policy iteration (API) method that uses real system data rather than a system model. First, a model-free policy iteration algorithm is derived and its convergence is proved. The implementation of the algorithm is based on the actor–critic structure, where actor and critic neural networks (NNs) are employed to approximate the control policy and cost function, respectively. To update the weights of the actor and critic NNs, a least-squares approach is developed based on the method of weighted residuals. The data-based API is an off-policy RL method, where the “exploration” is improved by arbitrarily sampling data on the state and input domain. Finally, we test the data-based API control design method on a simple nonlinear system, and further apply it to a rotational/translational actuator system. The simulation results demonstrate the effectiveness of the proposed method.
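As a toy illustration of the least-squares weight-update step described above (not the paper's full API algorithm), the following sketch fits critic weights to sampled state data; the quadratic value function, basis, and sampling scheme are all our own assumptions:

```python
import numpy as np

# Minimal sketch of a least-squares critic-weight fit. A known quadratic value
# function V(x) = x'Px stands in for the policy-evaluation targets; in the
# actual API algorithm the targets come from Bellman residuals, not from P.
rng = np.random.default_rng(0)

P = np.array([[2.0, 0.5],
              [0.5, 1.0]])          # hypothetical value function V(x) = x'Px

def phi(x):
    # quadratic basis: exact for a quadratic value function
    return np.array([x[0]**2, x[0]*x[1], x[1]**2])

X = rng.uniform(-1, 1, size=(200, 2))        # arbitrarily sampled states
targets = np.einsum('ni,ij,nj->n', X, P, X)  # V(x_i) for each sample
Phi = np.array([phi(x) for x in X])          # 200 x 3 regression matrix

# the method of weighted residuals reduces here to ordinary least squares
w, *_ = np.linalg.lstsq(Phi, targets, rcond=None)
print(w)   # recovers [P11, 2*P12, P22]
```

Since the basis exactly represents V, the fit recovers the weights [2, 1, 1] up to numerical precision.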
3.
This paper extends the application of shifted Legendre polynomial expansion to time-varying systems. The extension is achieved through representing the product of two shifted Legendre series in a new shifted Legendre series. With this treatment of the product of two time functions, the operational properties of the shifted Legendre polynomials are fully applied to the analysis and optimal control of time-varying linear systems with quadratic performance index.
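The coefficient-level product operation described above can be sketched as follows. The example series (for f(t) = g(t) = t on [0, 1]) is ours; we rely on the fact that the shifted basis P*_n(t) = P_n(2t − 1) is an affine change of variable, so its coefficients multiply by the same rule as the ordinary Legendre basis:

```python
import numpy as np
from numpy.polynomial import legendre as L

# Product of two shifted Legendre series is again a shifted Legendre series.
# Shifted expansion of t on [0,1]: t = 0.5*P0*(t) + 0.5*P1*(t), since
# P1*(t) = 2t - 1.
f = [0.5, 0.5]
g = [0.5, 0.5]

fg = L.legmul(f, g)   # coefficients of f*g in the (shifted) Legendre basis
print(fg)             # t^2 = (1/3)*P0* + (1/2)*P1* + (1/6)*P2*
```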
4.
An online adaptive optimal control is proposed for continuous-time nonlinear systems with completely unknown dynamics, which is achieved by developing a novel identifier-critic-based approximate dynamic programming algorithm with a dual neural network (NN) approximation structure. First, an adaptive NN identifier is designed to obviate the requirement of complete knowledge of system dynamics, and a critic NN is employed to approximate the optimal value function. Then, the optimal control law is computed based on the information from the identifier NN and the critic NN, so that the actor NN is not needed. In particular, a novel adaptive law design method with the parameter estimation error is proposed to online update the weights of both identifier NN and critic NN simultaneously, which converge to small neighbourhoods around their ideal values. The closed-loop system stability and the convergence to small vicinity around the optimal solution are all proved by means of the Lyapunov theory. The proposed adaptation algorithm is also improved to achieve finite-time convergence of the NN weights. Finally, simulation results are provided to exemplify the efficacy of the proposed methods.
5.
This paper develops an online algorithm based on policy iteration for optimal control with infinite horizon cost for continuous-time nonlinear systems. In the present method, a discounted value function is employed, which is considered to be a more general case for optimal control problems. Meanwhile, without knowledge of the internal system dynamics, the algorithm can converge uniformly online to the optimal control, which is the solution of the modified Hamilton–Jacobi–Bellman equation. By means of two neural networks, the algorithm is able to find suitable approximations of both the optimal control and the optimal cost. The uniform convergence to the optimal control is shown, guaranteeing the stability of the nonlinear system. A simulation example is provided to illustrate the effectiveness and applicability of the present approach.
6.
A method is proposed to determine the optimal feedback control law of a class of nonlinear optimal control problems. The method is based on two steps. The first step is to determine the open-loop optimal control and trajectories by using quasilinearization and state-variable parametrization via Chebyshev polynomials of the first kind. The nonlinear optimal control problem is thereby replaced by a sequence of small quadratic programming problems which can easily be solved. The second step is to use the results of the last quasilinearization iteration, once an acceptable convergence error is achieved, to obtain the optimal feedback control law. To this end, the matrix Riccati equation and another n linear differential equations are solved using Chebyshev polynomials of the first kind. Moreover, the differentiation operational matrix of the Chebyshev polynomials is introduced. To show the effectiveness of the proposed method, simulation results of a nonlinear optimal control problem are shown.
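A minimal sketch of the differentiation operational matrix mentioned above, assuming coefficients in the basis T_0, …, T_{n−1}; the construction via NumPy's `chebder` is our illustration, not the paper's implementation:

```python
import numpy as np
from numpy.polynomial import chebyshev as C

# Differentiation operational matrix D for Chebyshev polynomials of the first
# kind: if c holds the coefficients of y(t) in T_0..T_{n-1}, then D @ c holds
# the coefficients of y'(t). Built column by column by differentiating each
# basis polynomial.
def cheb_diff_matrix(n):
    D = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = 1.0
        d = C.chebder(e)        # coefficients of d/dt T_j
        D[:len(d), j] = d
    return D

D = cheb_diff_matrix(4)
# check against a known identity: d/dt T_2(t) = 4t = 4*T_1(t)
c = np.zeros(4); c[2] = 1.0
print(D @ c)    # -> [0, 4, 0, 0]
```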
7.
Joachim Deutscher 《Automatica》2005,41(2):299-304
This contribution presents a new approach for the numeric computation of the input-output linearizing feedback law, which is obtained exactly in an analytical form. By using a state space embedding technique the nonlinear system to be controlled is described by a higher order system with solely polynomial nonlinearities. Consequently, the nonlinearities of this system can be represented by multivariable Legendre polynomials, so that the derivation of the input-output linearizing feedback controller can be accomplished using the operational matrices of multiplication and of differentiation for Legendre polynomials.
8.
The transformation into discrete-time equivalents of digital optimal control problems, involving continuous-time linear systems with white stochastic parameters, and quadratic integral criteria, is considered. The system parameters have time-varying statistics. The observations available at the sampling instants are in general nonlinear and corrupted by discrete-time noise. The equivalent discrete-time system has white stochastic parameters. Expressions are derived for the first and second moment of these parameters and for the parameters of the equivalent discrete-time sum criterion, which are explicit in the parameters and statistics of the original digital optimal control problem. A numerical algorithm to compute these expressions is presented. For each sampling interval, the algorithm computes the expressions recursively, forward in time, using successive equidistant evaluations of the matrices which determine the original digital optimal control problem. The algorithm is illustrated with three examples. If the observations at the sampling instants are linear and corrupted by multiplicative and/or additive discrete-time white noise, then, using recent results, full and reduced-order controllers that solve the equivalent discrete-time optimal control problem can be computed.
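One standard building block behind such transformations, sketched here under the simplifying assumption of deterministic (mean) dynamics only, is the exact discrete-time equivalent of a sampled linear system; the stochastic-parameter case in the abstract additionally propagates second moments:

```python
import numpy as np

# Exact discrete-time equivalent (Phi, Gamma) of dx/dt = A x + B u over a
# sampling interval T, via the augmented matrix exponential identity
#   expm([[A, B], [0, 0]] * T) = [[Phi, Gamma], [0, I]].
def expm_taylor(M, terms=30):
    # plain Taylor-series matrix exponential; adequate for small, scaled M
    E = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        E = E + term
    return E

def c2d(A, B, T):
    n, m = B.shape
    M = np.zeros((n + m, n + m))
    M[:n, :n] = A
    M[:n, n:] = B
    E = expm_taylor(M * T)
    return E[:n, :n], E[:n, n:]     # Phi, Gamma

# double integrator: closed form is Phi = [[1, T], [0, 1]], Gamma = [T^2/2, T]
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Phi, Gamma = c2d(A, B, 0.1)
print(Phi, Gamma)
```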
9.
Mazen Alamir 《Automatica》2006,42(9):1593-1598
In this paper, a benchmark problem is proposed in order to assess comparisons between different optimal control problem solvers for hybrid nonlinear systems. The model is nonlinear with 20 states, 4 continuous controls, 1 discrete binary control and 4 configurations. Transitions between configurations lead to state jumps. The system is inspired by the simulated moving bed, a counter-current separation process.
10.
Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem (cited 5 times: 0 self-citations, 5 by others)
In this paper we discuss an online algorithm based on policy iteration for learning the continuous-time (CT) optimal control solution with infinite horizon cost for nonlinear systems with known dynamics. That is, the algorithm learns online, in real time, the solution of the Hamilton–Jacobi (HJ) design equation of optimal control. This method finds in real time suitable approximations of both the optimal cost and the optimal control policy, while also guaranteeing closed-loop stability. We present an online adaptive algorithm implemented as an actor/critic structure which involves simultaneous continuous-time adaptation of both actor and critic neural networks. We call this ‘synchronous’ policy iteration. A persistence of excitation condition is shown to guarantee convergence of the critic to the actual optimal value function. Novel tuning algorithms are given for both critic and actor networks, with extra nonstandard terms in the actor tuning law being required to guarantee closed-loop dynamical stability. The convergence to the optimal controller is proven, and the stability of the system is also guaranteed. Simulation examples show the effectiveness of the new algorithm.
11.
The theory of nonlinear H∞ optimal control for affine nonlinear systems is extended to the more general context of singular H∞ optimal control of nonlinear systems using ideas from the linear H∞ theory. Our approach yields, under certain assumptions, a necessary and sufficient condition for solvability of the state feedback singular H∞ control problem. The resulting state feedback is then used to construct a dynamic compensator solving the nonlinear output feedback H∞ control problem by applying the certainty equivalence principle.
12.
Dong-Her Shih, Fan-Chu Kung 《IEEE Transactions on Automatic Control》1986,31(5):451-454
The synthesis of an optimal control function for deterministic systems described by integrodifferential equations is investigated. By using the elegant operational properties of shifted Legendre polynomials, a direct computational algorithm for evaluating the optimal control and trajectory of deterministic systems is developed. An example is given to illustrate the utility of this method.
13.
In this paper, a robust model predictive control (MPC) is designed for a class of constrained continuous-time nonlinear systems with bounded additive disturbances. The robust MPC consists of a nonlinear feedback control and a continuous-time model-based dual-mode MPC. The nonlinear feedback control guarantees that the actual trajectory is contained in a tube centred at the nominal trajectory. The dual-mode MPC is designed to ensure asymptotic convergence of the nominal trajectory to zero. This paper extends current results on discrete-time model-based tube MPC and linear system model-based tube MPC to continuous-time nonlinear model-based tube MPC. The feasibility and robustness of the proposed robust MPC have been demonstrated by theoretical analysis and applications to a cart-damper-spring system and a one-link robot manipulator.
14.
Zhen Shao 《International Journal of Systems Science》2019,50(5):1028-1038
In this paper, an adaptive iterative learning control (ILC) method is proposed for switched nonlinear continuous-time systems with time-varying parametric uncertainties. First, an iterative learning controller is constructed with a state feedback term in the time domain and an adaptive learning term in the iteration domain. Then a switched nonlinear continuous-discrete two-dimensional (2D) system is built to describe the adaptive ILC system. Multiple 2D Lyapunov functions-based analysis ensures that the 2D system is exponentially stable, and the tracking error will converge to zero in the iteration domain. The design method of the iterative learning controller is obtained by solving a linear matrix inequality. Finally, the efficacy of the proposed controller is demonstrated by the simulation results.
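The iteration-domain learning idea can be illustrated with a much simpler P-type ILC on a hypothetical discrete LTI plant (not the paper's switched 2D formulation); the plant, gains, and reference below are our own choices:

```python
import numpy as np

# P-type iterative learning control: across repeated trials of the same task,
# the control signal is updated with the previous trial's tracking error.
a, b, c = 0.5, 1.0, 1.0          # hypothetical plant x+ = a*x + b*u, y = c*x
N = 10                           # trial length
t = np.arange(N)
y_d = np.sin(0.3 * (t + 1))      # desired output trajectory

def run_trial(u):
    x, y = 0.0, np.zeros(N)
    for k in range(N):
        x = a * x + b * u[k]
        y[k] = c * x
    return y

u = np.zeros(N)
gamma = 0.5                      # learning gain; contracts since |1 - gamma*c*b| < 1
for trial in range(50):
    e = y_d - run_trial(u)
    u = u + gamma * e            # update in the iteration domain
final_err = np.max(np.abs(y_d - run_trial(u)))
print(final_err)                 # tracking error vanishes over iterations
```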
15.
This paper deals with the design of output feedback control to achieve asymptotic tracking and disturbance rejection for a class of nonlinear systems when the exogenous signals are generated by a known linear exosystem. The system under consideration is single-input single-output, input-output linearizable, minimum phase, and modelled by an input-output model of the form of an nth-order differential equation. We assume that, at steady state, the nonlinearities of the system can only introduce a finite number of harmonics of the original exosystem modes. This assumption enables us to identify a linear servo-compensator which is augmented with the original system. Moreover, we augment a series of m integrators at the input side, where m is the highest derivative of the input, and then represent the augmented system by a state model. The augmented system is stabilized via a separation approach in which a robust state feedback controller is designed first to ensure boundedness of all state variables and tracking error convergence; then, a high gain observer and control saturation are used to recover the asymptotic properties achieved under state feedback.
16.
《International Journal of Computer Mathematics》2012,89(14):2988-3011
An optimal control problem governed by the first bi-harmonic equation with an integral constraint on the state, and its spectral approximations based on a mixed formulation, are investigated. The optimality conditions of the exact and discrete optimal control systems are derived. A priori error estimates of high-order spectral accuracy are obtained. Furthermore, a simple and efficient iterative algorithm is proposed to solve the mixed discrete system. Some numerical examples are performed to verify the theoretical results.
17.
Online learning is an important property of adaptive dynamic programming (ADP). Online observations contain plentiful dynamics information, and ADP algorithms can utilize them to learn the optimal control policy. This paper reviews the research on online ADP algorithms for the optimal control of continuous-time systems. Through intensive study, ADP algorithms have developed toward being model-free and data-efficient. After introducing the algorithms separately, we compare their performance on the same problem. This paper is intended to provide a comprehensive understanding of continuous-time online ADP algorithms.
18.
The regulator problem is studied for linear continuous-time systems with nonsymmetrical constrained control. Necessary and sufficient conditions are given that allow the largest nonsymmetrical polyhedral positively invariant domain with respect to the closed-loop system to be obtained. The case of symmetrical constrained control is obtained as a particular case.
19.
20.
Adaptive optimal control for continuous-time linear systems based on policy iteration (cited 5 times: 0 self-citations, 5 by others)
In this paper we propose a new scheme based on adaptive critics for finding online the state feedback, infinite horizon, optimal control solution of linear continuous-time systems using only partial knowledge regarding the system dynamics. In other words, the algorithm solves online an algebraic Riccati equation without knowing the internal dynamics model of the system. Being based on a policy iteration technique, the algorithm alternates between the policy evaluation and policy update steps until an update of the control policy no longer improves the system performance. The result is a direct adaptive control algorithm which converges to the optimal control solution without using an explicit, a priori obtained, model of the system internal dynamics. The effectiveness of the algorithm is shown while finding the optimal load-frequency controller for a power system.
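The model-based ancestor of this scheme is Kleinman's policy iteration, which alternates a Lyapunov-equation policy evaluation with a policy-improvement step; the abstract's algorithm performs the same alternation online without using the internal dynamics matrix A. A sketch on an example system of our own choosing:

```python
import numpy as np

# Kleinman's policy iteration for the continuous-time LQR: policy evaluation
# solves Ak'P + P*Ak + Q + K'RK = 0, policy improvement sets K = R^{-1} B' P.
def lyap_ct(A, Q):
    # solve A'P + PA + Q = 0 via Kronecker products (column-stacked vec)
    n = A.shape[0]
    I = np.eye(n)
    vecP = np.linalg.solve(np.kron(I, A.T) + np.kron(A.T, I),
                           -Q.reshape(-1, order='F'))
    return vecP.reshape(n, n, order='F')

A = np.array([[0.0, 1.0], [-1.0, -2.0]])   # Hurwitz, so K0 = 0 is stabilizing
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
K = np.zeros((1, 2))

for _ in range(15):
    Ak = A - B @ K
    P = lyap_ct(Ak, Q + K.T @ R @ K)       # policy evaluation
    K = np.linalg.solve(R, B.T @ P)        # policy improvement

# P now satisfies the ARE: A'P + PA - P B R^{-1} B' P + Q = 0
residual = A.T @ P + P @ A - P @ B @ np.linalg.solve(R, B.T @ P) + Q
print(np.max(np.abs(residual)))
```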