Similar Documents
20 similar documents found.
1.
This paper introduces an observer-based adaptive optimal control method for unknown singularly perturbed nonlinear systems with input constraints. First, a multi-time scales dynamic neural network (MTSDNN) observer with a novel updating law derived from a properly designed Lyapunov function is proposed to estimate the system states. Then, an adaptive learning rule driven by the critic NN weight error is presented for the critic NN, which is used to approximate the optimal cost function. Finally, the optimal control action is calculated by online solving the Hamilton-Jacobi-Bellman (HJB) equation associated with the MTSDNN observer and critic NN. The stability of the overall closed-loop system consisting of the MTSDNN observer, the critic NN and the optimal control action is proved. The proposed observer-based optimal control approach has an essential advantage that the system dynamics are not needed for implementation, and only the measured input/output data is needed. Moreover, the proposed optimal control design takes the input constraints into consideration and thus can overcome the restriction of actuator saturation. Simulation results are presented to confirm the validity of the investigated approach.
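Input constraints of the kind described above are commonly handled with a non-quadratic cost whose HJB minimization yields a tanh-saturated control law. A minimal sketch for the scalar case with R = 1 (the bound λ and gain g below are illustrative, not values from the paper):

```python
import numpy as np

lam = 1.0  # actuator bound |u| <= lam (illustrative)

def constrained_u(grad_V, g=1.0, lam=lam):
    # With the non-quadratic cost W(u) = 2*lam * integral of atanh(v/lam),
    # the HJB minimization gives u* = -lam * tanh(g * dV/dx / (2*lam)),
    # which respects the bound by construction.
    return -lam * np.tanh(g * grad_V / (2.0 * lam))

print(constrained_u(1e6))  # stays within the bound even for a huge value gradient
```

However large the value gradient becomes, the resulting control never leaves the interval [-λ, λ], which is exactly how actuator saturation is overcome by design rather than by clipping.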

2.
In this paper we discuss an online algorithm based on policy iteration for learning the continuous-time (CT) optimal control solution with infinite horizon cost for nonlinear systems with known dynamics. That is, the algorithm learns online in real-time the solution to the optimal control design HJ equation. This method finds in real-time suitable approximations of both the optimal cost and the optimal control policy, while also guaranteeing closed-loop stability. We present an online adaptive algorithm implemented as an actor/critic structure which involves simultaneous continuous-time adaptation of both actor and critic neural networks. We call this ‘synchronous’ policy iteration. A persistence of excitation condition is shown to guarantee convergence of the critic to the actual optimal value function. Novel tuning algorithms are given for both critic and actor networks, with extra nonstandard terms in the actor tuning law being required to guarantee closed-loop dynamical stability. The convergence to the optimal controller is proven, and the stability of the system is also guaranteed. Simulation examples show the effectiveness of the new algorithm.
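As a toy illustration of simultaneous ("synchronous") actor and critic tuning, the sketch below solves the scalar LQ problem dx/dt = x + u with cost ∫(x² + u²)dt, whose analytic optimum is V(x) = (1 + √2)x² and u = -(1 + √2)x. The parameterization, learning rates, and update laws are simplified stand-ins for the paper's tuning laws, not the authors' algorithm:

```python
import numpy as np

def bellman_residual(x, w_c, w_a):
    u = -w_a * x                        # actor's control u = -w_a x
    dV = 2.0 * w_c * x                  # dV/dx for the critic V = w_c x^2
    return x**2 + u**2 + dV * (x + u)   # HJB residual; zero at the optimum

def train(steps=20000, lr=1e-3, seed=0):
    rng = np.random.default_rng(seed)
    w_c, w_a = 0.0, 2.0                 # critic and actor weights
    for _ in range(steps):
        x = rng.uniform(-2.0, 2.0)      # random states play the role of excitation
        u = -w_a * x
        e = bellman_residual(x, w_c, w_a)
        grad_c = 2.0 * e * 2.0 * x * (x + u)   # gradient of e^2 w.r.t. w_c
        w_c -= lr * grad_c / (1.0 + x**4)      # normalized critic step
        w_a += lr * (w_c - w_a)                # actor tracks the critic
    return w_c, w_a

w_c, w_a = train()
# both weights should approach 1 + sqrt(2), the scalar Riccati solution
```

Both weight trajectories evolve at the same time, mirroring the paper's point that actor and critic need not be trained in alternating phases.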

3.
An online adaptive optimal control is proposed for continuous-time nonlinear systems with completely unknown dynamics, which is achieved by developing a novel identifier-critic-based approximate dynamic programming algorithm with a dual neural network (NN) approximation structure. First, an adaptive NN identifier is designed to obviate the requirement of complete knowledge of system dynamics, and a critic NN is employed to approximate the optimal value function. Then, the optimal control law is computed based on the information from the identifier NN and the critic NN, so that the actor NN is not needed. In particular, a novel adaptive law design method with the parameter estimation error is proposed to online update the weights of both identifier NN and critic NN simultaneously, which converge to small neighbourhoods around their ideal values. The closed-loop system stability and the convergence to small vicinity around the optimal solution are all proved by means of the Lyapunov theory. The proposed adaptation algorithm is also improved to achieve finite-time convergence of the NN weights. Finally, simulation results are provided to exemplify the efficacy of the proposed methods.

4.
Adaptive critic (AC) based controllers are typically discrete and/or yield a uniformly ultimately bounded stability result because of the presence of disturbances and unknown approximation errors. A continuous-time AC controller is developed that yields asymptotic tracking of a class of uncertain nonlinear systems with bounded disturbances. The proposed AC-based controller consists of two neural networks (NNs) – an action NN, also called the actor, which approximates the plant dynamics and generates appropriate control actions; and a critic NN, which evaluates the performance of the actor based on some performance index. The reinforcement signal from the critic is used to develop a composite weight tuning law for the action NN based on Lyapunov stability analysis. A recently developed robust feedback technique, robust integral of the sign of the error (RISE), is used in conjunction with the feedforward action neural network to yield a semiglobal asymptotic result. Experimental results are provided that illustrate the performance of the developed controller.

5.
This paper studies the formation control problem for wheeled mobile robots (WMR) with unknown information. First, based on the leader-follower and virtual-structure approaches, the WMR formation control problem is transformed into the problem of each follower robot tracking a reference virtual robot. Then, radial basis function neural networks (RBF NN) are used to learn the unknown system dynamics of the WMR, and a stable adaptive RBF NN controller and learning rates for the RBF NN weight estimates are designed based on Lyapunov stability theory. According to deterministic learning theory, the internal signals of the closed-loop system satisfy a partial persistent excitation (PE) condition during tracking control of recurrent reference trajectories. Once the PE condition is satisfied, the RBF NN weight estimates converge to their ideal values, achieving accurate learning of the unknown closed-loop system dynamics. Finally, an RBF NN learning controller is designed using the learned results, which guarantees the stability and convergence of the control system, achieves closed-loop stability, and improves control performance. Simulations verify the correctness and effectiveness of the proposed control method.

6.
王敏, 黄龙旺, 杨辰光. 《自动化学报》 (Acta Automatica Sinica), 2022, 48(5): 1234-1245
For a class of discrete-time nonlinear multi-input multi-output (MIMO) systems with actuator faults, this paper proposes an event-triggered adaptive critic fault-tolerant control scheme. The scheme consists of a critic network and an action network. In the critic network, to alleviate the jumps in the action network that the existing non-smooth binary utility function may cause, a smooth utility function is constructed from a Gaussian function, and the critic network is used to approximate the optimal performance index function. In the action network, future state information is transformed into a function of the current system state through variable substitution, and an optimal tracking controller is designed in combination with an event-triggered mechanism. The controller introduces a dynamic compensation term, which not only suppresses the influence of actuator faults on system performance but also improves the control performance. Stability analysis shows that all signals are uniformly ultimately bounded and the tracking error converges to a small bounded neighborhood of the origin. Simulation results on a numerical system and a practical system verify the effectiveness of the proposed scheme.
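The smooth-utility idea can be sketched in a few lines: a binary utility jumps at the tolerance boundary, while a Gaussian-based surrogate is differentiable everywhere. The tolerance and width below are illustrative choices, not the paper's values:

```python
import numpy as np

def binary_utility(e, tol=0.5):
    # conventional non-smooth utility: jumps from 0 to 1 at |e| = tol,
    # which can cause chattering in the action network
    return 0.0 if abs(e) <= tol else 1.0

def smooth_utility(e, sigma=0.5):
    # Gaussian-based smooth surrogate: 0 at e = 0, approaches 1 for large
    # errors, and is continuously differentiable everywhere
    return 1.0 - np.exp(-(e / sigma) ** 2)
```

Because the surrogate has a well-defined gradient at every error value, the action-network update never sees the discontinuity that the binary utility would inject.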

7.
A nonaffine discrete-time system represented by the nonlinear autoregressive moving average with eXogenous input (NARMAX) representation with unknown nonlinear system dynamics is considered. An equivalent affinelike representation in terms of the tracking error dynamics is first obtained from the original nonaffine nonlinear discrete-time system so that a reinforcement-learning-based near-optimal neural network (NN) controller can be developed. The control scheme consists of two linearly parameterized NNs. One NN is designated as the critic NN, which approximates a predefined long-term cost function, and an action NN is employed to derive a near-optimal control signal for the system to track a desired trajectory while minimizing the cost function simultaneously. The NN weights are tuned online. By using the standard Lyapunov approach, the stability of the closed-loop system is shown. The net result is a supervised actor-critic NN controller scheme which can be applied to a general nonaffine nonlinear discrete-time system without needing the affinelike representation. Simulation results demonstrate satisfactory performance of the controller.

8.
This paper proposes a novel optimal adaptive event-triggered control algorithm for nonlinear continuous-time systems. The goal is to reduce the controller updates by sampling the state only when an event is triggered, while maintaining stability and optimality. The online algorithm is implemented based on an actor/critic neural network structure. A critic neural network is used to approximate the cost and an actor neural network is used to approximate the optimal event-triggered controller. Since the proposed algorithm involves dynamics that exhibit continuous evolutions described by ordinary differential equations as well as instantaneous jumps or impulses, we use an impulsive system approach. A Lyapunov stability proof ensures that the closed-loop system is asymptotically stable. Finally, we illustrate the effectiveness of the proposed solution compared to a time-triggered controller.
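The update-saving effect of event triggering can be seen in a minimal simulation: the control is recomputed only when the gap between the current state and the last-sampled state crosses a state-dependent threshold. The scalar plant dx/dt = x + u, the gain, and the relative threshold below are illustrative choices, not the paper's design:

```python
def simulate(event_triggered=True, T=2000, dt=0.01, sigma=0.5):
    x = 1.0            # plant state
    x_s = x            # state at the last triggering instant
    u = -2.0 * x_s     # stabilizing feedback u = -K x_s with K = 2
    updates = 0
    for _ in range(T):
        # event condition: sample when the gap exceeds a fraction of |x|;
        # the time-triggered baseline samples at every step instead
        if (not event_triggered) or abs(x - x_s) > sigma * abs(x):
            x_s = x
            u = -2.0 * x_s
            updates += 1
        x += dt * (x + u)   # Euler step of dx/dt = x + u
    return x, updates

x_et, n_et = simulate(True)    # event-triggered run
x_tt, n_tt = simulate(False)   # time-triggered run
# both stabilize, but the event-triggered run uses far fewer control updates
```

Between events the control is held constant, so the closed loop evolves as a flow punctuated by jumps at the sampling instants, which is exactly the impulsive-system view the abstract refers to.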

9.
In this article, an optimal bipartite consensus control (OBCC) scheme is proposed for heterogeneous multiagent systems (MASs) with input delay via a reinforcement learning (RL) algorithm. A directed signed graph is established to construct MASs with competitive and cooperative relationships, and a model reduction method is developed to tackle the input delay problem. Then, based on the Hamilton–Jacobi–Bellman (HJB) equation, the policy iteration method is utilized to design the bipartite consensus controller, which consists of a value function and an optimal controller. Further, a distributed event-triggered function is proposed to increase control efficiency, which only requires information from the agent itself and its neighboring agents. Based on the input-to-state stability (ISS) function and the Lyapunov function, sufficient conditions for the stability of the MASs are derived. In addition, the RL algorithm is employed to solve the event-triggered OBCC problem in MASs, where critic neural networks (NNs) and actor NNs estimate the value function and the control policy, respectively. Finally, simulation results are given to validate the feasibility and efficiency of the proposed algorithm.
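The bipartite-consensus behavior on a signed graph can be sketched with a signed Laplacian: on a structurally balanced graph the agents converge to values of equal magnitude and opposite sign across the two camps. The graph, initial states, and plain first-order dynamics below are illustrative; the paper's input delays, optimality, and event triggering are omitted:

```python
import numpy as np

# signed adjacency: agents 1 and 2 cooperate; both compete with agent 3
A = np.array([[0., 1., -1.],
              [1., 0., -1.],
              [-1., -1., 0.]])
L = np.diag(np.abs(A).sum(axis=1)) - A   # signed Laplacian
x = np.array([1.0, 0.5, 0.2])            # initial agent states
dt = 0.01
for _ in range(3000):
    x = x - dt * (L @ x)                 # Euler step of x' = -L x
# x[0] and x[1] agree; x[2] converges to the opposite value
```

The sign pattern of the adjacency matrix encodes the competitive/cooperative relationships; the antagonistic edges are what split the agreement value into two opposite-signed camps.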

10.
Automatica, 2014, 50(12): 3281-3290
This paper addresses the model-free nonlinear optimal control problem based on data by introducing the reinforcement learning (RL) technique. It is known that the nonlinear optimal control problem relies on the solution of the Hamilton–Jacobi–Bellman (HJB) equation, which is a nonlinear partial differential equation that is generally impossible to solve analytically. Even worse, most practical systems are too complicated for an accurate mathematical model to be established. To overcome these difficulties, we propose a data-based approximate policy iteration (API) method using real system data rather than a system model. Firstly, a model-free policy iteration algorithm is derived and its convergence is proved. The implementation of the algorithm is based on the actor–critic structure, where actor and critic neural networks (NNs) are employed to approximate the control policy and cost function, respectively. To update the weights of the actor and critic NNs, a least-squares approach is developed based on the method of weighted residuals. The data-based API is an off-policy RL method, where the “exploration” is improved by arbitrarily sampling data on the state and input domain. Finally, we test the data-based API control design method on a simple nonlinear system, and further apply it to a rotational/translational actuator system. The simulation results demonstrate the effectiveness of the proposed method.
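A discrete-time analogue of data-based policy iteration can be sketched with a quadratic Q-function fit by least squares from off-policy samples. The plant below is used only as a data source, never inside the learner; this is the classical Q-learning policy-iteration recipe on a scalar linear-quadratic problem, a simplified stand-in for the paper's continuous-time method:

```python
import numpy as np

rng = np.random.default_rng(1)

def step(x, u):
    # "unknown" plant x' = 0.9 x + u: the learner only sees (x, u, x') tuples
    return 0.9 * x + u

def features(x, u):
    # quadratic Q-function basis: Q(x, u) = h1 x^2 + h2 x u + h3 u^2
    return np.array([x * x, x * u, u * u])

K = 0.0                                  # initial stabilizing policy u = -K x
for _ in range(10):                      # policy iteration loop
    # policy evaluation: least squares on Q(x,u) - Q(x', -K x') = x^2 + u^2,
    # using arbitrarily sampled (off-policy) state-input data
    A, b = [], []
    for _ in range(200):
        x = rng.uniform(-1, 1)
        u = rng.uniform(-1, 1)
        xn = step(x, u)
        A.append(features(x, u) - features(xn, -K * xn))
        b.append(x * x + u * u)
    h1, h2, h3 = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)[0]
    # policy improvement: u = argmin_u Q(x, u) = -(h2 / (2 h3)) x
    K = h2 / (2 * h3)
# K should approach the LQR-optimal gain for a = 0.9, b = 1, q = r = 1
```

Because the samples (x, u) are drawn arbitrarily over the state and input domain rather than along the current policy's trajectory, the evaluation step is off-policy, which is the "exploration" advantage the abstract mentions.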

11.
This paper investigates finite-time adaptive neural tracking control for a class of nonlinear time-delay systems subject to actuator delay and full-state constraints. The difficulty is to handle full-state time delays and full-state constraints in the finite-time control design. First, a finite-time control method is used to achieve fast transient performance, and new Lyapunov–Krasovskii functionals are appropriately constructed to compensate for the time delays, in which a predictor-like term is utilized to transform the input-delayed system into a delay-free system. Second, neural networks are utilized to deal with the unknown functions, the Gaussian error function is used to express the continuously differentiable asymmetric saturation nonlinearity, and barrier Lyapunov functions are employed to guarantee that the full-state signals are restricted within certain fixed bounds. At last, based on finite-time stability theory and Lyapunov stability theory, the finite-time tracking control problem with full-state constraints is solved, and the designed control scheme reduces the number of learning parameters. It is shown that the presented neural controller ensures that all closed-loop signals are bounded and the tracking error converges to a small neighbourhood of the origin in finite time. Simulation studies are provided to further illustrate the effectiveness of the proposed approach.
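Barrier Lyapunov functions of the kind mentioned above are typically logarithmic: finite inside the constraint set and unbounded as the error approaches the bound, so keeping the function bounded keeps the state inside the constraint. A small sketch (the bound b is illustrative):

```python
import numpy as np

def blf(e, b=1.0):
    # barrier Lyapunov function V(e) = 0.5 * log(b^2 / (b^2 - e^2)):
    # zero at e = 0, finite for |e| < b, and grows without bound as |e| -> b
    assert abs(e) < b, "defined only inside the constraint set |e| < b"
    return 0.5 * np.log(b * b / (b * b - e * e))
```

In the stability analysis, proving that V stays bounded along closed-loop trajectories is what guarantees the full-state signals never reach their fixed bounds.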

12.
In this paper, we introduce an online algorithm that uses integral reinforcement knowledge for learning the continuous-time optimal control solution for nonlinear systems with infinite horizon costs and partial knowledge of the system dynamics. This algorithm is a data-based approach to the solution of the Hamilton–Jacobi–Bellman equation, and it does not require explicit knowledge of the system's drift dynamics. A novel adaptive control algorithm is given that is based on policy iteration and implemented using an actor/critic structure having two adaptive approximator structures. Both actor and critic approximation networks are adapted simultaneously. A persistence of excitation condition is required to guarantee convergence of the critic to the actual optimal value function. Novel adaptive control tuning algorithms are given for both critic and actor networks, with extra terms in the actor tuning law being required to guarantee closed-loop dynamical stability. The approximate convergence to the optimal controller is proven, and stability of the system is also guaranteed. Simulation examples support the theoretical result. Copyright © 2013 John Wiley & Sons, Ltd.

13.
A novel adaptive-critic-based neural network (NN) controller in discrete time is designed to deliver a desired tracking performance for a class of nonlinear systems in the presence of actuator constraints. The constraints of the actuator are treated in the controller design as the saturation nonlinearity. The adaptive critic NN controller architecture based on state feedback includes two NNs: the critic NN is used to approximate the "strategic" utility function, whereas the action NN is employed to minimize both the strategic utility function and the unknown nonlinear dynamic estimation errors. The critic and action NN weight updates are derived by minimizing certain quadratic performance indexes. Using the Lyapunov approach and with novel weight updates, the uniform ultimate boundedness of the closed-loop tracking error and weight estimates is shown in the presence of NN approximation errors and bounded unknown disturbances. The proposed NN controller works in the presence of multiple nonlinearities, unlike other schemes that normally approximate one nonlinearity. Moreover, the adaptive critic NN controller does not require an explicit offline training phase, and the NN weights can be initialized at zero or random. Simulation results justify the theoretical analysis.

14.
For the asymmetric-constrained multiplayer nonzero-sum game problem of continuous-time nonlinear systems, this paper establishes a neural-network-based adaptive critic control method. First, a novel nonquadratic function is proposed to handle the asymmetric constraints, and the optimal control laws and the coupled Hamilton-Jacobi equations are derived. Notably, unlike in previous work, the optimal control policy is nonzero when the system state is zero. Then, a single critic network is constructed to approximate each player's optimal cost function, from which the associated approximate optimal control policies are obtained. Meanwhile, a new weight-update rule is developed during critic learning. Furthermore, the stability of the critic network weight approximation errors and of the closed-loop system states is proved using Lyapunov theory. Finally, simulation results verify the effectiveness of the proposed method.

15.
In this paper, a model-free near-optimal decentralized tracking control (DTC) scheme is developed for reconfigurable manipulators via an adaptive dynamic programming algorithm. The proposed controller can be divided into two parts, namely the local desired controller and the local tracking error controller. In order to remove the norm-boundedness assumption on the interconnections, the desired states of the coupled subsystems are employed to substitute for their actual states. Using local input/output data, the unknown subsystem dynamics of the reconfigurable manipulators can be identified by constructing local neural network (NN) identifiers. With the help of the identified dynamics, the local desired control can be derived directly with the corresponding desired states. Then, for the tracking error subsystems, the local tracking error control is investigated via the approximate improved local cost function, the local critic NN and the identified input gain matrix. To overcome the overall error caused by the substitution, identification and critic NN approximation, a robust compensation is added to construct the improved local cost function, which reflects the overall error, regulation and control simultaneously. Therefore, the closed-loop tracking system can be guaranteed to be asymptotically stable via the Lyapunov stability theorem. Two 2-degree-of-freedom reconfigurable manipulators with different configurations are employed to demonstrate the effectiveness of the proposed model-free near-optimal DTC scheme.

16.
This paper studies the problem of optimal parallel tracking control for continuous-time general nonlinear systems. Unlike existing optimal state feedback control, the control input of the optimal parallel control is introduced into the feedback system. However, due to the introduction of the control input into the feedback system, optimal state feedback control methods cannot be applied directly. To address this problem, an augmented system and an augmented performance index function are proposed first. Thus, the general nonlinear system is transformed into an affine nonlinear system. The difference between the optimal parallel control and the optimal state feedback control is analyzed theoretically. It is proven that the optimal parallel control with the augmented performance index function can be seen as the suboptimal state feedback control with the traditional performance index function. Moreover, an adaptive dynamic programming (ADP) technique is utilized to implement the optimal parallel tracking control, using a critic neural network (NN) to approximate the value function online. The stability analysis of the closed-loop system is performed using Lyapunov theory, and the tracking error and NN weight errors are uniformly ultimately bounded (UUB). Also, the optimal parallel controller guarantees the continuity of the control input under the circumstance that there are finite jump discontinuities in the reference signals. Finally, the effectiveness of the developed optimal parallel control method is verified in two cases.

17.
This paper develops a simplified optimized tracking control using a reinforcement learning (RL) strategy for a class of nonlinear systems. Since a nonlinear control gain function is considered in the system modeling, it is challenging to extend the existing RL-based optimal methods to tracking control. The main reasons are that the algorithms of these methods are very complex and that they require some strict conditions to be met. Unlike these existing RL-based optimal methods, which derive the actor and critic training laws from the square of the Bellman residual error, a complex function consisting of multiple nonlinear terms, the proposed optimized scheme derives the two RL training laws from the negative gradient of a simple positive function, so that the algorithm can be significantly simplified. Moreover, the actor and critic in RL are constructed by employing neural networks (NNs) to approximate the solution of the Hamilton–Jacobi–Bellman (HJB) equation. Finally, the feasibility of the proposed method is demonstrated in accordance with both Lyapunov stability theory and a simulation example.

18.
This paper proposes an online adaptive approximate solution for the infinite-horizon optimal tracking control problem of continuous-time nonlinear systems with unknown dynamics. The requirement of the complete knowledge of system dynamics is avoided by employing an adaptive identifier in conjunction with a novel adaptive law, such that the estimated identifier weights converge to a small neighborhood of their ideal values. An adaptive steady-state controller is developed to maintain the desired tracking performance at the steady-state, and an adaptive optimal controller is designed to stabilize the tracking error dynamics in an optimal manner. For this purpose, a critic neural network (NN) is utilized to approximate the optimal value function of the Hamilton-Jacobi-Bellman (HJB) equation, which is used in the construction of the optimal controller. The learning of two NNs, i.e., the identifier NN and the critic NN, is continuous and simultaneous by means of a novel adaptive law design methodology based on the parameter estimation error. Stability of the whole system consisting of the identifier NN, the critic NN and the optimal tracking control is guaranteed using Lyapunov theory; convergence to a near-optimal control law is proved. Simulation results exemplify the effectiveness of the proposed method.

19.
In this paper, we propose an actor-critic neuro-control for a class of continuous-time nonlinear systems under nonlinear abrupt faults, which is combined with an adaptive fault diagnosis observer (AFDO). Together with its estimation laws, an AFDO scheme, which estimates the faults in real time, is designed based on Lyapunov analysis. Then, based on the designed AFDO, a fault-tolerant actor-critic control scheme is proposed in which the critic neural network (NN) is used to approximate the value function and the actor NN updates the fault-tolerant policy based on the approximated value function in the critic NN. The weight update laws for the critic NN and actor NN are designed using the gradient descent method. By Lyapunov analysis, we prove the uniform ultimate boundedness (UUB) of all the states, their estimation errors, and the NN weights of the fault-tolerant system under unpredictable faults. Finally, we verify the effectiveness of the proposed method through numerical simulations.

20.
An autonomous wheeled mobile robot (WMR) needs to implement velocity and path tracking control subject to complex dynamical constraints. Conventionally, this control design is obtained by analysis and synthesis, or by a domain expert building control rules. This paper presents an adaptive critic motion control design that enables a WMR to autonomously generate the control ability by learning through trials. The design consists of an adaptive critic velocity control loop and a self-learning posture control loop. The neural networks in the velocity neuro-controller (VNC) are corrected with the dual heuristic programming (DHP) adaptive critic method. The designer simply expresses the control objective by specifying the primary utility function, and the VNC attempts to fulfill it through incremental optimization. The posture neuro-controller (PNC) learns by approximating the specialized inverse velocity model of the WMR so as to map planned positions to suitable velocity commands. Supervised drive supplies varied velocity commands for the PNC and VNC to set up their neural weights. During autonomous drive, while the PNC halts learning, the VNC keeps correcting its neural weights to optimize the control performance. The proposed design is evaluated on an experimental WMR. The results show that the DHP adaptive critic design is a useful basis for autonomous control.


Copyright © 北京勤云科技发展有限公司 (Beijing Qinyun Technology Development Co., Ltd.)  京ICP备09084417号