1.

Reinforcement learning neural-network-based controller for nonlinear discrete-time systems with input constraints. Total citations: 3, self-citations: 0, citations by others: 3

Pingan He S Jagannathan 《IEEE transactions on systems, man, and cybernetics. Part B, Cybernetics》2007,37(2):425-436

A novel adaptive-critic-based neural network (NN) controller in discrete time is designed to deliver a desired tracking performance for a class of nonlinear systems in the presence of actuator constraints. The constraints of the actuator are treated in the controller design as a saturation nonlinearity. The adaptive critic NN controller architecture based on state feedback includes two NNs: the critic NN is used to approximate the "strategic" utility function, whereas the action NN is employed to minimize both the strategic utility function and the unknown nonlinear dynamic estimation errors. The critic and action NN weight updates are derived by minimizing certain quadratic performance indexes. Using the Lyapunov approach and with novel weight updates, the uniformly ultimate boundedness of the closed-loop tracking error and weight estimates is shown in the presence of NN approximation errors and bounded unknown disturbances. The proposed NN controller works in the presence of multiple nonlinearities, unlike other schemes that normally approximate one nonlinearity. Moreover, the adaptive critic NN controller does not require an explicit offline training phase, and the NN weights can be initialized at zero or at random. Simulation results justify the theoretical analysis.

2.

Control of Nonaffine Nonlinear Discrete-Time Systems Using Reinforcement-Learning-Based Linearly Parameterized Neural Networks. Total citations: 1, self-citations: 0, citations by others: 1

Qinmin Yang Vance J.B. Jagannathan S. 《IEEE transactions on systems, man, and cybernetics. Part B, Cybernetics》2008,38(4):994-1001

A nonaffine discrete-time system represented by the nonlinear autoregressive moving average with eXogenous input (NARMAX) representation with unknown nonlinear system dynamics is considered. An equivalent affinelike representation in terms of the tracking error dynamics is first obtained from the original nonaffine nonlinear discrete-time system so that a reinforcement-learning-based near-optimal neural network (NN) controller can be developed. The control scheme consists of two linearly parameterized NNs. One NN is designated as the critic NN, which approximates a predefined long-term cost function, and an action NN is employed to derive a near-optimal control signal for the system to track a desired trajectory while minimizing the cost function simultaneously. The NN weights are tuned online. By using the standard Lyapunov approach, the stability of the closed-loop system is shown. The net result is a supervised actor-critic NN controller scheme which can be applied to a general nonaffine nonlinear discrete-time system without needing the affinelike representation. Simulation results demonstrate satisfactory performance of the controller.

3.

Discrete-time neural network output feedback control of nonlinear discrete-time systems in non-strict form. Total citations: 1, self-citations: 0, citations by others: 1

J. Vance 《Automatica》2008,44(4):1020-1027

An adaptive neural network (NN)-based output feedback controller is proposed to deliver a desired tracking performance for a class of discrete-time nonlinear systems, which are represented in non-strict feedback form. The NN backstepping approach is utilized to design the adaptive output feedback controller consisting of: (1) an NN observer to estimate the system states and (2) two NNs to generate the virtual and actual control inputs, respectively. The non-causal problem encountered during the control design is overcome by using a dynamic NN which is constructed through a feedforward NN with a novel weight tuning law. The separation principle is relaxed, the persistency of excitation (PE) condition is not needed, and the certainty equivalence principle is not used. The uniformly ultimate boundedness (UUB) of the closed-loop tracking error, the state estimation errors and the NN weight estimates is demonstrated. Though the proposed work is applicable to second-order nonlinear discrete-time systems expressed in non-strict feedback form, the proposed controller design can be easily extended to an nth-order nonlinear discrete-time system.

4.

Asymptotic tracking by a reinforcement learning-based adaptive critic controller. Total citations: 1, self-citations: 0, citations by others: 1

Shubhendu BHASIN Nitin SHARMA Parag PATRE Warren DIXON 《Control Theory and Applications (English Edition)》2011,9(3):400-409

Adaptive critic (AC) based controllers are typically discrete and/or yield a uniformly ultimately bounded stability result because of the presence of disturbances and unknown approximation errors. A continuous-time AC controller is developed that yields asymptotic tracking of a class of uncertain nonlinear systems with bounded disturbances. The proposed AC-based controller consists of two neural networks (NNs) – an action NN, also called the actor, which approximates the plant dynamics and generates appropriate control actions; and a critic NN, which evaluates the performance of the actor based on some performance index. The reinforcement signal from the critic is used to develop a composite weight tuning law for the action NN based on Lyapunov stability analysis. A recently developed robust feedback technique, robust integral of the sign of the error (RISE), is used in conjunction with the feedforward action neural network to yield a semiglobal asymptotic result. Experimental results are provided that illustrate the performance of the developed controller.

5.

Global Solution for the Optimal Feedback Control of the Underactuated Heisenberg System

《Automatic Control, IEEE Transactions on》2008,53(11):2638-2642

We present a global solution for an optimal feedback controller of the underactuated Heisenberg system or nonholonomic integrator. Employing a recently developed technique based on generating functions appearing in the Hamilton-Jacobi theory, we circumvent a singularity caused by underactuation to develop a nonlinear optimal feedback control in an implicitly analytical form. The systematic procedure to deal with underactuation indicates that generating functions should be effective tools for solving general underactuated optimal control problems.

6.

An Adaptive Borrow-and-Return Model for Broadcasting Videos

《Multimedia, IEEE Transactions on》2009,11(4):707-715

Yang proposed the concept of borrow-and-return (BR) to leverage the unused server bandwidth when a group of popular videos is being broadcast with the FSFC (first segment on the first channel) broadcasting schemes, in order to improve the mean waiting time (MWT) of the viewers with the help of additional receiving bandwidth available at the high-end clients. The BR model borrows the bandwidth of the videos with no newly arriving viewers during a timeslot to speed up the transmission of the first segments of some of the remaining videos. In this paper, we first address the relative advantage issue among various possible BR schemes by developing a parametric generic BR (GBR) scheme controlled externally by independent borrow parameters. Later, we propose a new BR (NBR) model by incorporating an efficient transmission strategy to reduce the MWT further. Finally, an optimal NBR scheme is developed by augmenting it with the optimal borrow parameters; it significantly outperforms the existing and new BR schemes in terms of overall MWT.

7.

A Discrete-Time Robust Extended Kalman Filter for Uncertain Systems With Sum Quadratic Constraints

《Automatic Control, IEEE Transactions on》2009,54(4):850-854

This technical note outlines the formulation of a novel discrete-time robust extended Kalman filter for uncertain systems with uncertainties described in terms of Sum Quadratic Constraints. The robust filter is an approximate set-valued state estimator which is robust in the sense that it can handle modeling uncertainties in addition to exogenous noise. Riccati and filter difference equations are obtained as an approximate solution to a reverse-time optimal control problem defining the set-valued state estimator. In order to obtain a solution to the set-valued state estimation problem, the discrete-time system dynamics are modeled backwards in time.

8.

Optimization of Power Allocation for Interference Cancellation With Particle Swarm Optimization. Total citations: 1, self-citations: 0, citations by others: 1

《Evolutionary Computation, IEEE Transactions on》2009,13(1):128-150

In code division multiple access (CDMA) systems, a significant degradation in detection performance due to multiuser interference can be avoided by the utilization of interference cancellation methods. Further enhancement can be obtained by optimizing the power allocation of the users. The resulting constrained single-objective optimization problem is solved here by means of particle swarm optimization (PSO). It is shown that the maximum number of users for a CDMA system can be increased significantly if an optimized power profile is employed. Furthermore, an extensive study of PSO control parameter settings using three different neighborhood topologies is performed on the basis of the power allocation problem, and two constraint-handling techniques are evaluated. Results from the parameter study are compared with examinations from the literature. It is shown that the von Neumann neighborhood topology performs consistently better than gbest and lbest. However, strong interaction effects and conflicting recommendations for parameter settings are found that emphasize the need for adaptive approaches.
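For readers unfamiliar with PSO, the following is a minimal sketch of the gbest variant named in the abstract, applied to a toy objective. It is not the paper's power-allocation formulation: the function name `pso_gbest`, the parameter values (`w`, `c1`, `c2`), the clipping-based box-constraint handling, and the sphere objective in the usage note are all assumptions chosen for illustration.

```python
import numpy as np

def pso_gbest(objective, lower, upper, n_particles=30, n_iter=200,
              w=0.729, c1=1.494, c2=1.494, seed=0):
    """Minimize `objective` over a box using gbest PSO (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    dim = lower.size

    # Random initial positions inside the box, zero initial velocities.
    x = rng.uniform(lower, upper, size=(n_particles, dim))
    v = np.zeros_like(x)

    # Personal bests and the swarm-wide (global) best.
    pbest = x.copy()
    pbest_val = np.array([objective(p) for p in x])
    g = pbest[np.argmin(pbest_val)].copy()

    for _ in range(n_iter):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        # Velocity update: inertia + cognitive pull + social pull.
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        # Assumed constraint handling: clip positions back into the box.
        x = np.clip(x + v, lower, upper)

        val = np.array([objective(p) for p in x])
        improved = val < pbest_val
        pbest[improved] = x[improved]
        pbest_val[improved] = val[improved]
        g = pbest[np.argmin(pbest_val)].copy()

    return g, float(pbest_val.min())
```

A call such as `pso_gbest(lambda p: float(np.sum(p ** 2)), [-5.0, -5.0], [5.0, 5.0])` returns the best position found and its objective value. In the gbest topology every particle is attracted to the single swarm-wide best; the lbest and von Neumann topologies compared in the abstract would replace `g` with a per-particle neighborhood best.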

9.

gripper

Jagannathan S. Galan G. 《Neural Networks, IEEE Transactions on》2004,15(2):395-407

Grasping of objects has been a challenging task for robots. The complex grasping task can be defined as object contact control and manipulation subtasks. In this paper, the object contact control subtask is defined as the ability to follow a trajectory accurately by the fingers of a gripper. The object manipulation subtask is defined in terms of maintaining a predefined applied force by the fingers on the object. A sophisticated controller is necessary since the process of grasping an object without a priori knowledge of the object's size, texture, softness, gripper, and contact dynamics is rather difficult. Moreover, the object has to be secured accurately and considerably fast without damaging it. Since the gripper, contact dynamics, and the object properties are not typically known beforehand, an adaptive critic neural network (NN)-based hybrid position/force control scheme is introduced. The feedforward action generating NN in the adaptive critic NN controller compensates for the nonlinear gripper and contact dynamics. The learning of the action generating NN is performed on-line based on a critic NN output signal. The controller ensures that a three-finger gripper tracks a desired trajectory while applying desired forces on the object for manipulation. Novel NN weight tuning updates are derived for the action generating and critic NNs so that Lyapunov-based stability analysis can be shown. Simulation results demonstrate that, compared to conventional schemes, the proposed scheme successfully allows the fingers of a gripper to secure objects without knowledge of the underlying gripper and contact dynamics of the object.

10.

Reinforcement learning-based output feedback control of nonlinear systems with input constraints.

P He S Jagannathan 《IEEE transactions on systems, man, and cybernetics. Part B, Cybernetics》2005,35(1):150-154

A novel neural network (NN)-based output feedback controller with magnitude constraints is designed to deliver a desired tracking performance for a class of multi-input and multi-output (MIMO) strict feedback nonlinear discrete-time systems. Reinforcement learning is proposed for the output feedback controller, which uses three NNs: 1) an NN observer to estimate the system states with the input-output data, 2) a critic NN to approximate a certain strategic utility function, and 3) an action NN to minimize both the strategic utility function and the unknown dynamics estimation errors. Using the Lyapunov approach, the uniformly ultimate boundedness (UUB) of the state estimation errors, the tracking errors and weight estimates is shown.

11.

Robust tracking control for uncertain Euler–Lagrange systems via dynamic event-triggered and self-triggered ADP

Lu Chen Fei Hao 《International Journal of Robust and Nonlinear Control》2024,34(1):481-505

This article investigates the robust tracking control problem for a class of Euler–Lagrange systems in the presence of parameter uncertainties and external disturbances. Through system transformation and theoretical analysis, an adaptive dynamic programming (ADP) algorithm with two adaptive neural networks (NNs) and a suitable triggering mechanism is proposed to attain the robust stability of the closed-loop system. A single critic NN is leveraged to implement the approximate optimal controller design. Particularly, an NN-based feedforward compensation is developed to cope with the uncertainties with unknown bounds. Two different triggering mechanisms are constructed to reduce the budget of sampling, communication and computation, namely the dynamic event-triggering mechanism (DETM) and the self-triggering mechanism (STM). The DETM is utilized to decide the update of the remote controller and the critic NN weight, which can yield a larger inter-event interval than the static event-triggering mechanism. Also, Zeno-free behavior is guaranteed. Moreover, it is a novel attempt to introduce the STM into ADP design, which relaxes the demand for dedicated hardware monitoring the event-triggering condition online. Then it is demonstrated that all signals in the closed-loop system are uniformly ultimately bounded (UUB) via Lyapunov-based stability analysis. Finally, a simulation example of a 2-link robotic system is implemented to verify the feasibility and effectiveness of the proposed algorithm.

12.

New Results on Modal Participation Factors: Revealing a Previously Unknown Dichotomy

《Automatic Control, IEEE Transactions on》2009,54(7):1439-1449

This paper presents a new fundamental approach to modal participation analysis of linear time-invariant systems, leading to new insights and new formulas for modal participation factors. Modal participation factors were introduced over a quarter century ago as a way of measuring the relative participation of modes in states, and of states in modes, for linear time-invariant systems. Participation factors have proved their usefulness in the field of electric power systems and in other applications. However, in the current understanding, it is routinely taken for granted that the measure of participation of modes in states is identical to that for participation of states in modes. Here, a new analysis using averaging over an uncertain set of system initial conditions yields the conclusion that these quantities (participation of modes in states and participation of states in modes) should not be viewed as interchangeable. In fact, it is proposed that a new definition and calculation replace the existing ones for state in mode participation factors, while the previously existing participation factors definition and formula should be retained but viewed only in the sense of mode in state participation factors. Several examples are used to illustrate the issues addressed and the results obtained.
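As background for this entry, the sketch below computes the classical participation factors, which the abstract says should be retained as mode-in-state factors. It assumes the standard textbook definition (the product of matching left and right eigenvector components) and a diagonalizable system matrix; the abstract does not state the formula, and the paper's new state-in-mode definition is not reproduced here. The function name `participation_factors` is hypothetical.

```python
import numpy as np

def participation_factors(A):
    """Classical participation factors of x' = A x (illustrative sketch).

    Returns the eigenvalues and a matrix P where P[k, i] relates
    state k to mode i. Assumes A is diagonalizable.
    """
    eigvals, V = np.linalg.eig(A)   # columns of V are right eigenvectors
    W = np.linalg.inv(V)            # rows of W are left eigenvectors, W @ V = I
    P = V * W.T                     # P[k, i] = V[k, i] * W[i, k]
    return eigvals, P
```

With the normalization `W @ V = I`, every row and every column of `P` sums to one, which is why the entries are read as relative shares. For systems with complex eigenvalues the entries of `P` are generally complex.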

13.

Online Adaptive Approximate Optimal Tracking Control with Simplified Dual Approximation Structure for Continuous-time Unknown Nonlinear Systems

Jing Na Guido Herrmann 《IEEE/CAA Journal of Automatica Sinica》2014,1(4):412-422

This paper proposes an online adaptive approximate solution for the infinite-horizon optimal tracking control problem of continuous-time nonlinear systems with unknown dynamics. The requirement of the complete knowledge of system dynamics is avoided by employing an adaptive identifier in conjunction with a novel adaptive law, such that the estimated identifier weights converge to a small neighborhood of their ideal values. An adaptive steady-state controller is developed to maintain the desired tracking performance at the steady-state, and an adaptive optimal controller is designed to stabilize the tracking error dynamics in an optimal manner. For this purpose, a critic neural network (NN) is utilized to approximate the optimal value function of the Hamilton-Jacobi-Bellman (HJB) equation, which is used in the construction of the optimal controller. The learning of two NNs, i.e., the identifier NN and the critic NN, is continuous and simultaneous by means of a novel adaptive law design methodology based on the parameter estimation error. Stability of the whole system consisting of the identifier NN, the critic NN and the optimal tracking control is guaranteed using Lyapunov theory; convergence to a near-optimal control law is proved. Simulation results exemplify the effectiveness of the proposed method.

14.

Adaptive output feedback control for nonlinear time-delay systems using neural network. Total citations: 6, self-citations: 0, citations by others: 6

Weisheng CHEN Junmin LI 《Control Theory and Applications (English Edition)》2006,4(4):313-320

This paper extends the adaptive neural network (NN) control approaches to a class of unknown output feedback nonlinear time-delay systems. An adaptive output feedback NN tracking controller is designed by the backstepping technique. NNs are used to approximate unknown functions dependent on time delay. Delay-dependent filters are introduced for state estimation. The domination method is used to deal with the smooth time-delay basis functions. The adaptive bounding technique is employed to estimate the upper bound of the NN approximation errors. Based on a Lyapunov-Krasovskii functional, the semi-global uniform ultimate boundedness of all the signals in the closed-loop system is proved. The feasibility is investigated by two illustrative simulation examples.

15.

Robust Image Corner Detection Based on the Chord-to-Point Distance Accumulation Technique. Total citations: 2, self-citations: 0, citations by others: 2

《Multimedia, IEEE Transactions on》2008,10(6):1059-1072

Many contour-based image corner detectors are based on the curvature scale-space (CSS). We identify the weaknesses of the CSS-based detectors. First, the “curvature” itself, by its “definition”, is highly sensitive to the local variation and noise on the curve, unless an appropriate smoothing is carried out beforehand. In addition, the calculation of curvature involves derivatives of up to second order, which may cause instability and errors in the result. Second, the Gaussian smoothing causes changes to the curve and it is difficult to select an appropriate smoothing-scale, resulting in poor performance of the CSS corner detection technique. We propose a complete corner detection technique based on the chord-to-point distance accumulation (CPDA) for the discrete curvature estimation. The CPDA discrete curvature estimation technique is less sensitive to the local variation and noise on the curve. Moreover, it does not have the undesirable effect of the Gaussian smoothing. We provide a comprehensive performance study. Our experiments showed that the proposed technique performs better than the existing CSS-based and other related methods in terms of both average repeatability and localization error.
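The idea behind chord-to-point distance accumulation can be sketched as follows. This is a rough illustration under stated assumptions, not the paper's complete detector: it assumes an open curve given as an ordered list of 2D points, a chord spanning `L` samples that slides past each point, and an accumulated perpendicular distance as the curvature estimate. The function name `cpda` and the handling of curve ends (left at zero) are choices made here for illustration.

```python
import numpy as np

def cpda(points, L):
    """Accumulated chord-to-point distance per curve point (illustrative sketch).

    For point k, sums the perpendicular distances from points[k] to every
    chord joining points[j] and points[j + L] with k - L < j < k.
    Points closer than L samples to either end are left at 0.
    """
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    h = np.zeros(n)
    for k in range(L, n - L):
        total = 0.0
        for j in range(k - L + 1, k):
            a = pts[j]
            chord = pts[j + L] - a
            length = np.hypot(chord[0], chord[1])
            if length == 0.0:
                continue  # degenerate chord, no well-defined distance
            rel = pts[k] - a
            # |2D cross product| / chord length = perpendicular distance
            total += abs(chord[0] * rel[1] - chord[1] * rel[0]) / length
        h[k] = total
    return h
```

On a straight segment every chord passes through the point and the accumulated distance is zero, whereas near a corner the chords cut across the bend and the value peaks. No derivatives or Gaussian smoothing are involved, which matches the motivation given in the abstract.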

16.

Energetic and Informational Masking Effects in an Audiovisual Speech Recognition System

《IEEE transactions on audio, speech, and language processing》2009,17(3):446-458

The paper presents a robust audiovisual speech recognition technique called audiovisual speech fragment decoding. The technique addresses the challenge of recognizing speech in the presence of competing nonstationary noise sources. It employs two stages. First, an acoustic analysis decomposes the acoustic signal into a number of spectro-temporal fragments. Second, audiovisual speech models are used to select fragments belonging to the target speech source. The approach is evaluated on a small vocabulary simultaneous speech recognition task in conditions that promote two contrasting types of masking: energetic masking, caused by the energy of the masker utterance swamping that of the target, and informational masking, caused by similarity between the target and masker making it difficult to selectively attend to the correct source. Results show that the system is able to use the visual cues to reduce the effects of both types of masking. Further, whereas recovery from energetic masking may require detailed visual information (i.e., sufficient to carry phonetic content), release from informational masking can be achieved using very crude visual representations that encode little more than the timing of mouth opening and closure.

17.

Adaptive NN control for a class of discrete-time non-linear systems

S. S. Ge T. H. Lee G. Y. Li J. Zhang 《International journal of control》2013,86(4):334-354

18.

Adaptive NN control for a class of strict-feedback discrete-time nonlinear systems

S.S. Ge G.Y. Li T.H. Lee 《Automatica》2003,39(5):807-819

In this paper, both full state and output feedback adaptive neural network (NN) controllers are presented for a class of strict-feedback discrete-time nonlinear systems. Firstly, Lyapunov-based full-state adaptive NN control is presented via backstepping, which avoids the possible controller singularity problem in adaptive nonlinear control and solves the noncausal problem in the discrete-time backstepping design procedure. After the strict-feedback form is transformed into a cascade form, another relatively simple Lyapunov-based direct output feedback control is developed. The closed-loop systems for both control schemes are proven to be semi-globally uniformly ultimately bounded.

19.

An ILC-Based Adaptive Control for General Stochastic Systems With Strictly Decreasing Entropy

《Neural Networks, IEEE Transactions on》2009,20(3):471-482

In this paper, a new method for adaptive control of general nonlinear and non-Gaussian unknown stochastic systems is proposed. The method applies the minimum entropy control scheme to decrease the closed-loop randomness of the output under an iterative learning control (ILC) basis. Both modeling and control of the plant are performed using dynamic neural networks. For this purpose, the whole control horizon is divided into a certain number of time domain subintervals called batches, and a pseudo-D-type ILC law is employed to train the plant model and controller parameters so that the entropy of the closed-loop tracking error is made to decrease batch by batch. The method has the advantage of decreasing the output uncertainty as the batches advance along the time horizon. A convergence analysis of the proposed ILC is given, and a set of experimental results is provided to show the effectiveness of the obtained control algorithm.

20.

Text-Like Segmentation of General Audio for Content-Based Retrieval

《Multimedia, IEEE Transactions on》2009,11(4):658-669

Automatic detection of (semantically) meaningful audio segments, or audio scenes, is an important step in high-level semantic inference from general audio signals, and can benefit various content-based applications involving both audio and multimodal (multimedia) data sets. Motivated by the known limitations of traditional low-level feature-based approaches, we propose in this paper a novel approach to discover audio scenes, based on an analysis of audio elements and key audio elements, which can be seen as equivalents to the words and keywords in a text document, respectively. In the proposed approach, an audio track is seen as a sequence of audio elements, and the presence of an audio scene boundary at a given time stamp is checked based on pair-wise measuring the semantic affinity between different parts of the analyzed audio stream surrounding that time stamp. Our proposed model for semantic affinity exploits the proven concepts from text document analysis, and is introduced here as a function of the distance between the audio parts considered, and the co-occurrence statistics and the importance weights of the audio elements contained therein. Experimental evaluation performed on a representative data set consisting of 5 h of diverse audio data streams indicated that the proposed approach is more effective than the traditional low-level feature-based approaches in solving the posed audio scene segmentation problem.