Similar literature
Found 20 similar documents; search took 625 ms
1.
Online adaptive optimal control methods based on reinforcement learning algorithms typically need to check for the persistence of excitation condition, which must be known a priori for the algorithm to converge. However, this condition is often infeasible to implement or monitor online. This paper proposes an online concurrent reinforcement learning algorithm (CRLA) based on neural networks (NNs) to solve the H∞ control problem of partially unknown continuous-time systems, in which the need for the persistence of excitation condition is relaxed by using the idea of concurrent learning. First, the H∞ control problem is formulated as a two-player zero-sum game, and then the online CRLA is employed to approximate the optimal value and the Nash equilibrium of the game. The proposed algorithm is implemented on an actor–critic–disturbance NN approximator structure to obtain the solution of the Hamilton–Jacobi–Isaacs equation online, forward in time. During the implementation of the algorithm, the control input, acting as one player, attempts to apply the optimal control, while the other player, the disturbance, tries to produce the worst-case disturbance. Novel update laws are derived for adaptation of the critic and actor NN weights. The stability of the closed-loop system is guaranteed using the Lyapunov technique, and convergence to the Nash solution of the game is obtained. Simulation results show the effectiveness of the proposed method. Copyright © 2014 John Wiley & Sons, Ltd.

2.
This paper presents an online learning algorithm based on integral reinforcement learning (IRL) to design an output-feedback (OPFB) H∞ tracking controller for partially unknown linear continuous-time systems. Although reinforcement learning techniques have been successfully applied to find optimal state-feedback controllers, in most control applications it is not practical to measure the full system state. It is therefore desirable to design OPFB controllers. To this end, a general bounded L2-gain tracking problem with a discounted performance function is used for the OPFB H∞ tracking. A tracking game algebraic Riccati equation is then developed that gives a Nash equilibrium solution to the associated min-max optimization problem. An IRL algorithm is then developed to solve the game algebraic Riccati equation online without requiring complete knowledge of the system dynamics. In each iteration, the proposed IRL-based algorithm solves an IRL Bellman equation online in real time to evaluate an OPFB policy, and then updates the OPFB gain using the information given by the evaluated policy. An adaptive observer is used to provide knowledge of the full state for the IRL Bellman equation during learning. However, the observer is not needed after the learning process is finished. A simulation example is provided to verify the convergence of the proposed algorithm to a suboptimal OPFB solution and the performance of the proposed method.

3.
This paper is devoted to the problem of robust H∞ filtering for a class of uncertain switched neutral systems subject to stochastic disturbance and time-varying delay. Attention is focused on the design of a full-order switched filter such that the filtering error system is robustly mean-square exponentially stable with a prescribed weighted H∞ performance. On the basis of the average dwell time approach and the piecewise Lyapunov function technique, sufficient conditions for the solvability of this problem are obtained in terms of linear matrix inequalities. Then, by solving the corresponding linear matrix inequalities, the desired full-order switched filter is derived for all admissible uncertainties, time-varying delays, and stochastic disturbances. A numerical example is given to illustrate the effectiveness of the proposed method. Copyright © 2012 John Wiley & Sons, Ltd.
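
The average dwell time condition mentioned in the abstract above can be illustrated with a small check. This is only a sketch under the commonly used definition, in which the number of switches N(t, T) on any interval must satisfy N(t, T) <= N0 + (T - t) / tau_a; the function name and the parameters `tau_a` and `n0` are illustrative assumptions and are not taken from the paper.

```python
def satisfies_average_dwell_time(switch_times, tau_a, n0):
    """Check N(t, T) <= n0 + (T - t) / tau_a for a finite switching sequence.

    Assumed definition (not from the abstract): tau_a is the average dwell
    time and n0 the chatter bound. The tightest windows are those that just
    enclose a run of consecutive switches, so only those are tested.
    """
    times = sorted(switch_times)
    for i, t_start in enumerate(times):
        for j in range(i, len(times)):
            t_end = times[j]
            count = j - i + 1  # switches enclosed by the window [t_start, t_end]
            if count > n0 + (t_end - t_start) / tau_a:
                return False
    return True
```

For example, switches at times 1, 3, 5, 7 pass the check with `tau_a=2` and `n0=1`, whereas three switches packed into 0.2 time units do not.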

4.
5.
This paper considers the problem of robust delay-dependent L2–L∞ filtering for a class of Takagi–Sugeno fuzzy systems with time-varying delays. The purpose is to design a fuzzy filter such that both the robust stability and a prescribed L2–L∞ performance level of the filtering error system are guaranteed. A delay-dependent sufficient condition for the solvability of the problem is obtained, and a linear matrix inequality (LMI) approach is developed. A desired filter can be constructed by solving a set of LMIs. Copyright © 2009 John Wiley & Sons, Ltd.

6.
We discuss the use of [m,m]-Padé approximants in the implementation of repetitive learning controls solving the output tracking problem (via output error feedback) in the presence of uncertain periodic reference and/or disturbance signals with known common period. The aim is twofold: to address the stability issues concerning those approximants when a linear learning controller is to be obtained (one designed through a detailed stability proof involving a suitable Lyapunov-like function, and described by a transfer function whose poles all have negative real part), and to evaluate the corresponding closed-loop performance. Robustness (for instance, with respect to additive disturbance noise due to unmodeled sensor dynamics) is consequently achieved, with the output tracking errors improving as the approximation order m increases. Even though the case of any relative degree may be explicitly addressed, in this paper, for the sake of clarity, we restrict our attention to the learning problem for the class of single-input, single-output, minimum-phase, time-invariant systems with known relative degree ρ = 2, uncertain parameters, and uncertain output-dependent nonlinearities. Numerical simulation results illustrate the theoretical derivations. Copyright © 2014 John Wiley & Sons, Ltd.

7.
This paper is devoted to designing iterative learning control (ILC) for multiple-input multiple-output discrete-time systems that are subject to random disturbances varying from iteration to iteration. Using the super-vector approach to ILC, statistical expressions are presented for both the expectation and the variance of the tracking error, and time-domain conditions are developed to ensure their asymptotic stability and monotonic convergence. It is shown that the time-domain conditions can be tied together with an H∞-based condition in the frequency domain by considering the properties of block Toeplitz matrices. This makes it possible to apply the linear matrix inequality technique to describe the convergence conditions and to obtain formulas for the control law design. Furthermore, the H∞-based approach is shown to be applicable to ILC design regardless of the system relative degree, and it can also be used to address issues of model uncertainty. For a class of systems with a relative degree of one, simulation tests are provided to illustrate the effectiveness of the H∞-based approach to robust ILC design. Copyright © 2011 John Wiley & Sons, Ltd.
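
To make the super-vector (lifted) view of ILC in the abstract above concrete, here is a minimal sketch. It makes simplifying assumptions that are not the paper's setting: a single-input single-output, disturbance-free system given by hypothetical Markov parameters, and a simple P-type update u_{k+1} = u_k + gain * e_k rather than the paper's design. In lifted form the trial-to-trial error obeys e_{k+1} = (I - gain * G) e_k, where G is a lower-triangular Toeplitz matrix.

```python
def lifted_matrix(markov):
    """Lower-triangular Toeplitz matrix G built from Markov parameters."""
    n = len(markov)
    return [[markov[i - j] if i >= j else 0.0 for j in range(n)] for i in range(n)]


def matvec(matrix, vector):
    """Plain matrix-vector product on nested lists."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def run_ilc(markov, reference, gain, iterations):
    """Run P-type ILC and return the tracking-error 2-norm of each trial.

    Assumes len(markov) == len(reference) (one entry per time step).
    """
    g = lifted_matrix(markov)
    u = [0.0] * len(reference)
    norms = []
    for _ in range(iterations):
        y = matvec(g, u)                              # output of this trial
        e = [r - yi for r, yi in zip(reference, y)]   # tracking error
        norms.append(sum(x * x for x in e) ** 0.5)
        u = [ui + gain * ei for ui, ei in zip(u, e)]  # learning update
    return norms
```

With Markov parameters `[1.0, 0.5, 0.25]` and `gain=0.5`, the matrix I - gain * G has induced 1-norm and infinity-norm equal to 0.875, so its 2-norm is at most 0.875 and the error norm shrinks monotonically from trial to trial.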

8.
This paper presents a composite learning fuzzy control to synchronize two different uncertain incommensurate fractional-order time-varying delayed chaotic systems with unknown external disturbances and mismatched parametric uncertainties via the Takagi-Sugeno fuzzy method. An adaptive controller together with fractional-order composite learning laws is designed based on both a parallel distributed compensation technique and a fractional Lyapunov criterion. The boundedness of all variables in the closed-loop system and the Mittag-Leffler stability of the tracking error can be guaranteed. T-S fuzzy systems are used to handle unknown nonlinear functions. The distinctive features of the proposed approach are the following: (1) a supervisory control law is designed to compensate for the lumped disturbances; (2) both the prediction error and the tracking error are used to estimate the unknown fuzzy system parameters; (3) parameter convergence can be ensured by an interval excitation condition. Finally, the feasibility of the proposed control strategy is demonstrated through an illustrative example.

9.
Hierarchical multi-label classification (HMC) is a variant of classification where instances may belong to multiple classes at the same time and these classes are organized in a hierarchy. Gene function prediction is a complicated HMC problem with a large number of classes and usually strongly imbalanced class distributions. This paper proposes an improved HMC method based on over-sampling and a hierarchy constraint for solving the gene function prediction problem. The HMC task is transformed into a set of binary support vector machine (SVM) classification tasks. Then, two measures are implemented to enhance the HMC performance by introducing the hierarchy constraint into the learning procedures. First, for imbalanced classes, a hierarchical synthetic minority over-sampling technique (SMOTE) is proposed as over-sampling preprocessing to improve the SVM learning performance. Second, an improved True Path Rule (TPR) ensemble approach is introduced to combine the results of the binary probabilistic SVM classifications. It can improve the classification results and guarantee the hierarchy constraint of the classes. Experimental results on four benchmark FunCat Yeast datasets show that the proposed method significantly outperforms the basic TPR method and the Flat ensemble method. © 2012 Institute of Electrical Engineers of Japan. Published by John Wiley & Sons, Inc.
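
The over-sampling step named in the abstract above builds on SMOTE, whose core idea is to interpolate between a minority sample and one of its nearest minority neighbours. The sketch below shows only that plain interpolation step; the hierarchical variant proposed in the paper is not described in the abstract and is not reproduced here. The function name and the parameters `k` and `seed` are illustrative assumptions.

```python
import random


def smote(minority, n_synthetic, k=2, seed=0):
    """Generate synthetic minority samples by neighbour interpolation.

    minority: list of feature vectors (lists of floats), at least two of them.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        x = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to x.
        others = [p for p in minority if p is not x]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(x, neighbour)])
    return synthetic
```

Every synthetic point lies on the segment between two existing minority samples, so the new samples stay inside the region already occupied by the minority class.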

10.
This paper focuses on solving the adaptive optimal tracking control problem for discrete-time linear systems with unknown system dynamics using output feedback. A Q-learning-based optimal adaptive control scheme is presented to learn the feedback and feedforward control parameters of the optimal tracking control law. The optimal feedback parameters are learned using the proposed output feedback Q-learning Bellman equation, whereas the optimal feedforward control parameters are estimated using an adaptive algorithm that guarantees convergence of the tracking error to zero. The proposed method has the advantage that it is not affected by the exploration noise bias problem and does not require a discounting factor, removing two bottlenecks of past works in achieving a stability guarantee and optimal asymptotic tracking. Furthermore, the proposed scheme employs the experience replay technique for data-driven learning, which is data efficient and relaxes the persistence of excitation requirement in learning the feedback control parameters. It is shown that the learned feedback control parameters converge to the optimal solution of the Riccati equation and the feedforward control parameters converge to the solution of the Sylvester equation. Simulation studies on two practical systems have been carried out to show the effectiveness of the proposed scheme.

11.
This paper is concerned with the problem of finite-time H∞ filtering for a class of Markovian jump systems subject to partial information on the transition probabilities. By introducing some slack matrix variables in terms of the probability identity, a less conservative bounded real lemma is derived to ensure that the filtering Markovian jump system is finite-time stable. Finally, the existence criterion of the desired filter is obtained such that the corresponding filtering error system is finite-time bounded with a guaranteed H∞ performance index. An example is given to illustrate the efficiency of the proposed method. Copyright © 2013 John Wiley & Sons, Ltd.

12.
This paper deals with the fault detection (FD) problem for a class of network-based nonlinear systems with communication constraints and random packet dropouts. The plant is described by a Takagi–Sugeno fuzzy time-delay model; it has multiple sensors, only one of which actually communicates with the FD filter at each transmission instant, and packet dropouts occur randomly. The goal is to design an FD filter such that, for all unknown inputs, control inputs, time delays, and incomplete data conditions, the estimation error between the residual and the 'fault' (or, more generally, the weighted fault) is minimized. By casting the addressed FD problem into an auxiliary H∞ filtering problem of a stochastic switched fuzzy time-delay system, a sufficient condition for the existence of the desired FD filter is established in terms of linear matrix inequalities. A numerical example is provided to illustrate the effectiveness and applicability of the proposed technique. Copyright © 2011 John Wiley & Sons, Ltd.

13.
This paper proposes a recurrent neural fuzzy controller (RNFC) approach based on a self-organizing improved particle swarm optimization (SOIPSO) algorithm for solving control problems. The proposed SOIPSO algorithm can adaptively determine the number of fuzzy rules and automatically adjust the parameters in an RNFC. The proposed learning algorithm consists of a structure learning phase and a parameter learning phase. Structure learning adopts several subswarms to constitute the adjustable variables in fuzzy systems, and an elite-based structure strategy determines the suitable number of fuzzy rules. This paper proposes an improved particle swarm optimization technique, which consists of the modified evolutionary direction operator (MEDO) and traditional PSO techniques. The proposed MEDO method uses the EDO and a migration operation to improve the ability to search for a global solution. Finally, the proposed RNFC approach based on the SOIPSO learning algorithm (RNFC–SOIPSO) was adopted to control a magnetic levitation system. Experimental results demonstrated that the proposed RNFC–SOIPSO model outperforms other models. Copyright © 2014 John Wiley & Sons, Ltd.
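
For orientation, the "traditional PSO" component referred to in the abstract above can be sketched as follows. This is a generic inertia-weight particle swarm optimizer, not the SOIPSO or MEDO procedure of the paper, whose details the abstract does not give. The function name and all parameter values (`w`, `c1`, `c2`, `bound`, swarm size) are illustrative assumptions.

```python
import random


def pso_minimize(f, dim, n_particles=20, iterations=100,
                 w=0.7, c1=1.5, c2=1.5, bound=5.0, seed=0):
    """Minimize f over R^dim with a basic inertia-weight particle swarm."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-bound, bound) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]          # best position seen by each particle
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # best position of the swarm
    for _ in range(iterations):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val
```

Calling `pso_minimize(lambda x: sum(v * v for v in x), dim=2)` searches for the minimum of a two-dimensional sphere function; in a controller-tuning setting, `f` would instead score a candidate parameter vector.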

14.
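A new short-term load forecasting method using additive fuzzy systems   Total citations: 4, self-citations: 2, citations by others: 4
Based on additive fuzzy system theory, this paper proposes a new load forecasting method. A training scheme that combines clustering with supervised learning improves the function approximation capability of the system. Simulation results show that the system learns quickly and forecasts accurately, and that it achieves quite satisfactory results in short-term load forecasting.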

15.
16.
This paper presents a neural-network-based finite-time H∞ control design technique for a class of extended Markov jump nonlinear systems. The considered stochastic character is described by a Markov process, but with only partially known transition jump rates. Sufficient conditions for the existence of the desired controller are derived in terms of linear matrix inequalities such that the closed-loop system trajectory stays within a prescribed bound in a fixed time interval and has a guaranteed H∞ noise attenuation performance for all admissible uncertainties and approximation errors of the neural networks. A numerical example is used to illustrate the effectiveness of the developed theoretical results. Copyright © 2009 John Wiley & Sons, Ltd.

17.
This paper presents a neuro-fuzzy network (NFN) in which all parameters can be tuned simultaneously using genetic algorithms (GAs). The approach combines the merits of fuzzy logic theory, neural networks, and GAs. The proposed NFN does not require a priori knowledge about the system and eliminates the need for complicated design steps such as manual tuning of input–output membership functions and selection of the fuzzy rule base. Although only conventional GAs have been used, the convergence results are very encouraging. A well-known numerical example from the literature is used to evaluate and compare the performance of the network with other equalizing approaches. Simulation results show that the proposed neuro-fuzzy controller, all parameters of which have been tuned simultaneously using GAs, offers advantages over existing equalizers and has improved performance. From the perspective of application and implementation, this paper is very interesting as it provides a new method for performing blind equalization. The main contribution of this paper is the use of learning algorithms to train a feed-forward neural network for M-ary QAM and PSK signals. This paper also provides a platform for researchers in the area for further development. Copyright © 2008 John Wiley & Sons, Ltd.

18.
We propose a method to improve the performance of R-learning, a reinforcement learning algorithm, by using multiple state-action value tables. Unlike Q-learning or Sarsa, R-learning learns a policy that maximizes undiscounted rewards. Multiple state-action value tables induce substantial exploration as needed and make R-learning work well. The efficiency of the proposed method is verified through experiments in a simulated environment. © 2007 Wiley Periodicals, Inc. Electr Eng Jpn, 159(3): 34–47, 2007; Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/eej.20473
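
As background for the abstract above, the following sketch shows one common textbook formulation of tabular R-learning with a single value table. It is not the multiple-table method proposed in the paper, which the abstract does not specify. The algorithm tracks an average-reward estimate `rho` alongside relative action values. The function name, the `step` callback interface, and all parameter values are illustrative assumptions.

```python
import random


def r_learning(step, n_states, n_actions, steps=20000,
               alpha=0.1, beta=0.01, epsilon=0.1, seed=0):
    """Tabular R-learning (average-reward RL) with epsilon-greedy exploration.

    step(state, action) must return (next_state, reward).
    Returns the action-value table and the average-reward estimate rho.
    """
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    rho = 0.0
    s = 0
    for _ in range(steps):
        greedy = max(range(n_actions), key=lambda a: q[s][a])
        a = rng.randrange(n_actions) if rng.random() < epsilon else greedy
        s_next, r = step(s, a)
        # Undiscounted temporal-difference error relative to the average reward.
        delta = r - rho + max(q[s_next]) - q[s][a]
        q[s][a] += alpha * delta
        if a == greedy:
            # The average-reward estimate is adjusted only on greedy actions.
            rho += beta * delta
        s = s_next
    return q, rho
```

In a toy two-state problem where the action chooses the next state and only staying in state 1 pays a reward of 1, the learned greedy policy moves to state 1 and stays there.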

19.
In this study, we propose an extremum-seeking approach for the approximation of optimal control problems for a class of unknown nonlinear dynamical systems. The technique combines a phasor extremum-seeking controller with a reinforcement learning strategy. The learning approach is used to estimate the value function of an optimal control problem of interest. The phasor extremum-seeking controller implements the approximate optimal controller. The approach is shown to provide reasonable approximations of optimal control problems without the need for a parameterization of the nonlinear system's dynamics. A simulation example is provided to demonstrate the effectiveness of the technique.

20.
In this paper, a robust exponential l2–l∞ filtering problem is addressed for discrete-time switched systems with polytopic uncertainties. The purpose of robust exponential l2–l∞ filtering is to design a filter such that the resulting filtering error system is robustly exponentially stable with a decay rate and a prescribed exponential l2–l∞ performance index. The robust exponential l2–l∞ filtering problem is solved via an average dwell time approach. Sufficient conditions in terms of strict LMIs are derived for checking the robust exponential stability of a filter. An explicit expression for the desired robust exponential filter is also given. Finally, a numerical example is provided to demonstrate the potential and effectiveness of the proposed method. Copyright © 2011 John Wiley & Sons, Ltd.

