
1.

Model-free control based on reinforcement learning for a wastewater treatment problem

S. Syafiie F. Tadeo E. Martinez T. Alvarez 《Applied Soft Computing》2011,11(1):73-82

This article presents a proposal, based on the model-free learning control (MFLC) approach, for the control of the advanced oxidation process in wastewater plants. This is prompted by the fact that many organic pollutants in industrial wastewaters are resistant to conventional biological treatments, and the fact that advanced oxidation processes, controlled with learning controllers measuring the oxidation–reduction potential (ORP), give a cost-effective solution. The proposed automation strategy, denoted MFLC-MSA, is based on the integration of reinforcement learning with multiple step actions. This enables the most adequate control strategy to be learned directly from the process response to selected control inputs. Thus, the proposed methodology is well suited to oxidation processes in wastewater treatment plants, where the development of an adequate model for control design is usually too costly. The proposed algorithm has been tested in a lab pilot plant, where phenolic wastewater is oxidized to carboxylic acids and carbon dioxide. The experimental results show that the proposed MFLC-MSA strategy can achieve good performance, guaranteeing on-specification discharge at maximum degradation rate using readily available measurements such as pH and ORP, which serve as inferential measurements of oxidation kinetics and peroxide consumption, respectively.

2.

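CMAC neural network nonlinear sliding-mode fault-tolerant control based on balanced learning Total citations: 2, self-citations: 1, citations by others: 1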

Zhu Daqi Kong Min 《Control Theory & Applications》2008,25(1):81-86

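Using an improved credit-assignment CMAC (cerebellar model articulation controller) neural network for online fault diagnosis, and introducing variable-structure sliding mode control into the fault-tolerant controller design, an active fault-tolerant control method for dynamic nonlinear systems is proposed. In the conventional CMAC learning algorithm, the error is distributed equally among all activated memory cells, regardless of how credible the data (weights) stored in each cell are. In the improved CMAC, the number of times an activated cell has previously been trained is used as its credibility, and the error correction is proportional to the -p-th power of that count, which improves the online learning speed and accuracy of the network. On this basis, a sliding mode control algorithm is used to reconfigure the fault-tolerant control law online, integrating online fault diagnosis and fault-tolerant control for dynamic nonlinear systems. The stability of the system is analyzed, and simulation results demonstrate the effectiveness of the improved fault learning algorithm and the fault-tolerant control.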
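The credit-assignment idea in this abstract (error correction weighted by the -p-th power of each activated cell's prior learning count) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name, the `+ 1` offset that avoids raising zero to a negative power, and the normalization of the credits so that they sum to one are all assumptions.

```python
import numpy as np

def credit_assigned_update(weights, counts, active, target, lr=1.0, p=1.0):
    """One CMAC training step with credit assignment (illustrative sketch).

    weights : float array of memory-cell weights
    counts  : float array, how many times each cell has been trained so far
    active  : array of distinct indices of the cells activated by the input
    """
    output = weights[active].sum()          # CMAC output = sum of active weights
    error = target - output
    # Credit is proportional to (count + 1)^(-p): cells trained more often
    # are treated as more reliable and therefore receive a smaller correction.
    # The "+ 1" offset is an assumption to keep untrained cells well defined.
    credit = (counts[active] + 1.0) ** (-p)
    credit /= credit.sum()                  # normalize so corrections sum to lr * error
    weights[active] += lr * error * credit
    counts[active] += 1
    return error
```

With `p = 0` the credits become equal and the step reduces to the conventional CMAC rule in which the error is shared equally among the activated cells.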

3.

Adaptive neural control for a class of stochastic nonlinear time‐delay systems with unknown dead zone using dynamic surface technique


Zifu Li Tieshan Li Gang Feng 《International Journal of Robust and Nonlinear Control》2016,26(4):759-781

This paper investigates the problem of adaptive control for a class of stochastic nonlinear time‐delay systems with unknown dead zone. A neural network‐based adaptive control scheme is developed by using the dynamic surface control (DSC) technique and the minimal learning parameters algorithm. The dynamic surface control technique, which can avoid the problem of ‘explosion of complexity’ inherent in the conventional backstepping design procedure, is first extended to the stochastic nonlinear time‐delay system with unknown dead zone. The unknown nonlinearities are approximated by the function approximation technique using the radial basis function neural network. To reduce the number of parameters that are updated online for each subsystem while approximating the unknown functions, the minimal learning parameters algorithm is then introduced. Also, the adverse effects of the unknown time delay are removed by using appropriate Lyapunov–Krasovskii functionals. In addition, the proposed control scheme is systematically derived without requiring any information on the boundedness of the dead zone parameters and avoids the possible controller singularity problem in approximation‐based adaptive control schemes with the feedback linearization technique. It is shown that the proposed control approach can guarantee that all the signals of the closed‐loop system are bounded in probability, and the tracking errors can be made arbitrarily small by choosing suitable design parameters. Finally, a simulation example is provided to illustrate the performance of the proposed control scheme. Copyright © 2015 John Wiley & Sons, Ltd.

4.

Process control of pH neutralization based on adaptive algorithm of universal learning network Total citations: 3, self-citations: 0, citations by others: 3

Min Bing Wei 《Journal of Process Control》2006,16(1):1-7

This paper presents an adaptive algorithm for the universal learning network (ULN) and its application to identifying the pure time delay of a plant model. The universal learning network can be used in model predictive control for stabilizing a class of nonlinear systems with long time delay. Based on a ULN model with a single-neuron controller, the control architectures are introduced and applied to a pH neutralization process. Simulation results demonstrate the applicability and effectiveness of the ULN model. The general architecture and adaptive learning algorithm give the ULN greater representational ability to model and control nonlinear black-box systems with long time delay.

5.

An on-line learning neural controller for helicopters performing highly nonlinear maneuvers Total citations: 1, self-citations: 0, citations by others: 1

S. Suresh N. Sundararajan 《Applied Soft Computing》2012,12(1):360-371

This paper presents an on-line learning adaptive neural control scheme for helicopters performing highly nonlinear maneuvers. The online learning adaptive neural controller compensates for the nonlinearities in the system and the uncertainties in the modeling of the dynamics to provide the desired performance. The control strategy uses a neural controller aiding an existing conventional controller. The neural controller is based on an online learning dynamic radial basis function network, which uses a Lyapunov-based on-line parameter update rule integrated with neuron growth and pruning criteria. The online learning dynamic radial basis function network does not require a priori training, and it develops a compact network for implementation. The proposed adaptive law provides the necessary global stability and better tracking performance. Simulation studies have been carried out using a nonlinear (desktop) simulation model similar to that of a BO105 helicopter. The performance of the proposed adaptive controller clearly shows that it is very effective when the helicopter is performing highly nonlinear maneuvers. Finally, the robustness of the controller has been evaluated using the attitude quickness parameters (handling quality index) at different speeds and flight conditions. The results indicate that the proposed online learning neural controller adapts faster and provides the necessary tracking performance for the helicopter executing highly nonlinear maneuvers.

6.

A new control method of nonlinear systems based on impulse responses of universal learning networks

Hirasawa K. Hu J. Murata J. Jin C. 《IEEE transactions on systems, man, and cybernetics. Part B, Cybernetics》2001,31(3):362-372

A new control method for nonlinear dynamic systems is proposed based on the impulse responses of universal learning networks (ULNs). ULNs form a superset of neural networks. They consist of a number of interconnected nodes, where the nodes may contain any continuously differentiable nonlinear functions and each pair of nodes can be connected by multiple branches with arbitrary time delays. A generalized learning algorithm is derived for the ULNs, in which both the first-order derivatives (gradients) and the higher-order derivatives are incorporated. One of the distinguishing features of the proposed control method is that the impulse response of the system is considered as an extended part of the criterion function, and it can be calculated by using the higher-order derivatives of ULNs. By using the impulse response as the criterion function, nonlinear dynamics with not only quick response but also quick damping and small steady-state error can be obtained more easily than with conventional nonlinear control systems that use quadratic-form criterion functions of state and control variables.

7.

Reinforcement learning in continuous time and space Total citations: 2, self-citations: 0, citations by others: 2

Doya K 《Neural computation》2000,12(1):219-245

This article presents a reinforcement learning framework for continuous-time dynamical systems without a priori discretization of time, state, and action. Based on the Hamilton-Jacobi-Bellman (HJB) equation for infinite-horizon, discounted reward problems, we derive algorithms for estimating value functions and improving policies with the use of function approximators. The process of value function estimation is formulated as the minimization of a continuous-time form of the temporal difference (TD) error. Update methods based on backward Euler approximation and exponential eligibility traces are derived, and their correspondences with the conventional residual gradient, TD(0), and TD(lambda) algorithms are shown. For policy improvement, two methods, a continuous actor-critic method and a value-gradient-based greedy policy, are formulated. As a special case of the latter, a nonlinear feedback control law using the value gradient and the model of the input gain is derived. Advantage updating, a model-free algorithm derived previously, is also formulated in the HJB-based framework. The performance of the proposed algorithms is first tested in a nonlinear control task of swinging a pendulum up with limited torque. It is shown in the simulations that (1) the task is accomplished by the continuous actor-critic method in a number of trials several times fewer than by the conventional discrete actor-critic method; (2) among the continuous policy update methods, the value-gradient-based policy with a known or learned dynamic model performs several times better than the actor-critic method; and (3) a value function update using exponential eligibility traces is more efficient and stable than that based on Euler approximation. The algorithms are then tested in a higher-dimensional task: cart-pole swing-up. This task is accomplished in several hundred trials using the value-gradient-based policy with a learned dynamic model.
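The continuous-time TD error mentioned in this abstract can be written out as a short worked equation. The notation is an assumption made here for illustration (it is not given in the abstract): $\tau$ is the time constant of the reward discounting, $r(t)$ the instantaneous reward, $V(t)$ the value estimate along the trajectory, and $\Delta t$ a small time step.

```latex
% Discounted infinite-horizon value and its continuous-time TD error
V(x(t)) = \int_t^{\infty} e^{-\frac{s-t}{\tau}}\, r\bigl(x(s), u(s)\bigr)\, ds,
\qquad
\delta(t) = r(t) - \frac{1}{\tau} V(t) + \dot{V}(t)

% Backward Euler approximation of the time derivative
\dot{V}(t) \approx \frac{V(t) - V(t - \Delta t)}{\Delta t}
\quad\Longrightarrow\quad
\delta(t) \approx r(t) + \frac{1}{\Delta t}
  \left[ \left(1 - \frac{\Delta t}{\tau}\right) V(t) - V(t - \Delta t) \right]
```

Read this way, the factor $1 - \Delta t/\tau$ plays the role of the discount factor in the familiar discrete-time TD error, which is one way to see the correspondence with TD(0) that the abstract refers to.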

8.

Online learning control by association and reinforcement Total citations: 4, self-citations: 0, citations by others: 4

Si J. Yu-Tsung Wang 《Neural Networks, IEEE Transactions on》2001,12(2):264-276

This paper focuses on a systematic treatment for developing a generic online learning control system based on the fundamental principle of reinforcement learning or, more specifically, neural dynamic programming. This online learning system improves its performance over time in two aspects: 1) it learns from its own mistakes through the reinforcement signal from the external environment and tries to reinforce its action to improve future performance; and 2) system states associated with positive reinforcement are memorized through a network learning process, so that in the future similar states will be more positively associated with a control action leading to a positive reinforcement. A successful candidate for online learning control design is introduced. Real-time learning algorithms are derived for the individual components of the learning system. Some analytical insight is provided to give guidelines on the learning process taking place in each module of the online learning control system.

9.

Iterative learning control applied to batch processes: An overview

《Control Engineering Practice》2007,15(10):1306-1318

With the recent emphasis on batch processing by emerging industries such as microelectronics and biotechnology, interest in batch process control has been renewed. This paper gives an overview of the iterative learning control (ILC) technique, which can be used to improve tracking control performance in batch processes. The fundamental concepts and a review of the various ILC algorithms are presented, with a particular focus on a model-based algorithm called Q-ILC and an application involving a rapid thermal processing (RTP) system. The study indicates that one can solve a seemingly very difficult multivariable nonlinear tracking problem with relative ease by intelligently combining the ILC technique with basic process insights and standard system identification techniques. Some related techniques in the literature are brought forth with the hope of unifying them. We also suggest some remaining challenges.
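The basic ILC idea, reusing the tracking error of one batch to correct the input of the next, can be sketched in Python as follows. This is a simple P-type update, not the model-based Q-ILC algorithm the overview focuses on; the first-order plant in `run_batch`, the learning gain, and all names are hypothetical choices made for illustration.

```python
import numpy as np

def run_batch(u, a=0.5, b=1.0):
    """Simulate one batch of a hypothetical plant y(t+1) = a*y(t) + b*u(t), y(0) = 0.

    Returns y(1..N), so that element t is the output driven by u[t].
    """
    y = np.zeros(len(u) + 1)
    for t in range(len(u)):
        y[t + 1] = a * y[t] + b * u[t]
    return y[1:]

def ilc(reference, n_batches=30, gain=0.8):
    """P-type ILC: u_{k+1}(t) = u_k(t) + gain * e_k(t+1), repeated over batches."""
    u = np.zeros_like(reference)
    errors = []
    for _ in range(n_batches):
        e = reference - run_batch(u)        # tracking error of the current batch
        errors.append(float(np.max(np.abs(e))))
        u = u + gain * e                    # learn from this batch for the next one
    return u, errors
```

For this particular plant and gain the batch-to-batch error shrinks because `|1 - gain * b| < 1`; a different plant would need its own gain choice, which is where model-based designs such as Q-ILC come in.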

10.

Data-driven optimal iterative learning control based on an iterative extended state observer

Hui Yu Chi Ronghu 《Control Theory & Applications》2018,35(11):1672-1679

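For a class of discrete-time, nonlinear, nonaffine uncertain systems that operate repetitively over a finite time interval and are subject to disturbances, this paper proposes a data-driven optimal iterative learning control method based on an iterative extended state observer. First, an improved iterative dynamic linearization method is proposed that linearizes the controlled system into a form affine in the control input and lumps the uncertainties into a single nonlinear term. Then, an iterative extended state observer is designed to estimate the nonlinear uncertain term, and the estimate is used to compensate for the disturbance. Finally, a performance index function is designed, and a parameter iterative update law and an optimal learning control law are derived by optimization. The bounded convergence of the tracking error is proved by mathematical analysis, and simulation results verify the effectiveness of the method. The proposed iterative dynamic linearization method greatly reduces the dynamic complexity of the control gain after linearization, making it easier to estimate. The proposed iterative extended state observer learns across repetitions and can effectively estimate non-repetitive disturbances. In addition, the controller design and analysis are data-driven: apart from the input and output data of the controlled system, no other model information is required.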

11.

Event-triggered learning control for strict-feedback systems

Wang Min Hu Rui Xin Xuegang Shi Haotian 《Control Theory & Applications》2021,38(10):1577-1586

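For a class of strict-feedback nonlinear systems, this paper proposes an event-triggered control scheme based on deterministic learning. First, an adaptive neural network controller is designed at the local control test end, and knowledge of the unknown system dynamics is acquired and stored during the control process. Then, based on the constant weights, a novel event-triggered controller and event-triggering condition are designed. Combining Lyapunov stability analysis with the theory of nonlinear impulsive dynamic systems, it is verified that the proposed scheme guarantees that the tracking error converges to a small neighborhood of zero and that all closed-loop signals are uniformly ultimately bounded. In addition, the proposed scheme replaces estimated weights with constant weights, which makes it easy to implement, gives good transient performance, and uses few network resources. Finally, comparative simulation results demonstrate the effectiveness of the proposed scheme.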

12.

Neural critic learning toward robust dynamic stabilization

Ding Wang Xin Xu Mingming Zhao 《International Journal of Robust and Nonlinear Control》2020,30(5):2020-2032

In this article, we focus on developing a neural‐network‐based critic learning strategy toward robust dynamic stabilization for a class of uncertain nonlinear systems. A general type of uncertainty, present both in the internal dynamics and in the input matrix, is considered. An auxiliary system with an actual action and an auxiliary signal is constructed after dynamics decomposition and combination for the original plant. The validity of transforming the control problem from robust stabilization to optimal feedback design is also established theoretically. After that, the adaptive critic learning method based on a neural network is established to derive the approximate optimal solution of the transformed control problem. The critic weight can be initialized to a zero vector, which facilitates the learning process. Numerical simulation is finally presented to illustrate the effectiveness of the critic learning approach for neural robust stabilization.

13.

Intelligent control: from learning control to parallel control

Wang Feiyue Wei Qinglai 《Control Theory & Applications》2018,35(7):939-948

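In the 1960s, learning control opened a new path for exploring the control of complex systems, and intelligent control based on artificial intelligence techniques emerged with it. Taking intelligent control as the main thread, this paper describes its development from learning control to parallel control. The paper first introduces the basic ideas of learning control and describes the architecture design and operating mechanism of intelligent machines. With advances in information technology, data-based computational intelligence methods appeared. Accordingly, the paper then briefly reviews learning control methods based on computational intelligence and, taking adaptive dynamic programming as the entry point, analyzes how self-learning optimization problems for nonlinear dynamic systems are solved. Finally, for the control of complex systems in which engineering complexity and social complexity are coupled, the paper outlines the parallel-control-based approach to learning and optimization and analyzes its advantages in solving optimal control problems for complex systems. Intelligent control has evolved from learning control through computational intelligence control to parallel control, and parallel control can be seen as an effective way to achieve knowledge automation for complex systems.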

14.

Adaptive neural control and learning of affine nonlinear systems

Yuxiang Wu Cong Wang 《Neural computing & applications》2014,25(2):309-319

This paper presents deterministic learning from adaptive neural network control of affine nonlinear systems with completely unknown system dynamics. Thanks to the learning capability of radial basis function (RBF) neural networks (NNs), a stable adaptive NN controller is designed for the unknown affine nonlinear systems. It is rigorously shown that, with the designed adaptive NN controller, learning of the unknown closed-loop system dynamics can be achieved during the stable control process, because a partial persistent excitation condition on some internal signals in the closed-loop system is satisfied. Subsequently, a neural learning controller using the knowledge obtained from deterministic learning is constructed to achieve closed-loop stability and improve control performance. Numerical simulation is provided to show the effectiveness of the proposed control scheme.

15.

Robust learning control algorithm with error-trajectory tracking for a class of nonlinear systems

Yan Qiuzhen Sun Mingxuan 《Control Theory & Applications》2013,30(1):23-30

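For a class of nonlinear systems with nonparametric uncertainties, a robust iterative learning control algorithm is proposed. The algorithm relaxes the initial positioning condition of conventional iterative learning control, so the initial value at each iteration may be arbitrary. An error-trajectory tracking controller is designed using a Lyapunov-like method, and the uncertainties are estimated and compensated through a robust amplitude-limited learning mechanism, so that the error tracks a given desired error trajectory exactly over the whole operation interval. The desired error trajectory is set according to the error value at the start of each iteration. By exploiting the decaying behavior of the desired error trajectory, the system error converges to a neighborhood of the origin after a preset time, and the radius of the neighborhood can be set arbitrarily as required. Theoretical analysis and simulation results demonstrate the effectiveness of the control method.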

16.

Dynamic prediction model based on process neural networks and its application

Xu Shaohua Wang Bing He Xingui 《Information and Control》2007,36(6):0-753

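For process prediction and forecasting problems in dynamic systems, a dynamic prediction method based on process neural networks is proposed. Both the inputs and outputs of a process neural network may be time-varying functions, and its spatio-temporal aggregation operation and activation can simultaneously reflect the spatial aggregation of the time-varying input signals and the stage-wise temporal accumulation effect during the input process. A dynamic prediction model based on process neural networks can perform nonlinear system identification and process prediction at the same time, and its mechanism is well suited to dynamic prediction and forecasting problems. A learning algorithm based on function basis expansion and gradient descent is given, and the effectiveness of the model and algorithm is verified using electric power load forecasting as an example.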

17.

Online genetic-ANFIS temperature control for advanced microwave biodiesel reactor

W.A. Wali A.I. Al-Shamma’a Kadhim H. Hassan J.D. Cullen 《Journal of Process Control》2012,22(7):1256-1272

Biofuels, such as biodiesel, are good for the environment because they add fewer emissions to the atmosphere than petroleum-based fuel. Conventional biodiesel processes are mainly based on the use of high-power thermal heating to produce biodiesel from pure or waste feedstock such as virgin vegetable oils or waste cooking oils. The development of a novel continuous microwave biodiesel reactor for the conversion of waste oil and fats into biodiesel is reported. This process can produce biodiesel in a very short time compared with conventional methods, which require many hours or days. Real-time monitoring and control of the microwave biodiesel reactor is necessary to adjust the applied microwave power under different perturbations for process temperature control and for real-time monitoring of the full system. The paper focuses on an artificial intelligence technique to design an online genetic-ANFIS temperature controller based on LabVIEW. The designed controller was compared with an error-based adaptive controller to explore the robustness of the proposed controller in a nonlinear real-time application.

18.

Neurocontroller design via supervised and unsupervised learning Total citations: 1, self-citations: 0, citations by others: 1

Allon Guez John Selinsky 《Journal of Intelligent and Robotic Systems》1989,2(2-3):307-335

In this paper we study the role of supervised and unsupervised neural learning schemes in the adaptive control of nonlinear dynamic systems. We suggest and demonstrate that the teacher's knowledge in the supervised learning mode includes a priori plant structural knowledge, which may be employed in the design of exploratory schedules during learning, resulting in an unsupervised learning scheme. We further demonstrate that neurocontrollers may realize both linear and nonlinear control laws that are given explicitly in an automated teacher or implicitly through a human operator, and that their robustness may be superior to that of a model-based controller. Examples of both learning schemes are provided in the adaptive control of robot manipulators and a cart-pole system.

19.

Model-free adaptive control design using evolutionary-neural compensator Total citations: 4, self-citations: 0, citations by others: 4

Leandro dos Santos Coelho Marcelo Wicthoff Pessa Rodrigo Rodrigues Sumar Antonio Augusto Rodrigues Coelho 《Expert systems with applications》2010,37(1):499-508

It is well known that conventional control theories are well suited to applications where the processes can be reasonably described in advance. However, when the plant’s dynamics are hard to characterize precisely or are subject to environmental uncertainties, one may encounter difficulties in applying the conventional controller design methodologies. In this case, an alternative design is the model-free learning adaptive control (MFLAC) presented in this paper, which is based on pseudo-gradient concepts, with compensation using a radial basis function neural network and optimization using the differential evolution technique. The motivation for developing a new approach is to overcome the limitation of the conventional MFLAC design, which cannot guarantee satisfactory control performance when the nonlinear process has different gains over the operating range. The robustness of the MFLAC with the evolutionary-neural compensation scheme is compared to that of the MFLAC without compensation. Simulation results for a nonlinear chemical reactor and a nonlinear control valve are given to show the advantages of the proposed evolutionary-neural compensator for MFLAC design.
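The pseudo-gradient idea behind MFLAC can be sketched in Python as follows. This shows only a commonly used compact-form model-free adaptive control loop, without the radial basis function compensator or the differential evolution tuning described in the abstract. The parameter values, the reset rule, the function names, and the first-order plant used for the demonstration are all assumptions made for illustration, not the authors' setup.

```python
def mflac_step(phi, u_prev, du_prev, dy, y, y_ref,
               eta=1.0, mu=1.0, rho=0.5, lam=1.0, phi0=1.0, eps=1e-5):
    """One step of a compact-form pseudo-gradient controller (illustrative sketch).

    phi     : current pseudo-gradient estimate (local gain from du to dy)
    du_prev : previous input increment u(k-1) - u(k-2)
    dy      : latest output increment y(k) - y(k-1)
    """
    # Projection-type update of the pseudo-gradient from input/output increments
    phi = phi + eta * du_prev / (mu + du_prev ** 2) * (dy - phi * du_prev)
    # Assumed reset rule: fall back to the initial value if the estimate
    # vanishes or changes sign
    if abs(phi) < eps or phi * phi0 < 0:
        phi = phi0
    # Incremental control law; lam penalizes large input changes
    u = u_prev + rho * phi / (lam + phi ** 2) * (y_ref - y)
    return phi, u

def simulate(n_steps=100, y_ref=1.0):
    """Closed loop on a hypothetical plant y(k+1) = 0.5*y(k) + u(k)."""
    y, y_prev, u_prev, du_prev, phi = 0.0, 0.0, 0.0, 0.0, 1.0
    ys = []
    for _ in range(n_steps):
        phi, u = mflac_step(phi, u_prev, du_prev, y - y_prev, y, y_ref)
        y_next = 0.5 * y + u               # the controller never sees this model
        du_prev, u_prev = u - u_prev, u
        y_prev, y = y, y_next
        ys.append(y)
    return ys
```

The controller uses only measured input and output increments, which is the sense in which the design is model-free; the plant equation appears solely inside `simulate` to generate data.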

20.

Neural network based iterative learning predictive control design for mechatronic systems with isolated nonlinearity

Ridong Zhang Anke Xue Jianzhong Wang Shuqing Wang Zhengyun Ren 《Journal of Process Control》2009,19(1):68-74

The paper presents a new nonlinear predictive control design for a class of nonlinear mechatronic drive systems, which improves regulatory capacity for both reference input tracking and load disturbance rejection. The nonlinear system is first transformed into an equivalent linear time-varying system plus a nonlinear part using a neural network; then an iterative learning linear predictive controller is developed with a structure similar to that of a PI optimal regulator and with setpoint feedforward control. Because the overall control law is linear, this design gives a direct and effective multi-step prediction method and avoids complicated nonlinear optimization. The control law is also more accurate than that of the traditional linearized method. In addition, changes in the system state variables are considered in the objective function, giving control performance superior to conventional state space predictive control designs, which only consider the predicted output errors. The proposed method is compared with the conventional state space predictive control method and the classical PI optimal control method. Tracking performance, robustness, and disturbance rejection are examined.