20 similar documents found (search time: 9 ms)
1.
Gabriel Gómez-Pérez, José D. Martín-Guerrero, Emilio Soria-Olivas, Emili Balaguer-Ballester, Alberto Palomares, Nicolás Casariego. Expert systems with applications, 2009, 36(4): 8022-8031
In this work, RL is used to find an optimal policy for a marketing campaign. Data show a complex characterization of state and action spaces. Two approaches are proposed to circumvent this problem. The first approach is based on the self-organizing map (SOM), which is used to aggregate states. The second approach uses a multilayer perceptron (MLP) to carry out a regression of the action-value function. The results indicate that both approaches can improve a targeted marketing campaign. Moreover, the SOM approach allows an intuitive interpretation of the results, and the MLP approach yields robust results with generalization capabilities.
2.
In this paper, we propose fuzzy logic-based cooperative reinforcement learning for sharing knowledge among autonomous robots. The ultimate goal of this paper is to entice bio-insects towards desired goal areas using artificial robots without any human aid. To achieve this goal, we found an interaction mechanism using a specific odor source and performed simulations and experiments [1]. For efficient learning without human aid, we employ cooperative reinforcement learning in a multi-agent domain. Additionally, we design a fuzzy logic-based expertise measurement system to enhance the learning ability. This structure enables the artificial robots to share knowledge while evaluating and measuring the performance of each robot. Through numerous experiments, the performance of the proposed learning algorithms is evaluated.
3.
Adaptive neural network control for strict-feedback nonlinear systems using backstepping design (total citations: 11, self-citations: 0, citations by others: 11)
This paper focuses on adaptive control of strict-feedback nonlinear systems using multilayer neural networks (MNNs). By introducing a modified Lyapunov function, a smooth and singularity-free adaptive controller is first designed for a first-order plant. Then, an extension is made to high-order nonlinear systems using neural network approximation and adaptive backstepping techniques. The developed control scheme guarantees the uniform ultimate boundedness of the closed-loop adaptive systems. In addition, the relationship between the transient performance and the design parameters is explicitly given to guide the tuning of the controller. One important feature of the proposed NN controller is its highly structural property, which makes it particularly suitable for parallel processing in actual implementation. Simulation studies are included to illustrate the effectiveness of the proposed approach.
4.
This paper presents an adaptive neural control design for nonlinear pure-feedback systems with an input time-delay. Novel state variables and the corresponding transform are introduced, such that the state-feedback control of a pure-feedback system can be viewed as the output-feedback control of a canonical system. An adaptive predictor incorporated with a high-order neural network (HONN) observer is proposed to obtain predictions of future system states, which are used in the control design to circumvent the input delay and nonlinearities. The proposed predictor, observer and controller are all implemented online without iterative predictive calculations, and the closed-loop system stability is guaranteed. The conventional backstepping design and analysis for pure-feedback systems are avoided, which renders the developed scheme simpler in its synthesis and application. Practical guidelines on the control implementation and the parameter design are provided. Simulation on a continuous stirred tank reactor (CSTR) and practical experiments on a three-tank liquid level process control system are included to verify the reliability and effectiveness of the scheme.
5.
We present a neural method that computes the inverse kinematics of any kind of robot manipulator, both redundant and non-redundant. Inverse kinematics solutions are obtained through the inversion of a neural network that has been previously trained to approximate the manipulator forward kinematics. The inversion provides difference vectors in the joint space from difference vectors in the workspace. Our differential inverse kinematics (DIV) approach can be viewed as a neural network implementation of the Jacobian transpose method for arm kinematic control that does not require previous knowledge of the arm forward kinematics. Redundancy can be exploited to obtain a special inverse kinematic solution that meets a particular constraint (e.g. joint limit avoidance) by inverting an additional neural network. The usefulness of our DIV approach is further illustrated with sensor-based multilink manipulators that learn collision-free reaching motions in unknown environments. For this task, the neural controller has two modules: a reinforcement-based action generator (AG) and a DIV module that computes goal vectors in the joint space. The actions given by the AG are interpreted with regard to those goal vectors.
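The Jacobian transpose method named in this abstract can be sketched as follows. This is only an illustration under assumptions that do not come from the paper: it uses the analytic Jacobian of a hypothetical planar two-link arm with unit link lengths instead of an inverted neural network, and the function names, the step size `alpha`, and the target point are made up for the example.

```python
import math

def forward_kinematics(q, l1=1.0, l2=1.0):
    """End-effector position (x, y) of a planar two-link arm (assumed model)."""
    q1, q2 = q
    x = l1 * math.cos(q1) + l2 * math.cos(q1 + q2)
    y = l1 * math.sin(q1) + l2 * math.sin(q1 + q2)
    return x, y

def jacobian(q, l1=1.0, l2=1.0):
    """Analytic 2x2 Jacobian d(x, y)/d(q1, q2) of the arm above."""
    q1, q2 = q
    s1, c1 = math.sin(q1), math.cos(q1)
    s12, c12 = math.sin(q1 + q2), math.cos(q1 + q2)
    return [[-l1 * s1 - l2 * s12, -l2 * s12],
            [l1 * c1 + l2 * c12, l2 * c12]]

def jacobian_transpose_ik(target, q0, alpha=0.1, steps=2000, tol=1e-9):
    """Iterate q <- q + alpha * J(q)^T * (target - f(q))."""
    q = list(q0)
    for _ in range(steps):
        x, y = forward_kinematics(q)
        ex, ey = target[0] - x, target[1] - y  # workspace difference vector
        if math.hypot(ex, ey) < tol:
            break
        J = jacobian(q)
        # Joint-space difference vector: J^T times the workspace error.
        q[0] += alpha * (J[0][0] * ex + J[1][0] * ey)
        q[1] += alpha * (J[0][1] * ex + J[1][1] * ey)
    return q

target = (1.0, 1.0)  # hypothetical reachable goal
q_solution = jacobian_transpose_ik(target, q0=(0.3, 0.3))
reached = forward_kinematics(q_solution)
reach_error = math.hypot(target[0] - reached[0], target[1] - reached[1])
```

The update maps a workspace difference vector to a joint-space difference vector without inverting the Jacobian, which is the property the abstract attributes to the network inversion.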
6.
The learning of complex control behaviour of autonomous mobile robots is one of the current research topics. In this article, an intelligent control architecture is presented which integrates learning methods and available domain knowledge. This control architecture is based on Reinforcement Learning and allows continuous input and output parameters, hierarchical learning, multiple goals, self-organized topology of the used networks and online learning. As a testbed, this architecture is applied to the six-legged walking machine LAURON to learn leg control and leg coordination.
7.
In this paper, a new formulation for the optimal tracking control problem (OTCP) of continuous-time nonlinear systems is presented. This formulation extends the integral reinforcement learning (IRL) technique, a method for solving optimal regulation problems, to learn the solution to the OTCP. Unlike existing solutions to the OTCP, the proposed method does not need to have or to identify knowledge of the system drift dynamics, and it also takes into account the input constraints a priori. An augmented system composed of the error system dynamics and the command generator dynamics is used to introduce a new nonquadratic discounted performance function for the OTCP. This encodes the input constraints into the optimization problem. A tracking Hamilton–Jacobi–Bellman (HJB) equation associated with this nonquadratic performance function is derived which gives the optimal control solution. An online IRL algorithm is presented to learn the solution to the tracking HJB equation without knowing the system drift dynamics. Convergence to a near-optimal control solution and stability of the whole system are shown under a persistence of excitation condition. Simulation examples are provided to show the effectiveness of the proposed method.
8.
9.
Decentralized control of a class of large-scale nonlinear systems using neural networks (total citations: 1, self-citations: 0, citations by others: 1)
This paper designs a decentralized neural network (NN) controller for a class of nonlinear large-scale systems, in which strong interconnections are involved. NNs are used to handle unknown functions. The proposed scheme is proved to guarantee the boundedness of the closed-loop subsystems using only local feedback signals.
10.
A combination of multiple neural networks (NNs) is selected and used to model nonlinear multi-input multi-output (MIMO) processes with time delays. An optimisation procedure for a nonlinear model-predictive control (MPC) algorithm based on this model is then developed. The proposed scheme has been applied and evaluated for two example problems, including the MPC of a multi-component distillation column.
11.
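To address adaptive routing toward a mobile sink node in wireless sensor networks, a reinforcement-learning-based adaptive routing method is proposed that optimizes the use of node energy and of computing, storage, and communication resources during routing, and also optimizes quality-of-service metrics such as data transmission delay and delivery ratio. A composite reward function is designed to jointly optimize multiple metrics, including energy, delay, and delivery ratio. The routing protocol is designed in detail in terms of packet structure, route initialization, and path selection, and sink-node announcements together with a periodic flooding mechanism are used to speed up convergence, thereby supporting fast movement of the sink node. Theoretical analysis shows that the reinforcement-learning-based routing method converges quickly, has low protocol overhead, and has small storage and computation requirements, making it suitable for sensor nodes with limited energy and resources. Performance evaluation and comparative analysis on a simulation platform verify the feasibility and superiority of the proposed adaptive routing algorithm.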
12.
Adaptive neural network control for a class of uncertain nonlinear systems in pure-feedback form (total citations: 1, self-citations: 0, citations by others: 1)
Dan Wang, Jie Huang. Automatica, 2002, 38(8): 1365-1372
A procedure is developed for the design of an adaptive neural network controller for a class of SISO uncertain nonlinear systems in pure-feedback form. The design procedure is a combination of adaptive backstepping and neural network based design techniques. It is shown that, under appropriate assumptions, the solution of the closed-loop system is uniformly ultimately bounded.
13.
The use of genetic algorithms to design neural networks for real-time control of flows in sewerage networks is discussed. In many control applications, standard supervised learning techniques (such as back-propagation) cannot be used through lack of training data. Reinforcement learning techniques, such as genetic algorithms, are a computationally-expensive but viable alternative if a simulator is available for the system in question. The paper briefly describes why genetic algorithms and neural networks were selected, then reports the results of a feasibility study. This demonstrates that the approach does indeed have merits. The implications of high computational cost are discussed, in terms of scaling up to significantly complex problems.
14.
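Iterative learning control for nonlinear systems with unknown dead-zone input (total citations: 1, self-citations: 0, citations by others: 1)
For a class of nonlinear systems with dead-zone input, a neural network iterative learning algorithm is proposed to achieve trajectory tracking control over a finite operation interval. The learning controller is designed using a Lyapunov-like method, which avoids the requirement in conventional iterative learning control that the nonlinearities of the controlled system satisfy a global Lipschitz continuity condition. To handle the input dead zone, a neural network is used to approximate this strong nonlinearity; in addition, the bound on the neural network approximation error is estimated and a compensation term is included in the controller to eliminate its effect, thereby improving the tracking performance of the system.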
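For readers unfamiliar with the term, a dead-zone input nonlinearity is often modeled in the control literature as a piecewise-linear map that outputs zero inside a band around the origin. The sketch below shows that common model only; the break-points `b_left`, `b_right` and slopes `m_left`, `m_right` are hypothetical values, and the abstract does not state which dead-zone model the paper actually uses.

```python
def dead_zone(v, b_left=0.5, b_right=0.5, m_left=1.0, m_right=1.0):
    """Piecewise-linear dead zone: zero output for -b_left < v < b_right.

    All parameter values are illustrative assumptions.
    """
    if v >= b_right:
        return m_right * (v - b_right)
    if v <= -b_left:
        return m_left * (v + b_left)
    return 0.0

# Small commands are swallowed; larger ones pass through, shifted.
small_command = dead_zone(0.2)
large_positive = dead_zone(1.5)
large_negative = dead_zone(-1.5)
```

Because the controller does not know the break-points or slopes, the actuator output differs from the commanded input, which is the effect the abstract's neural network approximation is meant to handle.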
15.
16.
While driving a vehicle safely at its handling limit is essential for autonomous vehicles at Level 5 autonomy, it is a very challenging task for current conventional methods. Therefore, this study proposes a novel controller of trajectory planning and motion control for autonomous driving through manifold corners at the handling limit to improve the speed and shorten the lap time of the vehicle. The proposed controller innovatively combines the advantages of a conventional model-based control algorithm, a model-free reinforcement learning algorithm, and prior expert knowledge, to improve the training efficiency for autonomous driving in extreme conditions. The reward shaping of this algorithm refers to the procedure and experience of race training of professional drivers in real time. After training on track maps that exhibit different levels of difficulty, the proposed controller implemented a superior strategy compared to the original reference trajectory, and can generalize to other tougher maps based on the basic driving knowledge learned from the simpler map, which verifies its superiority and extensibility. We believe this technology can be further applied to daily life to expand the application scenarios and maneuvering envelopes of autonomous vehicles.
17.
18.
A neural network model predictive controller (total citations: 2, self-citations: 0, citations by others: 2)
A neural network controller is applied to the optimal model predictive control of constrained nonlinear systems. The control law is represented by a neural network function approximator, which is trained to minimize a control-relevant cost function. The proposed procedure can be applied to construct controllers with arbitrary structures, such as optimal reduced-order controllers and decentralized controllers.
19.
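Nonlinear sliding mode fault-tolerant control based on a CMAC neural network with balanced learning (total citations: 2, self-citations: 1, citations by others: 1)
Using an improved credit-assignment CMAC (cerebellar model articulation controller) neural network as the means of online fault diagnosis, and introducing variable-structure sliding mode control into the design of the fault-tolerant controller, an active fault-tolerant control method for dynamic nonlinear systems is proposed. In the conventional CMAC learning algorithm, the error is distributed equally among all activated memory cells, regardless of the credibility of the data (weights) stored in each cell. In the improved CMAC, the number of previous learning updates of each activated cell is used as its credibility, and the error correction is proportional to the (-p)-th power of that number, which improves the online learning speed and accuracy of the neural network. On this basis, a sliding mode control algorithm is used to reconstruct the fault-tolerant control law online, integrating online fault diagnosis with fault-tolerant control for dynamic nonlinear systems. The stability of the system is analyzed, and simulation results demonstrate the effectiveness of the improved fault learning algorithm and of the fault-tolerant control.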
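The credit-assignment idea in this abstract, where each activated cell receives a share of the error correction proportional to the (-p)-th power of its previous learning count, can be sketched as below. Several details are assumptions made for the example and are not given in the abstract: the count is offset by one so that a never-updated cell has a finite share, the shares are normalized to sum to one, and the names `cmac_credit_update`, `lr`, and the toy values are hypothetical.

```python
def cmac_credit_update(weights, counts, active, error, lr=0.5, p=1.0):
    """Distribute an output error over the active cells by credibility.

    A cell that has been updated often (large count) is treated as more
    credible and receives a smaller share of the correction.
    """
    weights = list(weights)
    counts = list(counts)
    # Assumed form: share_j proportional to (count_j + 1) ** (-p).
    shares = [(counts[j] + 1) ** (-p) for j in active]
    total = sum(shares)
    for j, share in zip(active, shares):
        weights[j] += lr * error * share / total
        counts[j] += 1  # record one more learning update for this cell
    return weights, counts

# Toy example: cells 0, 1, 2 are active; cell 3 is not touched.
new_weights, new_counts = cmac_credit_update(
    weights=[0.0, 0.0, 0.0, 0.0],
    counts=[0, 1, 3, 0],
    active=[0, 1, 2],
    error=1.0,
    lr=1.0,
    p=1.0,
)
```

Setting `p=0` gives every active cell the same share, which corresponds to the equal error distribution that the abstract attributes to the conventional CMAC.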
20.
Stanley C. Ahalt, Prakoon Chen, Cheng-Taou Chou, Tzyy-Ping Jung. The Journal of supercomputing, 1992, 5(4): 307-330
We describe an implementation of a vector quantization codebook design algorithm based on the frequency-sensitive competitive learning artificial neural network. The implementation, designed for use on high-performance computers, employs both multitasking and vectorization techniques. A C version of the algorithm tested on a CRAY Y-MP8/864 is discussed. We show how the implementation can be used to perform vector quantization, and demonstrate its use in compressing digital video image data. Two images are used, with various size codebooks, to test the performance of the implementation. The results show that the supercomputer techniques employed have significantly decreased the total execution time without affecting vector quantization performance. This work was supported by a Cray University Research Award and by NASA Lewis research grant number NAG3-1164.
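Frequency-sensitive competitive learning is usually described as competitive learning in which each codeword's distortion is scaled by the number of times it has already won, so that rarely used codewords become more likely to win. The sketch below illustrates that rule in plain Python. It is not the multitasked, vectorized C implementation the abstract describes, and the function name, learning rate, initial codebook, and toy samples are all assumptions.

```python
def fscl_train(samples, codebook, lr=0.1):
    """One pass of frequency-sensitive competitive learning (sketch).

    The winner minimizes win_count * squared_distance; only the winner
    is moved toward the sample, and its win count is incremented.
    """
    codebook = [list(w) for w in codebook]  # work on a copy
    counts = [1] * len(codebook)            # start at 1 so the product is nonzero
    for x in samples:
        def scaled_distortion(i):
            dist = sum((a - b) ** 2 for a, b in zip(x, codebook[i]))
            return counts[i] * dist
        best = min(range(len(codebook)), key=scaled_distortion)
        codebook[best] = [w + lr * (a - w) for a, w in zip(x, codebook[best])]
        counts[best] += 1
    return codebook, counts

# Toy data: two clusters, with both initial codewords placed near the first one.
toy_samples = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]] * 10
trained_codebook, win_counts = fscl_train(toy_samples, [[0.0, 0.1], [0.2, 0.0]])
```

The count scaling is what keeps a codeword from being left unused when the initial codebook is poorly placed, as in the toy data above.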