
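Inverted Pendulum Control Using a Neural-Network-Based Reinforcement Learning Algorithm
Cite this article: ZHANG Tao, WU Han-sheng. Inverted Pendulum Control Using a Neural-Network-Based Reinforcement Learning Algorithm[J]. Computer Simulation, 2006, 23(4): 298-300, 325.
Authors: ZHANG Tao  WU Han-sheng
Affiliation: Department of Automation, University of Science and Technology of China, Hefei, Anhui 230027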
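Abstract: Balancing a continuous inverted pendulum system with reinforcement learning has long been an open problem. This paper combines Q-learning with a BP (backpropagation) network and the sigmoid activation function and, exploiting the generalization ability of neural networks, designs a new learning control strategy. Through an iterative learning process, the strategy not only handles the continuous state space of the inverted pendulum as input but also produces a continuous action space as output. The method is applied to the balance control of a continuous inverted pendulum system, and Matlab simulation experiments based on a model of the actual controlled system demonstrate its feasibility. The method further increases the practical value of reinforcement learning theory in real control systems.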

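Keywords: Reinforcement learning  Neural network  Activation function  Generalization performance  Continuous action space
Article ID: 1006-9348(2006)04-0298-03
Received: 2005-02-03
Revised: 2005-02-03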

Balance of an Inverted Pendulum Using Neural Network and Q-Learning
ZHANG Tao, WU Han-sheng. Balance of an Inverted Pendulum Using Neural Network and Q-Learning[J]. Computer Simulation, 2006, 23(4): 298-300, 325.
Authors: ZHANG Tao  WU Han-sheng
Affiliation: Department of Automation, USTC, Hefei, Anhui 230027, China
Abstract: Balancing a continuous inverted pendulum with reinforcement learning has long been an open problem. This paper presents a new method that combines Q-learning with a BP network and the sigmoid activation function, using the neural network's generalization ability to handle not only a continuous state space as input but also a continuous action space as output. Matlab simulation with a real pendulum system model shows that the method is applicable. The method improves the applicability of reinforcement learning in real control systems.
Keywords: Reinforcement learning  Neural network  Activation function  Generalization performance  Continuous action space
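
The abstract names the ingredients (Q-learning, a BP network, sigmoid activations) but not the network layout or the action-selection rule. The Python sketch below is only one plausible arrangement under stated assumptions, not the authors' design: a one-hidden-layer network approximates Q(state, action) with the real-valued action fed in as an extra input, and the greedy action is picked from a list of candidate forces. The names `QNetwork`, `td_target`, and `greedy_action`, the four-component state (cart position, cart velocity, pole angle, pole angular velocity), the normalized force range, and all hyperparameters are hypothetical.

```python
import math
import random


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


class QNetwork:
    """One-hidden-layer BP network approximating Q(state, action).

    Assumption: the action is a real-valued network input; the paper's
    actual architecture is not given in the abstract.
    """

    def __init__(self, n_inputs: int, n_hidden: int, seed: int = 0) -> None:
        rng = random.Random(seed)
        # Each hidden unit: n_inputs weights plus a trailing bias weight.
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
                         for _ in range(n_hidden)]
        # Linear output unit: n_hidden weights plus a trailing bias weight.
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def _forward(self, state, action):
        x = list(state) + [action, 1.0]  # inputs followed by the bias input
        h = [sigmoid(sum(w * v for w, v in zip(ws, x))) for ws in self.w_hidden]
        q = sum(w * v for w, v in zip(self.w_out, h + [1.0]))
        return x, h, q

    def predict(self, state, action) -> float:
        return self._forward(state, action)[2]

    def train(self, state, action, target: float, lr: float = 0.05) -> float:
        """One backpropagation step on 0.5 * (target - Q)^2; returns the error."""
        x, h, q = self._forward(state, action)
        err = target - q
        # Update hidden weights first so they use the pre-update output weights.
        for j, ws in enumerate(self.w_hidden):
            delta = err * self.w_out[j] * h[j] * (1.0 - h[j])
            for i in range(len(ws)):
                ws[i] += lr * delta * x[i]
        for j, v in enumerate(h + [1.0]):
            self.w_out[j] += lr * err * v
        return err


def td_target(net, reward, next_state, done, candidates, gamma=0.95):
    """One-step Q-learning target: r + gamma * max_a' Q(s', a')."""
    if done:
        return reward
    return reward + gamma * max(net.predict(next_state, a) for a in candidates)


def greedy_action(net, state, candidates):
    """Candidate action with the highest predicted Q value."""
    return max(candidates, key=lambda a: net.predict(state, a))


# Hypothetical usage on a single made-up transition.
net = QNetwork(n_inputs=5, n_hidden=6)          # 4 state values + 1 action
forces = [-1.0, -0.5, 0.0, 0.5, 1.0]            # assumed normalized forces
s, s_next = (0.0, 0.1, 0.05, -0.2), (0.002, 0.12, 0.046, -0.25)
a = greedy_action(net, s, forces)
net.train(s, a, td_target(net, 0.0, s_next, False, forces))
```

Feeding the action in as an input lets the network generalize across nearby forces, so the candidate list can be refined without retraining from scratch. A finite candidate list is still a discretization, so this is a simplification of the continuous-action output the abstract describes.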
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.