
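A Survey of Reinforcement Learning Research
Cite this article:CHEN Xue-song, YANG Yi-min. A survey of reinforcement learning research[J]. Application Research of Computers, 2010, 27(8): 2834-2838.
Authors:CHEN Xue-song  YANG Yi-min
Affiliations:1. Faculty of Automation, Guangdong University of Technology, Guangzhou 510006; Faculty of Applied Mathematics, Guangdong University of Technology, Guangzhou 510006
2. Faculty of Automation, Guangdong University of Technology, Guangzhou 510006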
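Abstract:How an agent learns to act in an unknown environment is a problem that is both challenging and interesting. Reinforcement learning improves its policy through trial-and-error interaction with the environment, and its self-learning and online-learning characteristics make it an important branch of machine learning research. This paper presents recent results on reinforcement learning in three areas: theory, algorithms, and applications. It first introduces the environment model and the basic elements of reinforcement learning. It next covers theoretical issues concerning the convergence and generalization of reinforcement learning algorithms. Then, drawing on research from recent years, it surveys reinforcement learning algorithms based on the discounted-reward criterion and on the average-reward criterion. Finally, it lists successful applications of reinforcement learning in areas such as nonlinear control, robot control, artificial intelligence problem solving, and multi-agent systems, along with directions for future development.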

Keywords:reinforcement learning  multi-agent  Markov decision processes

Reinforcement learning: survey of recent work
CHEN Xue-song, YANG Yi-min. Reinforcement learning: survey of recent work[J]. Application Research of Computers, 2010, 27(8): 2834-2838.
Authors:CHEN Xue-song  YANG Yi-min (a)
Affiliation:(a. Faculty of Automation, b. Faculty of Applied Mathematics, Guangdong University of Technology, Guangzhou 510006, China)
Abstract:How an agent learns to act in an unknown world is a problem that is both challenging and interesting. Reinforcement learning has been successful at finding optimal control policies through trial-and-error interaction with a dynamic environment. Its self-improving and online-learning properties make it one of the most important machine learning methods. This paper provides a comprehensive review of the theory, algorithms, and applications of reinforcement learning. It first surveys the foundations of reinforcement learning and its model of the environment. It then discusses the convergence and generalization of the algorithms, and examines in depth two representative classes of algorithms: those based on discounted reward and those based on average reward. Finally, it presents some applications of reinforcement learning and points out remaining challenges and open problems.
Keywords:reinforcement learning  multi-agent systems  Markov decision processes
This article is indexed in Wanfang Data and other databases.