
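A Survey of Reinforcement Learning Research
Cite this article: GAO Yang, CHEN Shi-Fu, LU Xin. A Survey of Reinforcement Learning Research[J]. Acta Automatica Sinica, 2004, 30(1): 86-100.
Authors: GAO Yang  CHEN Shi-Fu  LU Xin
Affiliation: 1. State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing
Funding: Supported by the National Natural Science Foundation of China (60103012, 69905001) and the National "973" Key Research and Development Program (2002CB312002)
Abstract: Reinforcement learning improves a policy through trial-and-error interaction with the environment, and its self-learning and online-learning characteristics have made it an important branch of machine learning research. This paper first introduces the principles and structure of reinforcement learning. It then constructs a two-dimensional classification chart and discusses two classes of algorithms, optimal-search and experience-reinforcement, in both Markov and non-Markov environments. Next, drawing on research from recent years, it surveys the core problems of reinforcement learning techniques, including partial observability, function approximation, multi-agent reinforcement learning, and bias techniques. Finally, it briefly introduces applications of reinforcement learning and directions for future development.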

Keywords: reinforcement learning  partial observability  function approximation  multi-agent reinforcement learning
Received: 2002-11-4

Research on Reinforcement Learning Technology: A Review
GAO Yang, CHEN Shi-Fu, LU Xin. Research on Reinforcement Learning Technology: A Review[J]. Acta Automatica Sinica, 2004, 30(1): 86-100.
Authors: GAO Yang  CHEN Shi-Fu  LU Xin
Affiliation: 1. State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing
Abstract: Reinforcement learning obtains an optimal policy through trial-and-error interaction with a dynamic environment. Its properties of self-improvement and online learning make it one of the most important machine learning methods. In this paper, we first survey the foundations, structure, and algorithms of reinforcement learning. We then discuss exploration-oriented and exploitation-oriented algorithms in Markov and non-Markov environments. Next, we discuss in depth several key concepts of reinforcement learning, including partially observable environments, function approximation, multi-agent reinforcement learning, and rule extraction from reinforcement learning. Finally, we briefly introduce some applications of reinforcement learning and point out some directions for future research.
Keywords: reinforcement learning  partial observability  function approximation  multi-agent reinforcement learning
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.