
An algorithm of reinforcement learning for finite-horizon Markov decision processes
Cite this article: LI Chun-gui, LIU Yong-xin. An algorithm of reinforcement learning for finite-horizon Markov decision processes[J]. Journal of Guangxi University of Technology, 2003, 14(1): 1-4.
Authors: LI Chun-gui (李春贵), LIU Yong-xin (刘永信)
Affiliations: 1. Department of Computer Science, Guangxi University of Technology, Liuzhou, Guangxi 545006
2. Department of Automation, Inner Mongolia University, Hohhot, Inner Mongolia 010021
Article ID: 1004-6410(2003)01-0001-04
Revised manuscript received: 13 November 2002

Abstract: This paper studies reinforcement learning algorithms for finite-horizon, non-stationary Markov decision processes. By introducing an artificial absorbing state, the finite-horizon problem is transformed into an infinite-horizon one, so that standard reinforcement learning methods can be used to solve it. Building on the algorithmic idea proposed in reference [3], a new reinforcement learning algorithm for finite-horizon, non-stationary Markov decision processes is presented, and it is tested experimentally on an inventory control problem for which no complete model is available.

Keywords: reinforcement learning; finite horizon; Markov decision process; incomplete model; inventory control; machine learning; non-stationary
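
The absorbing-state construction described in the abstract can be illustrated with a small sketch. It is not the authors' algorithm, which this record does not reproduce; it is ordinary tabular Q-learning applied to a toy inventory problem. The state is the pair (period, stock), every trajectory jumps to an artificial absorbing state `ABSORBING` of value zero after the last period, and the learner only sees sampled demand, never the demand distribution. All names, prices, costs, and demand distributions below are hypothetical.

```python
import random

HORIZON = 3        # number of decision periods (assumed)
MAX_STOCK = 2      # storage capacity (assumed)
ABSORBING = "END"  # artificial absorbing state entered after the last period


def actions(stock):
    """Feasible order quantities: never exceed the storage capacity."""
    return range(MAX_STOCK - stock + 1)


def step(t, stock, order, rng):
    """Simulate one period. The learner only observes the sampled outcome."""
    # Non-stationary demand: the distribution depends on the period t.
    demand = rng.choice([0, 1, 2]) if t % 2 == 0 else rng.choice([0, 1])
    available = stock + order
    sold = min(available, demand)
    leftover = available - sold
    # Hypothetical economics: price 4, order cost 1, holding cost 0.5 per unit.
    reward = 4.0 * sold - 1.0 * order - 0.5 * leftover
    next_state = ABSORBING if t + 1 == HORIZON else (t + 1, leftover)
    return next_state, reward


def q_learning(episodes=20000, alpha=0.1, epsilon=0.2, seed=0):
    """Undiscounted tabular Q-learning on the time-augmented state space."""
    rng = random.Random(seed)
    q = {((t, s), a): 0.0
         for t in range(HORIZON)
         for s in range(MAX_STOCK + 1)
         for a in actions(s)}
    for _ in range(episodes):
        state = (0, 0)  # each episode starts in period 0 with empty stock
        while state != ABSORBING:
            t, stock = state
            acts = list(actions(stock))
            if rng.random() < epsilon:
                a = rng.choice(acts)                         # explore
            else:
                a = max(acts, key=lambda x: q[(state, x)])   # exploit
            nxt, r = step(t, stock, a, rng)
            # The absorbing state is worth 0, so no discount factor is needed.
            if nxt == ABSORBING:
                future = 0.0
            else:
                future = max(q[(nxt, b)] for b in actions(nxt[1]))
            q[(state, a)] += alpha * (r + future - q[(state, a)])
            state = nxt
    return q


def greedy_policy(q):
    """Pick the highest-valued order quantity for every (period, stock)."""
    policy = {}
    for t in range(HORIZON):
        for s in range(MAX_STOCK + 1):
            policy[(t, s)] = max(actions(s), key=lambda a: q[((t, s), a)])
    return policy


if __name__ == "__main__":
    learned = greedy_policy(q_learning())
    for state in sorted(learned):
        print(state, "-> order", learned[state])
```

Because the period index is part of the state, the learned policy may order differently at the same stock level in different periods, which is what a non-stationary finite-horizon problem requires.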
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.