
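Associated Model of Bi-Markov Decision Processes
Cite this article: WANG Zhen-zhen, XING Han-cheng. Associated Model of Bi-Markov Decision Processes[J]. Computer Science, 2009, 36(9): 161-166
Authors: WANG Zhen-zhen, XING Han-cheng
Affiliation: Department of Computer Science and Technology, Nanjing University, Nanjing 210093; School of Computer Science and Engineering, Southeast University, Nanjing 210096
Funding: National Natural Science Foundation of China; Natural Science Foundation of Jiangsu Province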
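Abstract: When dealing with problems, humans usually work at two levels: they first grasp the problem as a whole, i.e., propose a rough plan, and then carry it out in detail. In other words, humans are an excellent example of a multi-resolution intelligent system: they can generalize bottom-up across multiple levels (the granularity of the viewpoint becomes "coarse", which is analogous to abstraction), and they can also instantiate top-down (the viewpoint becomes "fine", which is analogous to concretization). On this basis, the paper constructs a semi-Markov decision process composed of Markov decision processes that each run on one of two layers (the ideal space, i.e., generalization, and the actual space, i.e., instantiation), and calls it the associated model of bi-Markov decision processes. It then discusses algorithms for the optimal policy of the associated model. Finally, it gives an example showing that the associated model economizes on "thought" and is a good compromise between computational effectiveness and feasibility.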

Keywords: Markov decision processes; Reinforcement learning; Optimal policy
Received: 2008-10-27
Revised: 2009-01-09

Associated Model of Bi-Markov Decision Processes
WANG Zhen-zhen,XING Han-cheng. Associated Model of Bi-Markov Decision Processes[J]. Computer Science, 2009, 36(9): 161-166
Authors: WANG Zhen-zhen, XING Han-cheng
Affiliation: Department of Computer Science & Technology, Nanjing University, Nanjing 210093, China; School of Computer Science & Engineering, Southeast University, Nanjing 210096, China
Abstract: Human thought is often divided into two levels when dealing with problems. First, people treat a problem from a holistic perspective, i.e., they form a general plan; then they deal with the details. Humans are themselves a good example of a multi-resolution system. They can not only generalize bottom-up across multiple levels (the granularity of the viewpoint on the problem becomes "coarse", analogous to abstraction), but also instantiate top-down (the granularity of the viewpoint becomes "fine", analogous to specificatio...
Keywords: Markov decision processes; Reinforcement learning; Optimal policy
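
The abstract refers to optimal-policy algorithms for Markov decision processes without giving details. As a generic illustration only, the sketch below runs standard value iteration on a tiny single-layer MDP. It is not the paper's two-layer algorithm: the toy states, actions, rewards, the discount factor `gamma`, and the names `TOY_MDP` and `value_iteration` are all assumptions introduced here.

```python
from typing import Dict, List, Tuple

# transitions[state][action] = list of (probability, next_state, reward)
Transitions = Dict[str, Dict[str, List[Tuple[float, str, float]]]]

# Hypothetical toy MDP (not taken from the paper).
TOY_MDP: Transitions = {
    "start": {
        "direct": [(1.0, "goal", 1.0)],  # small immediate reward
        "detour": [(1.0, "mid", 0.0)],   # no reward now, larger reward later
    },
    "mid": {"go": [(1.0, "goal", 5.0)]},
    "goal": {},                          # terminal state: no actions
}


def q_value(outcomes, values, gamma):
    """Expected return of one action given the current value estimates."""
    return sum(p * (r + gamma * values[s2]) for p, s2, r in outcomes)


def value_iteration(transitions: Transitions, gamma: float = 0.9,
                    tol: float = 1e-8):
    """Standard value iteration; returns (state values, greedy policy)."""
    values = {s: 0.0 for s in transitions}
    while True:
        delta = 0.0
        for s, actions in transitions.items():
            if not actions:              # terminal states keep value 0
                continue
            best = max(q_value(o, values, gamma) for o in actions.values())
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:                  # stop once no value changes
            break
    # Greedy policy with respect to the converged values.
    policy = {
        s: max(actions, key=lambda a: q_value(actions[a], values, gamma))
        for s, actions in transitions.items() if actions
    }
    return values, policy


values, policy = value_iteration(TOY_MDP)
print(values, policy)
```

In this toy case the greedy policy prefers the detour, because the discounted later reward (0.9 × 5) exceeds the immediate reward of 1. A two-layer scheme like the one the abstract describes would presumably solve a coarse MDP first and then refine within it, but that construction is not specified in this record.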
This article is indexed in CNKI, Wanfang Data, and other databases.