Research on first-order belief point-based value iteration for FO-POMDP

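Cite as:	CHEN Lina, HUANG Hongbin, DENG Su. Research on first-order belief point-based value iteration for FO-POMDP[J]. Computer Engineering and Applications, 2012, 48(15): 7-11.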

Authors:	CHEN Lina, HUANG Hongbin, DENG Su

Affiliation:	Key Lab of Information System Engineering, National University of Defense Technology, Changsha 410073

Funding:	National Natural Science Foundation of China (No. 71071160)

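Abstract:	This paper studies approximate solution methods for first-order partially observable Markov decision processes (FO-POMDPs). It defines the concepts of first-order belief, first-order belief granularity, and fluent criticality; proposes a granularity resolution method based on fluent criticality that unifies the granularity of first-order beliefs; and proposes a distance measure for first-order belief granularity. On this basis it presents the FO-PBVI method, which lifts PBVI to the abstract level. The method is validated and analyzed on the Tiger and Tag experiments, which show that FO-PBVI adapts well to changes in problem scale and can solve relatively large planning problems.
Keywords:	first-order partially observable Markov decision process (POMDP); first-order belief state; granularity resolution; value iteration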
Research on first-order belief point-based value iteration for FO-POMDP

CHEN Lina, HUANG Hongbin, DENG Su. Research on first-order belief point-based value iteration for FO-POMDP[J]. Computer Engineering and Applications, 2012, 48(15): 7-11.

Authors:	CHEN Lina, HUANG Hongbin, DENG Su

Affiliation:	Key Lab of Information System Engineering, National University of Defense Technology, Changsha 410073, China

Abstract:	Approximate solution of first-order partially observable Markov decision processes (FO-POMDPs) is an important problem, and this paper studies approximate algorithms for it. The concepts of the first-order belief state, the granularity of a belief state, and the degree of a fluent are introduced. A granularity resolution method is presented that converts belief states of different granularity to a common one, together with a distance measure between first-order belief states. PBVI is then lifted to the logical level, yielding FO-PBVI. Experiments show that FO-PBVI is efficient on large-scale problems.

Keywords:	first-order partially observable Markov decision process (FO-POMDP); first-order (FO) belief state; granularity resolution; value iteration
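
For orientation, the following is a minimal sketch of the standard (propositional) point-based value iteration backup that the abstract says FO-PBVI lifts to the first-order level. It is not the paper's FO-PBVI algorithm: there is no first-order belief, granularity resolution, or belief distance here. The Tiger model parameters (listening accuracy 0.85, rewards -1 / -100 / +10, discount 0.95), the belief grid BELIEFS, and all function names are assumptions chosen for illustration, not values taken from the paper.

```python
# States: 0 = tiger behind left door, 1 = tiger behind right door.
# Actions: 0 = listen, 1 = open left, 2 = open right.
# Observations: 0 = hear left, 1 = hear right.
S, A, Z = range(2), range(3), range(2)
GAMMA = 0.95  # assumed discount factor

def T(s, a, s2):
    """Transition probability: listening keeps the state, opening resets it."""
    if a == 0:
        return 1.0 if s == s2 else 0.0
    return 0.5

def O(a, s2, z):
    """Observation probability: listening is 85% accurate, otherwise uninformative."""
    if a == 0:
        return 0.85 if z == s2 else 0.15
    return 0.5

def R(s, a):
    """Reward: -1 to listen, -100 for opening the tiger's door, +10 otherwise."""
    if a == 0:
        return -1.0
    return -100.0 if (a - 1) == s else 10.0

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def value(b, alphas):
    """Value of belief b under a set of alpha vectors."""
    return max(dot(b, alpha) for alpha in alphas)

def backup(b, alphas):
    """Point-based backup at belief b: returns (new alpha vector, greedy action)."""
    best_vec, best_act = None, None
    for a in A:
        g_a = [R(s, a) for s in S]
        for z in Z:
            # Project every alpha vector through (a, z), keep the best one for b.
            projected = [
                [sum(T(s, a, s2) * O(a, s2, z) * alpha[s2] for s2 in S) for s in S]
                for alpha in alphas
            ]
            g = max(projected, key=lambda v: dot(b, v))
            g_a = [x + GAMMA * y for x, y in zip(g_a, g)]
        if best_vec is None or dot(b, g_a) > dot(b, best_vec):
            best_vec, best_act = g_a, a
    return best_vec, best_act

def pbvi(beliefs, iterations):
    """Run point-based value iteration over a fixed belief set."""
    # Start from a lower bound: the worst reward received forever.
    floor = min(R(s, a) for s in S for a in A) / (1.0 - GAMMA)
    alphas = [[floor for _ in S]]
    for _ in range(iterations):
        alphas = [backup(b, alphas)[0] for b in beliefs]
    return alphas

# Assumed belief set: an evenly spaced grid over P(tiger is left).
BELIEFS = [[i / 10, 1 - i / 10] for i in range(11)]

if __name__ == "__main__":
    alphas = pbvi(BELIEFS, 30)
    _, action = backup([0.5, 0.5], alphas)
    print("greedy action at the uniform belief:", action)
```

The belief set here is fixed in advance for brevity; full PBVI also expands the belief set between rounds of backups, which this sketch omits.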
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.