Learning Partially Observable Non-Deterministic Action Models

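Cite this article:	RAO Dong-Ning, JIANG Zhi-Hua, JIANG Yun-Fei. Learning Partially Observable Non-Deterministic Action Models[J]. Journal of Software, 2014, 25(1): 51-63.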

Authors:	RAO Dong-Ning, JIANG Zhi-Hua, JIANG Yun-Fei

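Affiliation:	School of Computers, Guangdong University of Technology, Guangzhou 510090, Guangdong, China; Department of Computer Science, School of Information Science and Technology, Ji'nan University, Guangzhou 510632, Guangdong, China; Software Research Institute, School of Information Science and Technology, Sun Yat-Sen University, Guangzhou 510275, Guangdong, China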

Funding:	National Natural Science Foundation of China (61100134, 61003179); Natural Science Foundation of Guangdong Province (S2011040001427)

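Abstract:	In recent years, action model learning has attracted great interest from researchers. However, although non-deterministic planning has been studied for more than a decade, research on action model learning still concentrates on classical deterministic action models. This paper proposes an algorithm for learning non-deterministic action models in partially observable environments. The algorithm applies when nothing is known about the transition system and the only input is a set of action-observation sequences; such scenarios are common in the real world. The work addresses a class of problems in which actions are composed of simple logical structures and observations occur at a certain frequency. Learning proceeds in three steps: first, compute the probability that each proposition holds in a state; next, extract effect schemas from the propositions and then extract preconditions; finally, cluster the effect schemas to remove redundancy. Experimental results on benchmark domains show that action model learning techniques can be extended to non-deterministic, partially observable environments.
Keywords:	artificial intelligence; automated planning; action model learning; non-deterministic action; partial observation
Received:	2012/8/13
Revised:	2013/1/25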
Learning Partially Observable Non-Deterministic Action Models

RAO Dong-Ning, JIANG Zhi-Hua and JIANG Yun-Fei. Learning Partially Observable Non-Deterministic Action Models[J]. Journal of Software, 2014, 25(1): 51-63.

Authors:	RAO Dong-Ning, JIANG Zhi-Hua and JIANG Yun-Fei

Affiliation:	School of Computers, Guangdong University of Technology, Guangzhou 510090, China; Department of Computer Science, School of Information Science and Technology, Ji'nan University, Guangzhou 510632, China; Software Research Institute, School of Information Science and Technology, Sun Yat-Sen University, Guangzhou 510275, China

Abstract:	Interest in learning action models has been increasing in recent years. Although non-deterministic planning has been studied for more than a decade, most previous work on action model learning still focuses on classical, deterministic action models. This paper presents an algorithm for identifying non-deterministic actions, including their effects and preconditions, in partially observable domains. It can be applied when nothing is known about the transition system and only action-observation sequences are given. Such scenarios are common in real-world applications. This work focuses on problems in which actions are composed of simple logical structures and observations occur with some frequency. The learning process is divided into three steps. First, compute the probability that each proposition holds in a state. Second, extract effect schemas from the propositions and then extract preconditions. Third, cluster the effect schemas to remove redundancy. Experimental results on benchmark domains show that action model learning remains useful in non-deterministic and partially observable environments.

Keywords:	artificial intelligence; automated planning; learning action models; non-deterministic action; partial observability
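
The abstract names three learning steps but gives no algorithmic detail, so the following is only a toy sketch of the general idea and not the authors' method. It assumes each trace step is a hypothetical triple `(action, before, after)`, where `before` and `after` are partial observations mapping proposition names to truth values. All function names and the example propositions (`holding`, `on_table`) are invented for illustration, and precondition extraction is not shown.

```python
from collections import Counter, defaultdict

def proposition_probabilities(steps):
    """Step 1 (toy): per action, estimate P(prop is true after the action),
    counting only the steps in which that proposition was observed."""
    seen = defaultdict(Counter)   # action -> prop -> times observed
    true = defaultdict(Counter)   # action -> prop -> times observed true
    for action, _before, after in steps:
        for prop, value in after.items():
            seen[action][prop] += 1
            true[action][prop] += int(value)
    return {a: {p: true[a][p] / n for p, n in props.items()}
            for a, props in seen.items()}

def effect_patterns(steps):
    """Step 2 (toy): an effect pattern is the set of (prop, new_value) pairs
    whose value visibly changed across one step."""
    patterns = defaultdict(Counter)
    for action, before, after in steps:
        changed = frozenset((p, v) for p, v in after.items()
                            if p in before and before[p] != v)
        patterns[action][changed] += 1
    return patterns

def merge_patterns(counts):
    """Step 3 (toy): fold a non-empty pattern into a kept strict superset,
    treating it as the same effect seen under partial observation.
    The empty pattern (no visible change) stays a separate outcome."""
    kept = {}
    for pattern, n in sorted(counts.items(), key=lambda kv: -len(kv[0])):
        host = next((k for k in kept if pattern and pattern < k), None)
        if host is None:
            kept[pattern] = kept.get(pattern, 0) + n
        else:
            kept[host] += n
    total = sum(kept.values())
    return {k: v / total for k, v in kept.items()}

# Invented example data: four attempts at a "pick" action.
steps = [
    ("pick", {"holding": False, "on_table": True}, {"holding": True, "on_table": False}),
    ("pick", {"holding": False, "on_table": True}, {"holding": True}),  # on_table unobserved
    ("pick", {"holding": False, "on_table": True}, {"holding": False, "on_table": True}),  # no change
    ("pick", {"holding": False}, {"holding": True, "on_table": False}),
]

probs = proposition_probabilities(steps)
merged = merge_patterns(effect_patterns(steps)["pick"])
print(probs["pick"])
print(merged)
```

On this invented data the merged result has two outcomes for `pick`: the full effect (`holding` becomes true, `on_table` becomes false) with estimated probability 0.75, and "no visible change" with 0.25. Folding subsets into supersets is one simple way to remove redundancy; it would merge genuinely different effects that happen to overlap, which a real clustering step would need to handle.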
This article is indexed in CNKI and other databases.