
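Learning Non-Deterministic Action Models in Partially Observable Environments
Cite this article: RAO Dong-Ning, JIANG Zhi-Hua, JIANG Yun-Fei. Learning non-deterministic action models in partially observable environments [J]. Journal of Software (软件学报), 2014, 25(1): 51-63.
Authors: RAO Dong-Ning, JIANG Zhi-Hua, JIANG Yun-Fei
Affiliations: School of Computers, Guangdong University of Technology, Guangzhou 510090, Guangdong, China; Department of Computer Science, School of Information Science and Technology, Ji'nan University, Guangzhou 510632, Guangdong, China; Software Research Institute, School of Information Science and Technology, Sun Yat-Sen University, Guangzhou 510275, Guangdong, China
Funding: National Natural Science Foundation of China (61100134, 61003179); Natural Science Foundation of Guangdong Province (S2011040001427)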
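Abstract: In recent years, action model learning has attracted great interest from researchers. However, although non-deterministic planning has been studied for more than a decade, research on action model learning still concentrates on classical deterministic action models. This paper proposes an algorithm for learning non-deterministic action models in partially observable environments. The algorithm applies when nothing is assumed to be known about the transition system and the only input is action-observation sequences; such scenarios are common in the real world. The work addresses a class of problems in which actions consist of simple logical structures and observations occur at a certain frequency. Learning proceeds in three steps: first, compute the probability that each proposition holds in a state; next, extract effect schemata from the propositions, and then extract preconditions; finally, cluster the effect schemata to remove redundancy. Experimental results on benchmark domains show that action model learning techniques can be extended to non-deterministic, partially observable environments.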

Keywords: artificial intelligence; automated planning; action model learning; non-deterministic action; partial observation
Received: 2012/8/13
Revised: 2013/1/25

Learning Partially Observable Non-Deterministic Action Models
RAO Dong-Ning, JIANG Zhi-Hua and JIANG Yun-Fei. Learning Partially Observable Non-Deterministic Action Models [J]. Journal of Software, 2014, 25(1): 51-63.
Authors: RAO Dong-Ning, JIANG Zhi-Hua and JIANG Yun-Fei
Institution: School of Computers, Guangdong University of Technology, Guangzhou 510090, China; Department of Computer Science, School of Information Science and Technology, Ji'nan University, Guangzhou 510632, China; Software Research Institute, School of Information Science and Technology, Sun Yat-Sen University, Guangzhou 510275, China
Abstract: Interest in learning action models has been increasing in recent years. Although non-deterministic planning has been studied for more than a decade, most previous work on action model learning still focuses on classical, deterministic action models. This paper presents an algorithm for learning non-deterministic actions, including their effects and preconditions, in partially observable domains. It can be applied when nothing is known about the transition system and only action-observation sequences are given. Such scenarios are common in real-world applications. This work focuses on problems in which actions are composed of simple logical structures and observations occur at a certain frequency. The learning process is divided into three steps. First, compute the probability that each proposition holds in a state. Second, extract effect schemata from the propositions, and then extract preconditions. Third, cluster the effect schemata to remove redundancy. Experimental results on benchmark domains show that action model learning techniques can be extended to non-deterministic, partially observable environments.
Keywords: artificial intelligence; automated planning; learning action models; non-deterministic action; partial observability
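
The abstract names the three learning steps but gives no formulas, so the following Python is only a rough illustration of the first and third steps under loudly stated assumptions, not the authors' algorithm. It assumes a hypothetical input format in which each trace is a list of `(action, observation)` pairs and each observation is a dict from proposition names to truth values, with unobserved propositions simply absent. The function names `effect_frequencies` and `distinct_effect_patterns` are invented for this sketch, and exact-match grouping stands in for the clustering step.

```python
from collections import Counter, defaultdict


def effect_frequencies(traces):
    """Estimate, per action, how often each proposition is observed true
    in the observation that follows the action.

    Only steps where the proposition was actually observed are counted,
    which is one simple way to cope with partial observability.
    """
    true_counts = defaultdict(Counter)
    seen_counts = defaultdict(Counter)
    for trace in traces:
        for action, observation in trace:
            for prop, value in observation.items():
                seen_counts[action][prop] += 1
                if value:
                    true_counts[action][prop] += 1
    return {
        action: {prop: true_counts[action][prop] / n for prop, n in props.items()}
        for action, props in seen_counts.items()
    }


def distinct_effect_patterns(traces):
    """Group identical observed outcomes of each action and report their
    relative frequencies (exact matching, a crude stand-in for clustering).

    Steps with an empty observation carry no information and are skipped.
    """
    pattern_counts = defaultdict(Counter)
    for trace in traces:
        for action, observation in trace:
            if observation:
                pattern_counts[action][frozenset(observation.items())] += 1
    return {
        action: {pattern: c / sum(counts.values()) for pattern, c in counts.items()}
        for action, counts in pattern_counts.items()
    }


if __name__ == "__main__":
    # Toy traces for a made-up "flip" action; not taken from the paper.
    traces = [
        [("flip", {"heads": True}), ("flip", {"heads": False}), ("flip", {})],
        [("flip", {"heads": True})],
    ]
    print(effect_frequencies(traces))
    print(distinct_effect_patterns(traces))
```

Counting only observed propositions keeps the estimate from treating "not observed" as "false", which matters when observations arrive only at some frequency. Precondition extraction (the second step) is omitted because the abstract does not describe how it is done.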
This article is indexed in CNKI and other databases.