期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Maintenance planning using continuous-state partially observable Markov decision processes and non-linear action models

Roland Schöbi 《Structure and Infrastructure Engineering》2016,12(8):977-994

The signs of deterioration in worldwide infrastructure and the associated socio-economic and environmental losses call for sustainable resource management and policy-making. To this end, this work presents an enhanced variant of partially observable Markov decision processes (POMDPs) for the life cycle assessment and maintenance planning of infrastructure. POMDPs comprise a method, commonly employed in the field of robotics, for decision-making on the basis of uncertain observations. In the work presented herein, a continuous-state POMDP formulation is presented which is adapted to the problem of decision-making for optimal management of civil structures. The aforementioned problem may comprise non-linear and non-deterministic action and observation models. The continuous-state POMDP is herein coupled with a normalised unscented transform (NUT) in order to deliver a framework able to tackle non-linearities that likely characterise action models. The capabilities of this enhanced framework and its applicability to the maintenance planning problem are presented via two applications. In a first illustrative example, the use of the NUT is demonstrated within the framework of the value iteration algorithm. Next, the proposed continuous-state framework is compared against a discrete-state formulation for implementation on a life cycle assessment problem. 相似文献

2.

Generation of compensation behavior of autonomous robot for uncertainty of information with probabilistic flow control

Ryuichi Ueda 《Advanced Robotics》2013,27(11):721-734

相似文献

3.

Graphical models for interactive POMDPs: representations and solutions

Prashant Doshi Yifeng Zeng Qiongyu Chen 《Autonomous Agents and Multi-Agent Systems》2009,18(3):376-416

We develop new graphical representations for the problem of sequential decision making in partially observable multiagent environments, as formalized by interactive partially observable Markov decision processes (I-POMDPs). The graphical models called interactive influence diagrams (I-IDs) and their dynamic counterparts, interactive dynamic influence diagrams (I-DIDs), seek to explicitly model the structure that is often present in real-world problems by decomposing the situation into chance and decision variables, and the dependencies between the variables. I-DIDs generalize DIDs, which may be viewed as graphical representations of POMDPs, to multiagent settings in the same way that I-POMDPs generalize POMDPs. I-DIDs may be used to compute the policy of an agent given its belief as the agent acts and observes in a setting that is populated by other interacting agents. Using several examples, we show how I-IDs and I-DIDs may be applied and demonstrate their usefulness. We also show how the models may be solved using the standard algorithms that are applicable to DIDs. Solving I-DIDs exactly involves knowing the solutions of possible models of the other agents. The space of models grows exponentially with the number of time steps. We present a method of solving I-DIDs approximately by limiting the number of other agents’ candidate models at each time step to a constant. We do this by clustering models that are likely to be behaviorally equivalent and selecting a representative set from the clusters. We discuss the error bound of the approximation technique and demonstrate its empirical performance. 相似文献

4.

基于信念重用的WSNs能量高效跟踪

仵博吴敏郑红燕冯延蓬《传感器与微系统》2012,31(8):30-33

针对无线传感器网络(WSNs)中目标跟踪性能与传感器能量消耗难以平衡问题,提出一种信念重用的WSNs能量高效跟踪算法。使用部分可观察马尔可夫决策过程(POMDPs)对动态不确定环境下的WSNs进行建模,将跟踪性能与能量消耗平衡优化问题转化为POMDPs最优值函数求解过程;采用最大报酬值启发式查找方法获得跟踪性能的逼近最优值;采用信念重用方法避免重复获取信念,有效降低传感器通信带来的能量消耗。实验结果表明:信念重用算法能够有效优化跟踪性能与能量消耗之间的平衡,达到以较低的能量消耗获得较高跟踪性能的目的。相似文献

5.

A Modeling Approach to Maintenance Decisions Using Statistical Quality Control and Optimization

Julie Simmons Ivy Harriet Black Nembhard 《Quality and Reliability Engineering International》2005,21(4):355-366

Maintenance concerns impact systems in every industry and effective maintenance policies are important tools. We present a methodology for maintenance decision making for deteriorating systems under conditions of uncertainty that integrates statistical quality control (SQC) and partially observable Markov decision processes (POMDPs). We use simulation to develop realistic maintenance policies for real‐world environments. Specifically, we use SQC techniques to sample and represent real‐world systems. These techniques help define the observation distributions and structure for a POMDP. We propose a simulation methodology for integrating SQC and POMDPs in order to develop and valuate optimal maintenance policies as a function of process characteristics, system operating and maintenance costs. A two‐state machine replacement problem is used as an example of how the method can be applied. A simulation program developed using Visual Basic for Excel yields results on the optimal probability threshold and on the accuracy of the decisions as a function of the initial belief about the condition of the machine. This work lays a foundation for future research that will help bring maintenance decision models into practice. Copyright © 2005 John Wiley & Sons, Ltd. 相似文献

6.

一种基于独立任务的POMDP问题的解决方法_*

房俊恒朱斐刘全伏玉琛凌兴宏《计算机应用研究》2016,33(1)

部分可观测马尔可夫决策过程(POMDP)是马尔可夫决策过程(MDP)的扩展。通常利用POMDPs来模拟在部分可观测的随机环境中决策的Agents。针对完整POMDP的求解方法扩展能力弱的问题,提出把一个多元的POMDP分解成一组受限制的POMDPs,然后分别独立地求解每个这样的模型,获得一个值函数并将这些受限制的POMDPs的值函数结合起来以便获得一个完整POMDP的策略。该方法主要阐述了识别与独立任务相关的状态变量的过程,以及如何构造一个被限制在一个单独任务上的模型。将该方法应用到两个不同规模的岩石采样问题中,实验结果表明,该方法能够获得很好的策略。相似文献

7.

基于后验信念聚类的在线规划算法

仵博吴敏《计算机工程》2013,39(4)

在连续状态的部分可观察马尔可夫决策过程中,在线规划无法同时满足高实时性与低误差的要求.为此,提出一种基于后验信念聚类的在线规划算法.使用KL散度分析连续状态下后验信念之间的误差,根据误差分析结果对后验信念进行聚类,利用聚类后验信念计算报酬值,并采用分支界限裁剪方法裁剪后验信念与或树.实验结果表明,该算法能够有效降低求解问题的规模,消除重复计算,具有较好的实时性和较低的误差. 相似文献

8.

基于点的POMDPs在线值迭代算法

仵博吴敏佘锦华《软件学报》2013,24(1):25-36

部分可观察马尔可夫决策过程(partially observable Markov decision processes,简称POMDPs)是动态不确定环境下序贯决策的理想模型,但是现有离线算法陷入信念状态“维数灾”和“历史灾”问题,而现有在线算法无法同时满足低误差与高实时性的要求,造成理想的POMDPs模型无法在实际工程中得到应用.对此,提出一种基于点的POMDPs在线值迭代算法(point-based online value iteration,简称PBOVI).该算法在给定的可达信念状态点上进行更新操作,避免对整个信念状态空间单纯体进行求解,加速问题求解;采用分支界限裁剪方法对信念状态与或树进行在线裁剪;提出信念状态结点重用思想,重用上一时刻已求解出的信念状态点,避免重复计算.实验结果表明,该算法具有较低误差率、较快收敛性,满足系统实时性的要求. 相似文献

9.

部分可观测Markov环境下的激励学习综述

谢丽娟陈焕文《电工标准与质量》2002,17(2):23-27

对智能体在不确定环境下的学习与规划问题的激励学习技术进行了综述。首先介绍了用于描述隐状态问题的部分可观测Markov决策理论（POMDPs），在简单回顾其它POMDP求解技术后，重点讨论环境模型事先未知的激励学习技术，包括两类：一类为基于状态的值函数学习；一类为策略空间的直接搜索。最后分析了这些方法尚存在的问题，并指出了未来可能的研究方向。相似文献

10.

Planning to see: A hierarchical approach to planning visual actions on a robot using POMDPs

Mohan Sridharan Jeremy Wyatt Richard Dearden 《Artificial Intelligence》2010,174(11):704-725

Flexible, general-purpose robots need to autonomously tailor their sensing and information processing to the task at hand. We pose this challenge as the task of planning under uncertainty. In our domain, the goal is to plan a sequence of visual operators to apply on regions of interest (ROIs) in images of a scene, so that a human and a robot can jointly manipulate and converse about objects on a tabletop. We pose visual processing management as an instance of probabilistic sequential decision making, and specifically as a Partially Observable Markov Decision Process (POMDP). The POMDP formulation uses models that quantitatively capture the unreliability of the operators and enable a robot to reason precisely about the trade-offs between plan reliability and plan execution time. Since planning in practical-sized POMDPs is intractable, we partially ameliorate this intractability for visual processing by defining a novel hierarchical POMDP based on the cognitive requirements of the corresponding planning task. We compare our hierarchical POMDP planning system (HiPPo) with a non-hierarchical POMDP formulation and the Continual Planning (CP) framework that handles uncertainty in a qualitative manner. We show empirically that HiPPo and CP outperform the naive application of all visual operators on all ROIs. The key result is that the POMDP methods produce more robust plans than CP or the naive visual processing. In summary, visual processing problems represent a challenging domain for planning techniques and our hierarchical POMDP-based approach for visual processing management opens up a promising new line of research. 相似文献