Similar Documents
20 similar documents retrieved (search time: 15 ms)
1.
Verification of reachability properties for probabilistic systems is usually based on variants of Markov processes. Current methods assume an exact model of the dynamic behavior and are not suitable for realistic systems that operate in the presence of uncertainty and variability. This research note extends existing methods for Bounded-parameter Markov Decision Processes (BMDPs) to solve the reachability problem. BMDPs are a generalization of MDPs that allows uncertainty to be modeled. Our results show that interval value iteration converges under the undiscounted reward criterion needed to formulate the problems of maximizing the probability of reaching a set of desirable states or minimizing the probability of reaching an unsafe set. An analysis of the computational complexity is also presented.
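The pessimistic side of interval value iteration can be sketched as follows. This is a toy illustration under invented assumptions (the three-state BMDP, its probability intervals, and the Gauss-Seidel sweep are all made up here), not the paper's implementation: the adversary resolves each interval by pushing probability mass toward low-value successors, yielding a lower bound on the probability of reaching the goal.

```python
def worst_case_dist(intervals, values):
    """Choose, inside the given [lo, hi] intervals, the successor
    distribution that minimizes the expected value (pessimistic case)."""
    lo = [l for l, _ in intervals]
    hi = [h for _, h in intervals]
    p = list(lo)
    slack = 1.0 - sum(lo)
    # Hand the remaining mass to low-value successors first.
    for j in sorted(range(len(values)), key=lambda j: values[j]):
        bump = min(hi[j] - lo[j], slack)
        p[j] += bump
        slack -= bump
    return p

# Tiny BMDP: goal state 0 is absorbing; states 1 and 2 each have one
# action with interval transition probabilities over successors [0, 1, 2].
intervals = {
    1: [(0.3, 0.6), (0.2, 0.5), (0.1, 0.3)],
    2: [(0.1, 0.4), (0.3, 0.6), (0.2, 0.4)],
}

V = {0: 1.0, 1: 0.0, 2: 0.0}   # lower bound on P(reach goal)
for _ in range(100):           # undiscounted interval value iteration
    for s in (1, 2):
        vals = [V[0], V[1], V[2]]
        p = worst_case_dist(intervals[s], vals)
        V[s] = sum(pi * vi for pi, vi in zip(p, vals))
```

The optimistic bound is obtained symmetrically by handing the slack mass to high-value successors first; the pair of bounds brackets the true reachability probability of every MDP inside the interval model.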

2.
3.
In a spoken dialog system, determining which action a machine should take in a given situation is a difficult problem because automatic speech recognition is unreliable and hence the state of the conversation can never be known with certainty. Much of the research in spoken dialog systems centres on mitigating this uncertainty and recent work has focussed on three largely disparate techniques: parallel dialog state hypotheses, local use of confidence scores, and automated planning. While in isolation each of these approaches can improve action selection, taken together they currently lack a unified statistical framework that admits global optimization. In this paper we cast a spoken dialog system as a partially observable Markov decision process (POMDP). We show how this formulation unifies and extends existing techniques to form a single principled framework. A number of illustrations are used to show qualitatively the potential benefits of POMDPs compared to existing techniques, and empirical results from dialog simulations are presented which demonstrate significant quantitative gains. Finally, some of the key challenges to advancing this method – in particular scalability – are briefly outlined.
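The core of the POMDP view of dialog is a belief state maintained by Bayes' rule over user goals. A minimal sketch, with an invented goal set and an invented ASR confusion model (none of this is from the paper's system):

```python
# Toy belief tracking over user goals; goals and confusion probabilities
# are illustrative assumptions only.
goals = ["weather", "traffic", "news"]

# P(recognized word | true goal): a toy ASR confusion model.
confusion = {
    "weather": {"weather": 0.7, "traffic": 0.2, "news": 0.1},
    "traffic": {"weather": 0.2, "traffic": 0.7, "news": 0.1},
    "news":    {"weather": 0.1, "traffic": 0.1, "news": 0.8},
}

def belief_update(belief, observed):
    """Bayes rule: b'(g) is proportional to P(observed | g) * b(g)."""
    new = {g: confusion[g][observed] * belief[g] for g in goals}
    z = sum(new.values())
    return {g: p / z for g, p in new.items()}

belief = {g: 1.0 / len(goals) for g in goals}  # uniform prior
belief = belief_update(belief, "weather")      # first noisy recognition
belief = belief_update(belief, "weather")      # a consistent second turn
```

Two consistent but individually unreliable recognitions concentrate the belief on "weather"; acting on the whole belief rather than the single top hypothesis is what distinguishes the POMDP treatment from local confidence-score heuristics.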

4.
Adaptive sensing involves actively managing sensor resources to achieve a sensing task, such as object detection, classification, and tracking, and represents a promising direction for new applications of discrete event system methods. We describe an approach to adaptive sensing based on approximately solving a partially observable Markov decision process (POMDP) formulation of the problem. Such approximations are necessary because of the very large state space involved in practical adaptive sensing problems, precluding exact computation of optimal solutions. We review the theory of POMDPs and show how the theory applies to adaptive sensing problems. We then describe a variety of approximation methods, with examples to illustrate their application in adaptive sensing. The examples also demonstrate the gains that are possible from nonmyopic methods relative to myopic methods, and highlight some insights into the dependence of such gains on the sensing resources and environment.
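The myopic/nonmyopic gap the abstract mentions can be illustrated on a toy planning problem. This is a generic invented example (the 5-state chain, rewards, and horizon are all made up), not the paper's adaptive sensing setup: immediate reward points the greedy policy away from a larger delayed payoff that a short lookahead finds.

```python
# Toy illustration of nonmyopic gains: a 5-state chain where the
# immediate reward is misleading. Rewards and horizon are invented.
rewards = {0: 0, 1: 1, 2: 0, 3: 0, 4: 10}

def step(s, a):
    """Move left (-1) or right (+1), clipped to the chain ends."""
    return max(0, min(4, s + a))

def myopic(s):
    """Pick the move with the best immediate reward."""
    return max((-1, +1), key=lambda a: rewards[step(s, a)])

def lookahead(s, depth):
    """Best total reward achievable in `depth` moves (exhaustive search)."""
    if depth == 0:
        return 0
    return max(rewards[step(s, a)] + lookahead(step(s, a), depth - 1)
               for a in (-1, +1))

# From state 2, the myopic policy chases the small reward near state 1,
# while a 3-step lookahead reaches the payoff at state 4.
s, myopic_total = 2, 0
for _ in range(3):
    s = step(s, myopic(s))
    myopic_total += rewards[s]

nonmyopic_total = lookahead(2, 3)
```

In adaptive sensing the same effect appears when spending a sensing action on an uninformative but cheap measurement now forecloses a more informative measurement later; the paper's approximation methods exist precisely because exhaustive lookahead is infeasible at realistic state-space sizes.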

Edwin K. P. Chong   received the BE(Hons) degree with First Class Honors from the University of Adelaide, South Australia, in 1987; and the MA and PhD degrees in 1989 and 1991, respectively, both from Princeton University, where he held an IBM Fellowship. He joined the School of Electrical and Computer Engineering at Purdue University in 1991, where he was named a University Faculty Scholar in 1999, and was promoted to Professor in 2001. Since August 2001, he has been a Professor of Electrical and Computer Engineering and a Professor of Mathematics at Colorado State University. His research interests span the areas of communication and sensor networks, stochastic modeling and control, and optimization methods. He coauthored the recent best-selling book, An Introduction to Optimization, 3rd Edition, Wiley-Interscience, 2008. He is currently on the editorial board of the IEEE Transactions on Automatic Control, Computer Networks, Journal of Control Science and Engineering, and IEEE Expert Now. He is a Fellow of the IEEE, and served as an IEEE Control Systems Society Distinguished Lecturer. He received the NSF CAREER Award in 1995 and the ASEE Frederick Emmons Terman Award in 1998. He was a co-recipient of the 2004 Best Paper Award for a paper in the journal Computer Networks. He has served as Principal Investigator for numerous funded projects from NSF, DARPA, and other funding agencies.
Christopher M. Kreucher   received the BS, MS, and PhD degrees in Electrical Engineering from the University of Michigan in 1997, 1998, and 2005, respectively. He is currently a Senior Systems Engineer at Integrity Applications Incorporated in Ann Arbor, Michigan. His current research interests include nonlinear filtering (specifically particle filtering), Bayesian methods of fusion and multitarget tracking, self localization, information theoretic sensor management, and distributed swarm management.
Alfred O. Hero III   received the BS (summa cum laude) from Boston University (1980) and the PhD from Princeton University (1984), both in Electrical Engineering. Since 1984 he has been with the University of Michigan, Ann Arbor, where he is a Professor in the Department of Electrical Engineering and Computer Science and, by courtesy, in the Department of Biomedical Engineering and the Department of Statistics. He has held visiting positions at Massachusetts Institute of Technology (2006), Boston University, I3S University of Nice, Sophia-Antipolis, France (2001), Ecole Normale Superieure de Lyon (1999), Ecole Nationale Superieure des Telecommunications, Paris (1999), Scientific Research Labs of the Ford Motor Company, Dearborn, Michigan (1993), Ecole Nationale Superieure des Techniques Avancees (ENSTA), Ecole Superieure d'Electricite, Paris (1990), and M.I.T. Lincoln Laboratory (1987–1989). His recent research interests have been in areas including inference for sensor networks, adaptive sensing, bioinformatics, inverse problems, and statistical signal and image processing. He is a Fellow of the Institute of Electrical and Electronics Engineers (IEEE), a member of Tau Beta Pi, the American Statistical Association (ASA), the Society for Industrial and Applied Mathematics (SIAM), and the US National Commission (Commission C) of the International Union of Radio Science (URSI). He has received an IEEE Signal Processing Society Meritorious Service Award (1998), an IEEE Signal Processing Society Best Paper Award (1998), an IEEE Third Millennium Medal, and a 2002 IEEE Signal Processing Society Distinguished Lecturership. He was President of the IEEE Signal Processing Society (2006–2007) and during his term served on the TAB Periodicals Committee (2006). He was a member of the IEEE TAB Society Review Committee (2008) and is Director-elect of IEEE for Division IX (2009).

5.
Discrete-event systems modeled as continuous-time Markov processes and characterized by some integer-valued parameter are considered. The problem addressed is that of estimating performance sensitivities with respect to this parameter by directly observing a single sample path of the system. The approach is based on transforming the nominal Markov chain into a reduced augmented chain, the stationary-state probabilities of which can be easily combined to obtain stationary-state probability sensitivities with respect to the given parameter. Under certain conditions, the reduced augmented chain state transitions are observable with respect to the state transitions of the system itself, and no knowledge of the nominal Markov chain's state transition rates is required. Applications to some queueing systems are included. The approach incorporates estimation of unknown transition rates when needed and is extended to real-valued parameters.

6.
汤俏 (Tang Qiao), 赵凯 (Zhao Kai). 《计算机科学》 (Computer Science), 2004, 31(Z2): 162-165
1 Introduction. In the field of artificial intelligence, reinforcement learning has attracted wide attention for its self-learning and adaptive qualities; it is increasingly applied in robot control systems, combinatorial optimization, and many other areas, and is one of the current focal points of research [1]. For Markov Decision Processes (MDPs), i.e., the setting in which the policy-selecting agent can obtain accurate and complete information about the environment, existing reinforcement learning methods already offer several mature algorithms, such as Q-learning [2,3].
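Tabular Q-learning, the algorithm the introduction cites, fits in a few lines. The sketch below uses an invented 4-state chain with a goal reward and invented learning constants; it is a generic textbook form, not the paper's method:

```python
import random

# Minimal tabular Q-learning on a toy 4-state chain (goal at state 3).
# States, rewards, and constants are illustrative assumptions.
random.seed(0)
N, ACTIONS = 4, (+1, -1)
alpha, gamma, eps = 0.5, 0.9, 0.2
Q = {(s, a): 0.0 for s in range(N) for a in ACTIONS}

def step(s, a):
    s2 = max(0, min(N - 1, s + a))
    return s2, (1.0 if s2 == N - 1 else 0.0)   # reward 1 on reaching goal

for _ in range(200):                 # episodes
    s = 0
    while s != N - 1:
        # epsilon-greedy action selection
        if random.random() < eps:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda b: Q[(s, b)])
        s2, r = step(s, a)
        # Q-learning update toward r + gamma * max_a' Q(s', a')
        Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, b)] for b in ACTIONS)
                              - Q[(s, a)])
        s = s2
```

After training, the greedy policy moves right in every state, illustrating how the agent learns from reward alone, with no model of the transition dynamics.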

7.
We propose a novel approach, called parallel rollout, to solving (partially observable) Markov decision processes. Our approach generalizes the rollout algorithm of Bertsekas and Castanon (1999) by rolling out a set of multiple heuristic policies rather than a single policy. In particular, the parallel rollout approach aims at the class of problems where we have multiple heuristic policies available such that each policy performs near-optimally for a different set of system paths. Parallel rollout automatically combines the given multiple policies to create a new policy that adapts to the different system paths and improves the performance of each policy in the set. We formally prove this claim for two criteria: total expected reward and infinite horizon discounted reward. The parallel rollout approach also resolves the key issue of selecting which policy to roll out among multiple heuristic policies whose performances cannot be predicted in advance. We present two example problems to illustrate the effectiveness of the parallel rollout approach: a buffer management problem and a multiclass scheduling problem.
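The mechanism can be sketched on a deterministic toy problem (everything below is invented: a 7-state chain whose rewards are paid on every visit, two crude base policies, always-left and always-right, and a 4-step horizon). At each step the controller scores each action by its immediate reward plus the best continuation among the base policies, which is the parallel-rollout idea in miniature:

```python
# Toy parallel rollout: act greedily on the best base-policy continuation.
R = [0, 5, 0, 0, 0, 6, 0]            # reward for entering each state
ACTIONS = (+1, -1)

def step(s, a):
    return max(0, min(len(R) - 1, s + a))

def run(policy, s, h):
    """Total reward of following a fixed policy for h steps from s."""
    total = 0
    for _ in range(h):
        s = step(s, policy(s))
        total += R[s]
    return total

left = lambda s: -1
right = lambda s: +1
base_policies = (left, right)

def parallel_rollout_return(s, h):
    total = 0
    for k in range(h, 0, -1):
        # Q(s,a): immediate reward plus the BEST base-policy continuation.
        q = {a: R[step(s, a)] +
                max(run(p, step(s, a), k - 1) for p in base_policies)
             for a in ACTIONS}
        a = max(ACTIONS, key=lambda a: q[a])
        s = step(s, a)
        total += R[s]
    return total

returns = {
    "left": run(left, 3, 4),
    "right": run(right, 3, 4),
    "rollout": parallel_rollout_return(3, 4),
}
```

On this chain the rollout policy strictly beats both base policies by switching direction mid-trajectory, which is consistent with the improvement property the paper proves (the combined policy is no worse than any policy in the set).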

8.
This paper derives two canonical state space forms (i.e., the observer canonical form and the observability canonical form) from multiple-input multiple-output systems described by difference equations. The state space model is expressed by the first-order difference equation and is equivalent to the input–output representation. More specifically, by setting the different state variables, the difference equations or the input–output representations can be transformed into two observable canonical forms, and the canonical state space model can also be transformed into the difference equations. Finally, two examples are given.
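The equivalence can be checked numerically in the SISO second-order case (a sketch with invented coefficients, not one of the paper's examples): the observer canonical form for y(t) + a1·y(t-1) + a2·y(t-2) = b1·u(t-1) + b2·u(t-2) reproduces the difference equation's response exactly.

```python
# Observer canonical form vs. direct difference-equation simulation.
# Coefficients and the impulse input are illustrative assumptions.
a1, a2 = -1.5, 0.7
b1, b2 = 1.0, 0.5

# x(t+1) = A x(t) + B u(t),  y(t) = C x(t)
A = [[-a1, 1.0],
     [-a2, 0.0]]
B = [b1, b2]
C = [1.0, 0.0]

u = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   # impulse input

# Simulate the state space model.
x, y_ss = [0.0, 0.0], []
for t in range(len(u)):
    y_ss.append(C[0] * x[0] + C[1] * x[1])
    x = [A[0][0] * x[0] + A[0][1] * x[1] + B[0] * u[t],
         A[1][0] * x[0] + A[1][1] * x[1] + B[1] * u[t]]

# Simulate the difference equation directly (zero initial conditions).
y_de = []
for t in range(len(u)):
    ym1 = y_de[t - 1] if t >= 1 else 0.0
    ym2 = y_de[t - 2] if t >= 2 else 0.0
    um1 = u[t - 1] if t >= 1 else 0.0
    um2 = u[t - 2] if t >= 2 else 0.0
    y_de.append(-a1 * ym1 - a2 * ym2 + b1 * um1 + b2 * um2)
```

Both simulations produce the same output sequence, which is the paper's point: the canonical state variables are just a bookkeeping of the delayed inputs and outputs, so the transformation is invertible.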

9.
洪晔 (Hong Ye), 边信黔 (Bian Xinqian). 《计算机仿真》 (Computer Simulation), 2007, 24(6): 146-149
When an autonomous underwater vehicle (AUV) navigates in a complex ocean environment, it must find a near-optimal path from a given start point to a goal point that clears all obstacles safely and without collision. A new global path planning method is proposed based on partially observable Markov decision processes (POMDPs) combined with prediction of obstacle motion. The mathematical model of the POMDP is given; a tree-structured hierarchical POMDP model is established and applied to path planning; and two methods, short-term and long-term prediction, are proposed for predicting the trajectories of underwater obstacles. Finally, simulation experiments verify the AUV's global path planning capability, laying a solid foundation for future trials on a real vehicle.

10.
Stability analysis of linear systems with time-varying delay is investigated. In order to highlight the relations between the variation of the delay and the states, redundant equations are introduced to construct a new modelling of the delay system. New types of Lyapunov–Krasovskii functionals are then proposed, which reduce the conservatism of the stability criterion. Delay-dependent stability conditions are then formulated in terms of linear matrix inequalities. Finally, several examples show the effectiveness of the proposed methodology.

11.
Matrix exponential distributions and rational arrival processes have been proposed as an extension to pure Markov models. The paper presents an approach where these process types are used to describe the timing behavior in quantitative models like queueing networks, stochastic Petri nets or stochastic automata networks. The resulting stochastic process, which is called a rational process, is defined and it is shown that the matrix governing the behavior of the process has a structured representation which allows one to represent the matrix in a very compact form.
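A matrix-exponential distribution, the building block being generalized here, has CDF F(t) = 1 - α·exp(Tt)·1 for a row vector α and matrix T. A self-contained sketch (pure Python, truncated Taylor series for the matrix exponential, adequate only for small matrices and moderate t) evaluates this for an Erlang-2 example whose closed form F(t) = 1 - e^(-t)(1 + t) lets the code be checked:

```python
import math

def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(M, terms=40):
    """Matrix exponential via truncated Taylor series (toy-sized M only)."""
    n = len(M)
    result = [[float(i == j) for j in range(n)] for i in range(n)]  # I
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = mat_mul(term, M)                       # next power of M
        term = [[term[i][j] / k for j in range(n)] for i in range(n)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

def me_cdf(alpha, T, t):
    """F(t) = 1 - alpha * expm(T t) * ones."""
    n = len(T)
    E = expm([[T[i][j] * t for j in range(n)] for i in range(n)])
    return 1.0 - sum(alpha[i] * E[i][j] for i in range(n) for j in range(n))

alpha = [1.0, 0.0]
T = [[-1.0, 1.0], [0.0, -1.0]]   # Erlang-2: two exponential phases, rate 1
```

For this (α, T) the code reproduces 1 - e^(-t)(1 + t); matrix-exponential distributions drop the requirement that T be a sub-generator, which is what gives the rational processes of the paper their extra expressiveness.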

12.
13.
14.
The optimal control problem for probabilistic discrete event systems is studied using Markov decision processes. First, by introducing definitions of a cost function, an objective function, and an optimal function, an optimality equation that determines the optimal supervisor is established. This optimality equation then yields the supremal controllable, ε-containing closed sublanguage of a given language. Finally, algorithms for computing the optimal cost and the optimal supervisor are given.

15.
The purpose of this paper is to analyze the sensitivity problems in metal forming of rigid-visco-poroplastic materials. A repressing powder forging process is analyzed. Parameter sensitivity, material derivative, and control volume approaches to shape sensitivity analysis are presented, analyzed, and compared. Discretization of the continuum expressions is presented. The numerical solutions for parameter sensitivity in forging problems have been described. Numerical examples concerning simple compression test of cylindrical porous specimen are presented.

16.
Opacity is a generic security property that has been defined on (non-probabilistic) transition systems and later on Markov chains with labels. For a secret predicate, given as a subset of runs, and a function describing the view of an external observer, the value of interest for opacity is a measure of the set of runs disclosing the secret. We extend this definition to the richer framework of Markov decision processes, where non-deterministic choice is combined with probabilistic transitions, and we study related decidability problems with partial or complete observation hypotheses for the schedulers. We prove that all questions are decidable with complete observation and ω-regular secrets. With partial observation, we prove that all quantitative questions are undecidable but the question whether a system is almost surely non-opaque becomes decidable for a restricted class of ω-regular secrets, as well as for all ω-regular secrets under finite-memory schedulers.
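The disclosure measure is easy to state in the finite-horizon, scheduler-free special case: a run discloses the secret exactly when every run producing the same observation is a secret run. A toy sketch on an invented three-state labelled Markov chain (the chain, labels, secret, and horizon are all illustrative assumptions, not the paper's MDP setting):

```python
# Finite-horizon disclosure measure on a labelled Markov chain.
P = {"a": {"b": 0.5, "c": 0.5}, "b": {"b": 1.0}, "c": {"c": 1.0}}

def runs(state, horizon):
    """All runs of the given length, with their probabilities."""
    if horizon == 0:
        yield (state,), 1.0
        return
    for nxt, p in P[state].items():
        for tail, q in runs(nxt, horizon - 1):
            yield (state,) + tail, p * q

def disclosure(labels, secret, horizon=3):
    by_obs = {}
    for run, p in runs("a", horizon):
        obs = tuple(labels[s] for s in run)   # what the observer sees
        by_obs.setdefault(obs, []).append((run, p))
    # A run discloses the secret iff every run with the same observation
    # is a secret run.
    return sum(p for group in by_obs.values()
               if all(secret(r) for r, _ in group)
               for _, p in group)

secret = lambda run: "b" in run          # the secret: visiting state b

opaque_labels = {"a": "x", "b": "y", "c": "y"}   # b, c indistinguishable
leaky_labels = {"a": "x", "b": "y", "c": "z"}    # b, c distinguishable
d_opaque = disclosure(opaque_labels, secret)
d_leaky = disclosure(leaky_labels, secret)
```

When b and c carry the same label the observer can never separate secret runs from non-secret ones, so the disclosed mass is 0; with distinct labels the secret runs (probability 0.5) are fully disclosed. The paper's contribution is the much harder setting where a scheduler also resolves nondeterminism.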

17.
18.
Distributed and concurrent object-oriented systems are difficult to analyze due to the complexity of their concurrency, communication, and synchronization mechanisms. Rather than performing analysis at the level of code in, e.g., Java or C++, we consider the analysis of such systems at the level of an abstract, executable modeling language. This language, based on concurrent objects communicating by asynchronous method calls, avoids some difficulties of mainstream object-oriented programming languages related to compositionality and aliasing. To facilitate system analysis, compositional verification systems are needed, which allow components to be analyzed independently of their environment. In this paper, a proof system for partial correctness reasoning is established based on communication histories and class invariants. A particular feature of our approach is that the alphabets of different objects are completely disjoint. Compared to related work, this allows the formulation of a much simpler Hoare-style proof system and reduces reasoning complexity by significantly simplifying formulas in terms of the number of needed quantifiers. The soundness and relative completeness of this proof system are shown using a transformational approach from a sequential language with a non-deterministic assignment operator.

19.
20.
Weighted Markov decision processes (MDPs) have long been used to model quantitative aspects of systems in the presence of uncertainty. However, much of the literature on such MDPs takes a monolithic approach, by modelling a system as a particular MDP; properties of the system are then inferred by analysis of that particular MDP. In contrast in this paper we develop compositional methods for reasoning about weighted MDPs, as a possible basis for compositional reasoning about their quantitative behaviour. In particular we approach these systems from a process algebraic point of view. For these we define a coinductive simulation-based behavioural preorder which is compositional in the sense that it is preserved by structural operators for constructing weighted MDPs from components.
