Similar literature
1.
In this work, probabilistic reachability over a finite horizon is investigated for a class of discrete time stochastic hybrid systems with control inputs. A suitable embedding of the reachability problem in a stochastic control framework reveals that it is amenable to two complementary interpretations, leading to dual algorithms for reachability computations. In particular, the set of initial conditions providing a certain probabilistic guarantee that the system will keep evolving within a desired ‘safe’ region of the state space is characterized in terms of a value function, and ‘maximally safe’ Markov policies are determined via dynamic programming. These results are of interest not only for safety analysis and design, but also for solving those regulation and stabilization problems that can be reinterpreted as safety problems. The temperature regulation problem presented in the paper as a case study is one such case.
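The value-function characterization of the safe set can be illustrated on a much simpler model than the one in the abstract. The following is a minimal sketch, assuming a finite-state controlled Markov chain instead of a stochastic hybrid system; the function name `max_safety_probability`, the array layout of `P`, and the three-state example are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def max_safety_probability(P, safe, horizon):
    """Backward dynamic programming for finite-horizon probabilistic safety.

    P       : array of shape (num_actions, n, n); P[a, x, y] is the probability
              of moving from state x to state y under action a (assumed layout).
    safe    : boolean mask of length n marking the safe states.
    horizon : number of transitions N.

    Returns (V, policy): V[x] is the maximal probability of staying in the
    safe set at times 0..N from state x; policy[k][x] is a maximizing action
    at time k (a Markov policy).
    """
    P = np.asarray(P, dtype=float)
    indicator = np.asarray(safe, dtype=float)
    V = indicator.copy()                      # terminal condition V_N = 1_safe
    policy = []
    for _ in range(horizon):
        Q = P @ V                             # Q[a, x] = sum_y P[a, x, y] V[y]
        policy.append(Q.argmax(axis=0))       # best action per state
        V = indicator * Q.max(axis=0)         # unsafe states keep value 0
    policy.reverse()                          # policy[0] is the time-0 decision
    return V, policy

# Hypothetical example: states 0 and 1 are safe, state 2 is unsafe and absorbing.
P_example = [
    [[0.9, 0.0, 0.1], [0.0, 0.5, 0.5], [0.0, 0.0, 1.0]],  # action 0
    [[0.5, 0.5, 0.0], [0.8, 0.0, 0.2], [0.0, 0.0, 1.0]],  # action 1
]
V_example, policy_example = max_safety_probability(P_example, [True, True, False], 2)
print(V_example)
```

The set of initial states with a safety guarantee of at least 1 - ε is then simply `V_example >= 1 - eps` for a chosen `eps`.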

2.
This paper presents a methodology for safety verification of continuous and hybrid systems in the worst-case and stochastic settings. In the worst-case setting, a function of state termed barrier certificate is used to certify that all trajectories of the system starting from a given initial set do not enter an unsafe region. No explicit computation of reachable sets is required in the construction of barrier certificates, which makes it possible to handle nonlinearity, uncertainty, and constraints directly within this framework. In the stochastic setting, our method computes an upper bound on the probability that a trajectory of the system reaches the unsafe set, a bound whose validity is proven by the existence of a barrier certificate. For polynomial systems, barrier certificates can be constructed using convex optimization, and hence the method is computationally tractable. Some examples are provided to illustrate the use of the method.
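For orientation, one commonly stated form of the worst-case barrier certificate conditions is sketched below. The notation is an assumption made for this illustration (system $\dot{x} = f(x, d)$ with state set $X$, disturbance set $D$, initial set $X_0$, unsafe set $X_u$); the abstract itself does not give the conditions, and the paper may state them differently.

```latex
% Sufficient conditions for safety via a differentiable function B : X -> R
% (notation assumed for illustration)
\begin{align*}
  B(x) &\le 0 && \forall x \in X_0, \\
  B(x) &> 0   && \forall x \in X_u, \\
  \frac{\partial B}{\partial x}(x)\, f(x, d) &\le 0 && \forall (x, d) \in X \times D.
\end{align*}
```

Under these conditions $B$ cannot increase along trajectories, so a trajectory starting where $B \le 0$ can never reach a point where $B > 0$, and no reachable set has to be computed.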

3.
Automatica, 2014, 50(11): 2822-2834
We study the quadratic control of a class of stochastic hybrid systems with linear continuous dynamics for which the lengths of time that the system stays in each mode are independent random variables with given probability distribution functions. We derive a condition for finding the optimal feedback policy that minimizes a discounted infinite horizon cost. We show that the optimal cost is the solution to a set of differential equations with unknown boundary conditions. Furthermore, we provide a recursive algorithm for computing the optimal cost and the optimal feedback policy. The applicability of our result is illustrated through a numerical example, motivated by stochastic gene regulation in biology.

4.
This paper deals with the infinite horizon linear quadratic (LQ) differential games for discrete-time stochastic systems with both state and control dependent noise. The Popov-Belevitch-Hautus (PBH) criteria for exact observability and exact detectability of discrete-time stochastic systems are presented. By means of them, we give the optimal strategies (Nash equilibrium strategies) and the optimal cost values for infinite horizon stochastic differential games. The results indicate that the infinite horizon LQ stochastic differential games are associated with four coupled matrix-valued equations. Furthermore, an iterative algorithm is proposed to solve the four coupled equations. Finally, an example is given to demonstrate our results.

5.
Discrete-time stochastic systems employing possibly discontinuous state-feedback control laws are addressed. Allowing discontinuous feedbacks is fundamental for stochastic systems regulated, for instance, by optimization-based control laws. We introduce generalized random solutions for discontinuous stochastic systems to guarantee the existence of solutions and to generate enough solutions to get an accurate picture of robustness with respect to strictly causal perturbations. Under basic regularity conditions, the existence of a continuous stochastic Lyapunov function is sufficient to establish that asymptotic stability in probability for the closed-loop system is robust to sufficiently small, state-dependent, strictly causal, worst-case perturbations. Robustness of a weaker stochastic stability property called recurrence is also shown in a global sense in the case of state-dependent perturbations, and in a semiglobal practical sense in the case of persistent perturbations. An example shows that a continuous stochastic Lyapunov function is not sufficient for robustness to arbitrarily small worst-case disturbances that are not strictly causal. Our positive results are also illustrated by examples.

6.
In this paper, we consider a continuous-time stochastic hybrid control system with a finite time horizon. The objective is to minimize a linear function of the expected state trajectory. The state evolves according to linear dynamics. However, the parameters of the state evolution equation may change at discrete times according to a controlled Markov chain which has finite state and action spaces. We use a procedure similar in form to the maximum principle; this determines a control strategy which is asymptotically optimal as the number of transitions during the finite time horizon grows to infinity.

7.
In this paper, we consider minimax games for stochastic uncertain systems with the pay-off being a nonlinear functional of the uncertain measure, where the uncertainty is measured in terms of relative entropy between the uncertain and the nominal measure. The maximizing player is the uncertain measure, while the minimizer is the control which induces a nominal measure. Existence and uniqueness of minimax solutions are derived on suitable spaces of measures. Several examples are presented illustrating the results. Subsequently, the results are also applied to controlled stochastic differential equations on Hilbert spaces. Based on an infinite-dimensional extension of Girsanov’s measure transformation, martingale solutions are used in establishing existence and uniqueness of minimax strategies. Moreover, some basic properties of the relative entropy of measures on infinite-dimensional spaces are presented and then applied to uncertain systems described by a stochastic differential inclusion on Hilbert space. An explicit expression for the worst case measure representing the maximizing player (adversary) is found.

8.
We present automatic verification techniques for the modelling and analysis of probabilistic systems that incorporate competitive behaviour. These systems are modelled as turn-based stochastic multi-player games, in which the players can either collaborate or compete in order to achieve a particular goal. We define a temporal logic called rPATL for expressing quantitative properties of stochastic multi-player games. This logic allows us to reason about the collective ability of a set of players to achieve a goal relating to the probability of an event’s occurrence or the expected amount of cost/reward accumulated. We give an algorithm for verifying properties expressed in this logic and implement the techniques in a probabilistic model checker, as an extension of the PRISM tool. We demonstrate the applicability and efficiency of our methods by deploying them to analyse and detect potential weaknesses in a variety of large case studies, including algorithms for energy management in Microgrids and collective decision making for autonomous systems.

9.
In this paper, we study algorithmic problems for quantitative models that are motivated by applications in modeling embedded systems. We consider two-player games played on a weighted graph with mean-payoff objective and with energy constraints. We present a new pseudopolynomial algorithm for solving such games, improving the best known worst-case complexity for pseudopolynomial mean-payoff algorithms. Our algorithm can also be combined with the procedure by Andersson and Vorobyov to obtain a randomized algorithm with currently the best expected time complexity. The proposed solution relies on a simple fixpoint iteration to solve the log-space equivalent problem of deciding the winner of energy games. Our results also imply that energy games and mean-payoff games can be reduced to safety games in pseudopolynomial time.
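The idea of deciding energy games by fixpoint iteration can be sketched as follows. This is a naive least-fixpoint computation written for illustration, not the optimized algorithm of the paper; it assumes integer edge weights, at least one outgoing edge per vertex, and a hypothetical encoding in which `owner[u] == 0` marks vertices of the player who must keep the accumulated energy non-negative.

```python
import math

def min_initial_credit(n, edges, owner):
    """Least initial energy per vertex that lets player 0 keep the energy
    level non-negative forever (math.inf if no finite credit suffices).

    n     : number of vertices, labelled 0..n-1
    edges : list of (u, v, w) with integer weight w
    owner : owner[u] is 0 (energy player) or 1 (opponent)
    """
    succ = [[] for _ in range(n)]
    for u, v, w in edges:
        succ[u].append((v, w))
    # Finite minimal credits never exceed n * W, where W is the largest
    # energy loss on a single edge; larger values are treated as infinite.
    W = max([0] + [-w for _, _, w in edges])
    top = n * W
    f = [0] * n                        # start from the bottom element
    changed = True
    while changed:
        changed = False
        for u in range(n):
            # Credit needed at u when leaving through edge (u, v, w).
            needs = [max(0, f[v] - w) for v, w in succ[u]]
            val = min(needs) if owner[u] == 0 else max(needs)
            if val > top:
                val = math.inf
            if val > f[u]:             # values only ever increase
                f[u] = val
                changed = True
    return f

# Hypothetical example: the cycle 0 -> 1 -> 0 gains energy, the self-loop
# on vertex 2 loses energy forever.
example_edges = [(0, 1, -2), (1, 0, 3), (0, 2, -1), (2, 2, -1)]
print(min_initial_credit(3, example_edges, [0, 1, 0]))
```

Player 0 wins the energy game from exactly those vertices whose computed credit is finite.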

10.
We present a functional framework for automated Bayesian and worst-case mechanism design, based on a two-stage game model of strategic interaction between the designer and the mechanism participants. At the core of our framework is a black-box optimization algorithm which guides the process of evaluating candidate mechanisms. We apply the approach to several classes of two-player infinite games of incomplete information, producing optimal or nearly optimal mechanisms using various objective functions. By comparing our results with known optimal mechanisms, and in some cases improving on the best known mechanisms, we provide evidence that ours is a promising approach to parametrized mechanism design for infinite Bayesian games.

11.
This paper is devoted to the estimation of stochastic context-free grammars (SCFGs) and their use as language models. Classical estimation algorithms, together with new ones that consider a certain subset of derivations in the estimation process, are presented in a unified framework. This set of derivations is chosen according to both structural and statistical criteria. The estimated SCFGs have been used in a new hybrid language model to combine both a word-based n-gram, which is used to capture the local relations between words, and a category-based SCFG together with a word distribution into categories, which is defined to represent the long-term relations between these categories. We describe methods for learning these stochastic models for complex tasks, and we present an algorithm for computing the word transition probability using this hybrid language model. Finally, experiments on the UPenn Treebank corpus show significant improvements in the test set perplexity with regard to the classical word trigram models.

12.
This paper deals with a class of piecewise deterministic control systems for which the optimal control can be approximated through the use of an optimization-by-simulation approach. The feedback control law is restricted to belong to an a priori fixed class of feedback control laws depending on a (small) finite set of parameters. Under some general conditions developed in this paper, infinitesimal perturbation analysis (IPA) can be used to estimate the gradient of the objective function with respect to these parameters for finite horizon simulation, and the consistency of the IPA estimators, as the simulation length goes to infinity, is assured. Also, the parameters can be optimized through a stochastic approximation (SA) algorithm combined with IPA. We prove that in this context, under appropriate conditions, such an approach converges towards the optimum.

13.
We consider robust stochastic large population games for coupled Markov jump linear systems (MJLSs). The N agents’ individual MJLSs are governed by different infinitesimal generators, and are affected not only by the control input but also by an individual disturbance (or adversarial) input. The mean field term, representing the average behaviour of N agents, is included in the individual worst-case cost function to capture coupling effects among agents. To circumvent the computational complexity and analyse the worst-case effect of the disturbance, we use robust mean field game theory to design low-complexity robust decentralised controllers and to characterise the associated worst-case disturbance. We show that with the individual robust decentralised controller and the corresponding worst-case disturbance, which constitute a saddle-point solution to a generic stochastic differential game for MJLSs, the actual mean field behaviour can be approximated by a deterministic function which is a fixed-point solution to the constructed mean field system. We further show that the closed-loop system is uniformly stable independent of N, and an approximate optimality can be obtained in the sense of ε-Nash equilibrium, where ε can be taken to be arbitrarily close to zero as N becomes sufficiently large. A numerical example is included to illustrate the results.

14.
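Q-learning is extended from the single-agent framework to a noncooperative multi-agent framework. A multi-agent theoretical framework and learning algorithm are established within the framework of general-sum stochastic games, with the Nash equilibrium proposed as the learning objective. Constraints on the game structure are given, and the convergence of the algorithm under these constraints is proved, which is of significance for the research and application of multi-agent systems.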

15.
In this paper, we discuss infinite-horizon soft-constrained stochastic Nash games involving state-dependent noise in weakly coupled large-scale systems. First, we formulate linear quadratic differential games in which robustness is attained against model uncertainty. It is noteworthy that this is the first time conditions for the existence of robust equilibria have been derived based on the solutions of sets of cross-coupled stochastic algebraic Riccati equations (CSAREs). After establishing an asymptotic structure with positive definiteness for the CSARE solutions, we derive a recursive algorithm by means of Newton’s method that can be used to obtain solutions of the CSAREs. As another important feature, we propose a high-order approximate Nash strategy based on iterative solutions. Finally, we provide a numerical example to verify the efficiency of the proposed algorithms.

16.
A general discrete-time stochastic linear model with a quadratic objective function is analyzed with regard to its neutrality and a rarely-discussed property called quadraticity. Such a problem having quadraticity is relatively easily solved, and most variations of the linear quadratic control problem for which exact solutions are known possess this property. A generally less tractable set of problems is that consisting of nonneutral models, those in which it is possible to learn about system unknowns through experimentation. It is shown that quadraticity and neutrality apparently neither imply nor preclude one another within the class of linear quadratic models.

17.
The author formulates and solves a dynamic stochastic optimization problem of a nonstandard type, whose optimal solution features active learning. The proof of optimality and the derivation of the corresponding control policies are indirect: they relate the original single-person optimization problem to a sequence of nested zero-sum stochastic games. Existence of saddle points for these games implies the existence of optimal policies for the original control problem, which, in turn, can be obtained from the solution of a nonlinear deterministic optimal control problem. The author also studies the problem of existence of stationary optimal policies when the time horizon is infinite and the objective function is discounted.

18.
Sean Summers, John Lygeros. Automatica, 2010, 46(12): 1951-1961
We present a dynamic programming based solution to a probabilistic reach-avoid problem for a controlled discrete time stochastic hybrid system. We address two distinct interpretations of the reach-avoid problem via stochastic optimal control. In the first case, a sum-multiplicative cost function is introduced along with a corresponding dynamic recursion which quantifies the probability of hitting a target set at some point during a finite time horizon, while avoiding an unsafe set during each time step preceding the target hitting time. In the second case, we introduce a multiplicative cost function and a dynamic recursion which quantifies the probability of hitting a target set at the terminal time, while avoiding an unsafe set during the preceding time steps. In each case, optimal reach-while-avoid control policies are derived as the solution to an optimal control problem via dynamic programming. Computational examples motivated by two practical problems in the management of fisheries and finance are provided.
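The first interpretation (hit the target at some time within the horizon while staying safe beforehand) admits a short illustration on a finite-state controlled Markov chain. This is a simplified sketch under that assumption, not the hybrid-system formulation of the paper; the function name `reach_avoid_probability`, the array layout of `P`, and the four-state example are hypothetical.

```python
import numpy as np

def reach_avoid_probability(P, target, safe, horizon):
    """Maximal probability of reaching `target` within `horizon` steps while
    remaining in `safe` at every step before the target is hit.

    P      : array of shape (num_actions, n, n), P[a, x, y] = Pr(y | x, a)
    target : boolean mask of target states (assumed to lie inside `safe`)
    safe   : boolean mask of safe states
    """
    P = np.asarray(P, dtype=float)
    target = np.asarray(target, dtype=bool)
    safe = np.asarray(safe, dtype=bool)
    in_target = target.astype(float)
    transient = (safe & ~target).astype(float)   # safe but not yet in target
    V = in_target.copy()                         # V_N = 1_target
    for _ in range(horizon):
        # Value 1 inside the target; elsewhere in the safe set, take the best
        # expected continuation value; unsafe states get 0.
        V = in_target + transient * (P @ V).max(axis=0)
    return V

# Hypothetical example: state 2 is the target, state 3 is unsafe.
P_example = [
    [[0.5, 0.5, 0.0, 0.0], [0.0, 0.5, 0.5, 0.0],
     [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]],   # cautious action
    [[0.0, 0.0, 0.6, 0.4], [0.0, 0.0, 0.9, 0.1],
     [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]],   # aggressive action
]
print(reach_avoid_probability(P_example,
                              [False, False, True, False],
                              [True, True, True, False], 2))
```

For the second interpretation (target hit at the terminal time), the same loop would use `V = safe.astype(float) * (P @ V).max(axis=0)` as the update, again as a finite-state simplification.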

19.
This article focuses on the adaptive tracking control problem for a class of interconnected nonlinear stochastic systems under full‐state constraints based on the hybrid threshold strategy. Different from the existing works, we propose a novel pre‐constrained tracking control algorithm to deal with the full‐state constraint problem. First, a novel nonlinear transformation function and a new coordinate transformation are developed to constrain state variables, which can directly cope with asymmetric state constraints. Second, the hybrid threshold strategy is constructed to provide a reasonable way of balancing system performance and communication constraints. By the use of the dynamic surface control technique and the neural network approximation technique, a smooth pre‐constrained tracking controller with adaptive laws is designed for the interconnected nonlinear stochastic systems. Moreover, based on the Lyapunov stability theory, it is proved that all state variables are successfully pre‐constrained within asymmetric boundaries. Finally, a simulation example is presented to verify the effectiveness of the proposed control algorithm.

20.
This paper addresses the random time-delay and packet-loss issues of networked control systems (NCS) within the framework of jump linear systems with mode-dependent time-delays. A new delay-dependent condition for stochastic stability is proposed by means of a new stochastic Lyapunov-Krasovskii functional. The condition is formulated as a set of coupled linear matrix inequalities (LMIs). As an example to verify the proposed method, a networked inverted-pendulum system is considered. The simulation results demonstrate the effectiveness of the method.
