Similar literature
1.
Basic Ideas for Event-Based Optimization of Markov Systems   Cited by: 5 (self-citations: 0, citations by others: 5)
The goal of this paper is two-fold. First, we present a sensitivity point of view on the optimization of Markov systems. We show that Markov decision processes (MDPs) and the policy-gradient approach, or perturbation analysis (PA), can be derived easily from two fundamental sensitivity formulas, and that such formulas can be flexibly constructed, from first principles, with performance potentials as building blocks. Second, with this sensitivity view we propose an event-based optimization approach, including event-based sensitivity analysis and event-based policy iteration. This approach utilizes the special feature of a system characterized by events and illustrates how the potentials can be aggregated using the special feature and how the aggregated potentials can be used in policy iteration. Compared with the traditional MDP approach, the event-based approach has several advantages: the number of aggregated potentials may scale with the system size even though the number of states grows exponentially in the system size, which reduces the policy space and saves computation; the approach does not require actions at different states to be independent; and it utilizes the special feature of a system and does not need to know the exact transition probability matrix. The main ideas of the approach are illustrated by an admission control problem. Supported in part by a grant from Hong Kong UGC.
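As a rough illustration of the building block named in this abstract, the sketch below computes performance potentials for an ergodic finite Markov chain and uses them in classical, state-based average-reward policy iteration. It is not the event-based algorithm of the paper (no aggregation of potentials is performed); the function names, the array layout `P[action, state, next_state]` and `f[action, state]`, and the example numbers in the usage note are all assumptions made for illustration.

```python
import numpy as np

def stationary_distribution(P):
    """Solve pi P = pi with sum(pi) = 1 for an ergodic transition matrix P."""
    n = P.shape[0]
    A = np.vstack([P.T - np.eye(n), np.ones(n)])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

def potentials(P, f):
    """Return (g, eta): potentials g solving the Poisson equation
    (I - P) g + eta * e = f with normalization pi g = eta,
    and the long-run average reward eta = pi f."""
    n = P.shape[0]
    pi = stationary_distribution(P)
    eta = pi @ f
    g = np.linalg.solve(np.eye(n) - P + np.outer(np.ones(n), pi), f)
    return g, eta

def policy_iteration(P, f, max_iter=100):
    """Classical policy iteration driven by potentials.
    P has shape (actions, states, states); f has shape (actions, states)."""
    n_states = P.shape[1]
    idx = np.arange(n_states)
    policy = np.zeros(n_states, dtype=int)
    eta = None
    for _ in range(max_iter):
        # Transition matrix and reward vector under the current policy.
        g, eta = potentials(P[policy, idx, :], f[policy, idx])
        q = f + P @ g                      # shape (actions, states)
        best = q.argmax(axis=0)
        # Switch actions only where the improvement is strict.
        improved = q[best, idx] > q[policy, idx] + 1e-10
        if not improved.any():
            break
        policy = np.where(improved, best, policy)
    return policy, eta
```

For instance, with two states and two actions one might call `policy_iteration(P, f)` on `P = np.array([[[0.9, 0.1], [0.5, 0.5]], [[0.2, 0.8], [0.1, 0.9]]])` and `f = np.array([[1.0, 0.0], [0.0, 2.0]])`; the returned `eta` is the average reward of the policy at which no strict improvement remains.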

2.
Wang Guoliang, Qin Fen. Control and Decision (控制与决策), 2016, 31(7): 1265-1271

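For the practical case in which the matrix parameters of a Markov system are unknown, a control method that combines state feedback control with adaptive control is proposed. Conditions for solving the corresponding controller parameters are given using linear matrix inequality techniques. Compared with most existing adaptive control methods, the proposed method not only keeps the estimation error almost surely bounded but also makes the state of the original system almost surely asymptotically stable, and thus has good convergence properties. Building on these results, the related control problem with partially unknown transition rates is further discussed. A numerical example verifies the effectiveness of the proposed design method.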


3.
We analyze the problem of modeling an observed impulse response by means of a finite-dimensional, linear, time-invariant system. Our approach differs from classical realization theory in the following respects. The modeling problem is split in two steps, namely, identification for determining a model for the observations, and realization for determining parameters which describe the model. Systems are considered as sets of time series, not as input-output maps. In particular, the partitioning of variables into inputs and outputs need not be known, and it is not required that there exist a causal relationship between inputs and outputs. Further, we make no assumptions concerning initial conditions, which in particular may be nonzero. Determination of initial conditions is part of the modeling problem. A final significant distinction from classical realization theory is that the systems need not be controllable. We characterize the class of systems which can be identified from impulse response measurements. Necessary and sufficient conditions are formulated in terms of state-space realizations. It turns out that noncontrollable systems are also identifiable. For causal systems, the condition is that the state transition matrix, restricted to the noncontrollable states, has sufficiently small cyclic index. For noncausal systems, the condition is expressed in terms of the rank of the (singular) state evolution equation.

4.
In the verification of reactive systems with nondeterministic densely valued temporal parameters, the state-space can be covered through equivalence classes, each composed of a discrete logical location and a dense variety of clock valuations encoded as a difference bounds matrix (DBM). The reachability relation among such classes enables qualitative verification of properties pertaining to event ordering and stimulus/response deadlines, but it does not provide any measure of probability for feasible behaviors. We extend DBM equivalence classes with a density function which provides a measure for the probability of individual states. To this end, we extend time Petri nets (TPNs) by associating a probability density function with the static firing interval of each nondeterministic transition. We then explain how this stochastic information induces a probability distribution for the states contained within a DBM class and how this probability evolves in the enumeration of the reachability relation among classes. This enables the construction of a stochastic transition system which supports correctness verification based on the theory of TPNs, provides a measure of probability for each feasible run, and enables steady-state analysis based on Markov renewal theory. In so doing, we provide a means to identify feasible behaviors and to associate them with a measure of probability in models with multiple concurrent generally distributed nondeterministic timers.

5.
The optimization problems of Markov control processes (MCPs) with exact knowledge of system parameters, in the form of transition probabilities or infinitesimal transition rates, can be solved by using the concept of Markov performance potential, which plays an important role in the sensitivity analysis of MCPs. In this paper, by using an equivalent infinitesimal generator, we first introduce a definition of discounted Poisson equations for semi-Markov control processes (SMCPs), which is similar to that for MCPs, and the performance potentials of SMCPs are defined as the solution of the equation. Some related optimization techniques based on performance potentials for MCPs may be extended to the optimization of SMCPs if the system parameters are known with certainty. Unfortunately, exact values of the distributions of the sojourn times at some states or the transition probabilities of the embedded Markov chain for a large-scale SMCP are generally difficult or impossible to obtain, which leads to uncertainty in the semi-Markov kernel, and thereby to uncertainty in the equivalent infinitesimal transition rates. Similar to the optimization of uncertain MCPs, a potential-based policy iteration method is proposed in this work to search for the optimal robust control policy for SMCPs with uncertain infinitesimal transition rates that are represented as compact sets. In addition, convergence of the algorithm is discussed.

6.
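Time delay is an inherent characteristic of many industrial systems; it degrades control performance and can even affect system stability, and in practical systems the finite-time behavior deserves particular attention. Motivated by this, the design of finite-time controllers for a class of Markov jump systems with time delay is studied. The condition that the transition probabilities are completely known is relaxed to the more general case in which they are partially unknown, and a free-weighting approach is adopted so that the resulting linear matrix inequalities are less conservative. First, criteria for finite-time boundedness and finite-time H∞ boundedness of Markov jump systems are given. Then, the gain matrices of the state observer and the state feedback controller are obtained by solving linear matrix inequalities (LMIs). Finally, a simulation example verifies the effectiveness of the proposed algorithm.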

7.
We consider the optimization of queueing systems with service rates depending on system states. The optimization criterion is the long-run customer-average performance, which is an important performance metric, different from the traditional time-average performance. We first establish, with perturbation analysis, a difference equation of the customer-average performance in closed networks with exponentially distributed service times and state-dependent service rates. Then we propose a policy iteration optimization algorithm based on this difference equation. This algorithm can be implemented on-line with a single sample path and does not require knowing the routing probabilities of queueing systems. Finally, we give numerical experiments which demonstrate the efficiency of our algorithm. This paper offers a new direction for efficiently optimizing the "customer-centric" performance in queueing systems.

8.
In this paper, we study the robust control for uncertain Markov jump linear singularly perturbed systems (MJLSPS), whose transition probability matrix is unknown. An improved heuristic algorithm is proposed to solve the nonlinear matrix inequalities. The results of this paper can apply not only to standard, but also to nonstandard MJLSPS. Moreover, the proposed approach is independent of the perturbation parameter and therefore avoids the ill-conditioned numerical problems.

9.
10.
In this paper, we design a controller for stabilising a control system. The controller consists of a linear regulator and an asymptotic observer. The linear regulator is a feedback of the estimated states and must also minimise a quadratic performance index. The gain matrix of the optimal feedback is obtained by solving the Riccati equation, whilst the observer gain matrix is computed by making use of the properties of symmetrical systems. These properties allow us to find the optimal gain matrix of the observer without solving the dual Riccati equation; we only need to compute the controllability and observability matrices. Having calculated the gain matrices of the regulator and of the observer, we proceed to compute the transfer function of the observer-based controller.

11.
In this paper, we study the robust control for uncertain Markov jump linear singularly perturbed systems (MJLSPS), whose transition probability matrix is unknown. An improved heuristic algorithm is proposed to solve the nonlinear matrix inequalities. The results of this paper can apply not only to standard, but also to nonstandard MJLSPS. Moreover, the proposed approach is independent of the perturbation parameter and therefore avoids the ill-conditioned numerical problems.

12.
13.
Discrete-event systems modeled as continuous-time Markov processes and characterized by some integer-valued parameter are considered. The problem addressed is that of estimating performance sensitivities with respect to this parameter by directly observing a single sample path of the system. The approach is based on transforming the nominal Markov chain into a reduced augmented chain, the stationary-state probabilities of which can be easily combined to obtain stationary-state probability sensitivities with respect to the given parameter. Under certain conditions, the reduced augmented chain state transitions are observable with respect to the state transitions of the system itself, and no knowledge of the nominal Markov-chain state or the transition rates is required. Applications for some queueing systems are included. The approach incorporates estimation of unknown transition rates when needed and is extended to real-valued parameters.

14.
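In studying the above system: 1) information on the probability distribution of the nonlinearity is used; 2) the relationship between the known and unknown parts of the transition probabilities is used. Using the Lyapunov functional method and the linear matrix inequality method, sufficient conditions for the stochastic stability of the system are obtained, together with the corresponding feedback control gains. An example given at the end of the paper demonstrates the effectiveness of the proposed model and analysis method.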

15.
There are two commonly used analytical reliability analysis methods: the first-order reliability method (FORM), which uses a linear approximation of the performance function, and the second-order reliability method (SORM), which uses a quadratic approximation. The reliability analysis using FORM could be acceptable in accuracy for mildly nonlinear performance functions, whereas the reliability analysis using SORM may be necessary for accuracy of nonlinear and multi-dimensional performance functions. Even though the reliability analysis using SORM may be accurate, it is used less often for probability of failure calculation since SORM requires the second-order sensitivities. Moreover, the SORM-based inverse reliability analysis is rather difficult to develop. This paper proposes an inverse reliability analysis method that can be used to obtain accurate probability of failure calculation without requiring the second-order sensitivities for reliability-based design optimization (RBDO) of nonlinear and multi-dimensional systems. For the inverse reliability analysis, the most probable point (MPP)-based dimension reduction method (DRM) is developed. Since the FORM-based reliability index (β) is inaccurate for the MPP search of the nonlinear performance function, a three-step computational procedure is proposed to improve accuracy of the inverse reliability analysis: probability of failure calculation using constraint shift, reliability index update, and MPP update. Using the three steps, a new DRM-based MPP is obtained, which estimates the probability of failure of the performance function more accurately than FORM and more efficiently than SORM. The DRM-based MPP is then used for the next design iteration of RBDO to obtain an accurate optimum design even for nonlinear and/or multi-dimensional systems. Since the DRM-based RBDO requires more function evaluations, the enriched performance measure approach (PMA+) with new tolerances for constraint activeness and reduced rotation matrix is used to reduce the number of function evaluations.
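To make the role of the reliability index β concrete, the following minimal sketch evaluates β and the probability of failure for a linear performance function of independent normal variables, the one case in which the first-order result P_f = Φ(-β) is exact. It does not implement the DRM-based method of the abstract; the function name `form_linear`, the convention that failure means g < 0, and the numbers in the usage note are assumptions for illustration.

```python
import math

def form_linear(a0, a, mu, sigma):
    """Reliability index and failure probability for the linear
    performance function g(X) = a0 + sum(a_i * X_i), where the X_i are
    independent normal variables with means mu_i and standard deviations
    sigma_i, and failure is defined as g(X) < 0 (assumed convention)."""
    mean_g = a0 + sum(ai * mi for ai, mi in zip(a, mu))
    std_g = math.sqrt(sum((ai * si) ** 2 for ai, si in zip(a, sigma)))
    beta = mean_g / std_g
    # Standard normal CDF at -beta, written with the complementary error function.
    pf = 0.5 * math.erfc(beta / math.sqrt(2.0))
    return beta, pf
```

For example, a resistance-minus-load function g = X1 - X2 with means (15, 5) and standard deviations (3, 4) gives `form_linear(0.0, (1.0, -1.0), (15.0, 5.0), (3.0, 4.0))`, i.e. β = 2 and a failure probability of about 0.0228. For nonlinear performance functions this linearization is only an approximation, which is the gap the abstract addresses.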

16.
In this paper, a methodology for designing an output feedback controller for discrete‐time networked control systems (NCSs) is considered. More precisely, network‐induced delays between the sensor and the controller are modelled by a Markov chain whose transition probabilities are not assumed to be fully known. The system parameter uncertainties are assumed to be norm‐bounded and possibly time‐varying. To the best of the authors' knowledge, the problem of designing a partially mode delay‐dependent output feedback controller for NCSs with partially known transition probability matrix has not been investigated in the literature. Based on the Lyapunov‐Krasovskii functional approach, sufficient conditions for the existence of a robust partially mode delay‐dependent output feedback controller are given in terms of bilinear matrix inequalities which can be solved using a cone complementarity linearization algorithm. The proposed design methodology differs from the existing design methodologies in that dynamic output feedback controllers are parameterized by both modes and transition probabilities, as opposed to the existing design approaches which parameterize controllers by modes only. The results obtained reduce to the existing results on fully known transition matrices when transition probabilities are fully known. It is shown that the proposed methodology can be applied to real world systems. The proposed design methodology is verified by using a DC servo motor system where the plant and the controller are connected via a cellular network with partially known transition probability matrix.

17.
This paper deals with the problem of robust H∞ control for a class of discrete‐time Markovian jump systems subject to both actuator saturation and incomplete knowledge of transition probability. Different from the previous results where the transition probability is completely known, a more general situation is considered in which only partial information on the exact values of the elements of the transition probability matrix is available. By introducing some free parameters to express the relationship between the known and the unknown elements of the transition probability matrix in stability analysis, a criterion is established to guarantee the stochastic stability of the closed‐loop system as well as an H∞ performance index. The concept of domain of attraction in mean square sense is used to analyze the closed‐loop stability, and the mode‐dependent H∞ state‐feedback controller is designed. It is shown that, even in the absence of actuator saturation, the obtained result is less conservative than the existing one. A numerical example is provided to illustrate the effectiveness of the proposed method. Copyright © 2011 John Wiley & Sons, Ltd.

18.
This paper proposes a robust stochastic stability analysis approach with partly unknown transition probability by considering the wind speed prediction error in a power system. Firstly, taking this prediction error into account and based on Markov modeling theory, the stochastic dynamic model of a wind power system with uncertain transition probability is developed. Secondly, according to the stochastic stability theory of Markov jump systems, the transition probability of the wind power system mode is divided into three cases: fully known, known only through upper and lower bounds, and completely unknown. Then, by using linear matrix inequality (LMI) technology, a robust stochastic stability criterion with disturbance attenuation is obtained. Finally, test results show that the proposed analysis approach does not need to obtain the trajectory of the actual system operation parameters and has the advantage of high computational efficiency.

19.
Perturbation analysis (PA) applies a dynamic point of view to the sample paths of stochastic systems; the realization factor, one of the main concepts of PA, measures the final effect of a perturbation on system performance and provides a novel approach to obtaining performance sensitivities. In this paper, we solve analytically the set of equations for realization factors of a two-server cyclic network. We prove an invariance property of the performance sensitivity for Norton's aggregation. Using the results, we derive closed-form formulae for the derivatives of performance measures in a closed queueing network with load-dependent exponential servers. The performance measures have two general forms: customer average and time average. In contrast with the usual approach based on product-form solutions, our results provide additional insights into the performance sensitivity of closed queueing networks and have immediate applications to problems of optimal control. The general formulae are expressed in terms of Buzen's algorithm with a computational complexity comparable to that of the formulae obtained by directly taking the derivatives of the product-form solutions.

20.
In this paper, we are concerned with a new control problem for uncertain discrete-time stochastic systems with missing measurements. The parameter uncertainties are allowed to be norm-bounded and enter into the state matrix. The system measurements may be unavailable (i.e., missing data) at any sample time, and the probability of the occurrence of missing data is assumed to be known. The purpose of this problem is to design an output feedback controller such that, for all admissible parameter uncertainties and all possible incomplete observations, the system state of the closed-loop system is mean square bounded, and the steady-state variance of each state does not exceed its individually prescribed upper bound. We show that the addressed problem can be solved by means of algebraic matrix inequalities. The explicit expression of the desired robust controllers is derived in terms of some free parameters, which may be exploited to achieve further performance requirements. An illustrative numerical example is provided to demonstrate the usefulness and flexibility of the proposed design approach.
