20 similar documents found; search took 15 ms
1.
We show the existence of average cost optimal stationary policies for Markov control processes with Borel state space and unbounded costs per stage, under a set of assumptions recently introduced by L.I. Sennott (1989) for control processes with countable state space and finite control sets.
2.
We consider necessary and sufficient conditions for a group of the components of a stationary vector Gaussian Markov process to possess the Markov property. A representation by a linear Itô stochastic differential equation is also given.
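As a rough illustration of what such a linear Itô representation looks like in practice, the sketch below simulates a two-dimensional stationary Gaussian Markov process of Ornstein-Uhlenbeck type with Euler-Maruyama; the matrices A and B and the step size are assumptions chosen for the sketch, not taken from the paper.

```python
import numpy as np

# Rough illustration of the linear Ito representation: a two-dimensional
# stationary Gaussian Markov process dX_t = A X_t dt + B dW_t of
# Ornstein-Uhlenbeck type, simulated with Euler-Maruyama. The matrices A, B
# and the step size are assumptions chosen for the sketch.
rng = np.random.default_rng(0)
A = np.array([[-1.0, 0.5],
              [ 0.0, -2.0]])                 # stable drift (eigenvalues < 0)
B = np.array([[1.0, 0.0],
              [0.3, 0.8]])                   # diffusion matrix
dt, n_steps = 1e-3, 20000

x = np.zeros(2)                              # start at the stationary mean
path = np.empty((n_steps, 2))
for k in range(n_steps):
    dw = rng.normal(scale=np.sqrt(dt), size=2)   # Brownian increments
    x = x + A @ x * dt + B @ dw                  # Euler-Maruyama step
    path[k] = x

print(path.shape)  # (20000, 2)
```

Stability of the drift matrix is what makes the simulated process settle into its stationary regime rather than diverge.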
4.
This paper studies the optimal control problem for a class of discrete-time nonlinear stochastic systems with both Markov jump processes and multiplicative noise, and states and proves the corresponding maximum principle. First, using the smoothing property of conditional expectation and by introducing backward stochastic difference equations with adapted solutions, a representation of linear functionals constrained by linear difference equations is obtained, and its uniqueness is proved via the Riesz representation theorem. Second, for the nonlinear stochastic control system with Markov jumps, the spike variation method is applied to derive the first-order variation of the state equation and the linear difference equation that this variation satisfies. Then, after introducing the Hamiltonian, the maximum principle for the discrete-time nonlinear stochastic optimal control problem with Markov jumps is stated and proved through a pair of adjoint equations characterized by backward stochastic difference equations; a sufficient condition for the optimal control problem and the corresponding Hamilton-Jacobi-Bellman equation are also given. Finally, a practical example illustrates the applicability and feasibility of the proposed theory.
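For orientation, the objects named in the abstract (Hamiltonian, adjoint backward stochastic difference equations, maximum condition) typically take roughly the following generic form; the notation below is an assumption, not taken from the paper:

```latex
% Hamiltonian for stage cost l_k and dynamics f_k (notation assumed):
H_k(x, u, p_{k+1}) = \langle p_{k+1}, f_k(x, u) \rangle + l_k(x, u)

% Adjoint (backward stochastic difference) equation with terminal cost \Phi:
p_k = \mathbb{E}\left[ \partial_x H_k(x_k^*, u_k^*, p_{k+1}) \,\middle|\, \mathcal{F}_k \right],
\qquad p_N = \partial_x \Phi(x_N^*)

% Maximum condition along the optimal pair (x^*, u^*):
H_k(x_k^*, u_k^*, p_{k+1}) = \max_{v} \, H_k(x_k^*, v, p_{k+1})
```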
5.
Qixia Zhang, Asian Journal of Control, 2014, 16(4): 1238–1244
This paper is concerned with H2/H∞ control for a new class of stochastic systems. The most distinguishing feature, compared with the existing literature, is that the systems are described by backward stochastic differential equations (BSDEs) with Brownian motion and random jumps. It is shown that the backward stochastic H2/H∞ control under consideration is closely tied to the corresponding uncontrolled backward stochastic perturbed system. A necessary and sufficient condition for the existence of a unique solution to the control problem is derived. The resulting solution is characterized by the solution of an uncontrolled forward-backward stochastic differential equation (FBSDE) with Brownian motion and random jumps. When the coefficients are all deterministic, the equivalent linear feedback solution involves a pair of Riccati-type equations and an uncontrolled BSDE. In addition, an uncontrolled forward stochastic differential equation (SDE) is given.
6.
Peter, Performance Evaluation, 2005, 62(1–4): 349–365
A new method to compute bounds on stationary results of finite Markov processes in discrete or continuous time is introduced. The method extends previously published approaches that use polyhedra of eigenvectors for stochastic matrices with known lower and upper bounds on their elements. Known techniques compute one set of bounds for the elements of the stationary vector with respect to the lower bounds of the matrix elements and another set with respect to the upper bounds. The resulting bounds are usually not sharp when both lower and upper bounds for the elements are known. The new approach combines lower and upper bounds, yielding sharp bounds that are often much tighter than bounds computed using only one bounding value for the matrix elements.
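The bounding problem itself can be illustrated naively (this is NOT the polyhedral eigenvector method of the abstract, and the bound matrices L and U are assumptions): sample stochastic matrices whose elements lie between the given bounds and track the range of the resulting stationary vectors.

```python
import numpy as np

# Naive sampling illustration of the bounding problem (not the polyhedral
# eigenvector method of the abstract): given elementwise bounds L <= P <= U
# on an unknown stochastic matrix, track empirical bounds on its stationary
# vector over sampled admissible matrices. L and U are assumptions.
rng = np.random.default_rng(1)
L = np.array([[0.2, 0.1, 0.1],
              [0.1, 0.3, 0.1],
              [0.1, 0.1, 0.2]])
U = np.array([[0.7, 0.6, 0.6],
              [0.6, 0.7, 0.6],
              [0.6, 0.6, 0.7]])

def stationary(P):
    """Stationary vector: solve pi P = pi with sum(pi) = 1 by least squares."""
    n = P.shape[0]
    A = np.vstack([P.T - np.eye(n), np.ones(n)])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    return np.linalg.lstsq(A, b, rcond=None)[0]

lo, hi = np.ones(3), np.zeros(3)
for _ in range(500):
    M = L + rng.random((3, 3)) * (U - L)   # sample within the element bounds
    M /= M.sum(axis=1, keepdims=True)      # renormalize rows to be stochastic
    pi = stationary(M)
    lo, hi = np.minimum(lo, pi), np.maximum(hi, pi)

print(np.all(lo <= hi))  # True
```

Row renormalization can push entries slightly outside [L, U], and sampling only ever yields inner bounds; the method in the abstract instead derives guaranteed outer bounds from polyhedra of eigenvectors.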
8.
Opacity is a generic security property that has been defined on (non-probabilistic) transition systems and later on labelled Markov chains. For a secret predicate, given as a subset of runs, and a function describing the view of an external observer, the value of interest for opacity is a measure of the set of runs disclosing the secret. We extend this definition to the richer framework of Markov decision processes, where non-deterministic choice is combined with probabilistic transitions, and we study related decidability problems under partial- or complete-observation hypotheses for the schedulers. We prove that all questions are decidable with complete observation and ω-regular secrets. With partial observation, we prove that all quantitative questions are undecidable, but the question of whether a system is almost surely non-opaque becomes decidable for a restricted class of ω-regular secrets, as well as for all ω-regular secrets under finite-memory schedulers.
9.
I. V. Vernigora, Cybernetics and Systems Analysis, 2006, 42(2): 188–194
The relation is established between the asymptotic stochastic stability of a linear functional differential equation and the exponential stability of the trivial solution to this equation. The direct and inverse Lyapunov theorems on the stability of linear differential equations are proved.
__________
Translated from Kibernetika i Sistemnyi Analiz, No. 2, pp. 31–38, March–April 2006.
10.
Linear-Quadratic Optimal Control Problem for Partially Observed Forward-Backward Stochastic Differential Equations of Mean-Field Type
This paper is concerned with the linear-quadratic optimal control problem for partially observed forward-backward stochastic differential equations (FBSDEs) of mean-field type. Based on the classical spike variational method, the backward separation approach, and a filtering technique, we first derive necessary and sufficient conditions for the optimal control problem with a non-convex control domain. Next, by means of a decoupling technique, we obtain two Riccati equations, which are uniquely solvable under certain conditions. The optimal cost functional is also represented by the solutions of the Riccati equations in a special case.
11.
The computation of ϵ-optimal policies for continuous-time Markov decision processes (CTMDPs) over finite time intervals is a challenging problem because the optimal policy may change at arbitrary times. Numerical algorithms based on time discretization or uniformization have been proposed for the computation of optimal policies. The uniformization-based algorithm has been shown to be more reliable and often also more efficient, but is currently only available for processes where the gain or reward does not depend on the decision taken in a state. In this paper, we present two new uniformization-based algorithms for computing ϵ-optimal policies for CTMDPs with decision-dependent rewards over a finite time horizon. Thanks to a new and tighter upper bound, the proposed algorithms can not only handle decision-dependent rewards, they also outperform the available approach for rewards that do not depend on the decision. In particular, for models where the policy only rarely changes, optimal policies can be computed much faster.
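Uniformization itself, the building block these algorithms share, can be sketched for a plain CTMC: the generator is converted into a DTMC at a rate dominating all exit rates, and transient probabilities become a truncated Poisson mixture of DTMC steps. The generator Q, horizon t, and truncation depth K below are assumptions for the example.

```python
import numpy as np
from math import exp, factorial

# Plain uniformization for a CTMC: the generator Q is turned into a DTMC at
# rate lam, and transient probabilities are a truncated Poisson mixture of
# DTMC steps. Q, t, and the truncation depth K are assumptions.
Q = np.array([[-2.0,  2.0],
              [ 3.0, -3.0]])                 # CTMC generator matrix
lam = float(max(-np.diag(Q)))                # uniformization rate >= exit rates
P = np.eye(2) + Q / lam                      # embedded DTMC transition matrix

p0 = np.array([1.0, 0.0])                    # initial distribution
t, K = 1.0, 40                               # time horizon, truncation depth

p, term = np.zeros(2), p0.copy()
for k in range(K + 1):
    w = exp(-lam * t) * (lam * t) ** k / factorial(k)   # Poisson(lam*t) weight
    p += w * term
    term = term @ P                          # advance one DTMC step

print(round(float(p.sum()), 6))  # 1.0 (up to the tiny truncation error)
```

Choosing K so that the discarded Poisson tail is below a tolerance is what makes the truncation error controllable, which is also where tighter bounds pay off in the CTMDP setting.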
12.
The paper discusses the robustness of discrete-time Markov control processes whose transition probabilities are known only up to a certain degree of accuracy. Upper bounds on the increase of the discounted cost are derived when an optimal control policy of the approximating process is used to control the original one. The bounds are given in terms of the weighted total variation distance between transition probabilities, and they hold for processes on Borel spaces with unbounded one-stage cost functions.
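The distance in question can be sketched in a few lines; the form ||p - q||_w = Σ_j w(j) |p(j) - q(j)| below is one common version of the weighted total variation norm, and the rows p, q and weights w are assumptions for illustration.

```python
import numpy as np

# Minimal sketch of a weighted total variation distance between two transition
# rows, ||p - q||_w = sum_j w(j) |p(j) - q(j)| -- one common form of the
# weighted norm; the rows p, q and weights w are assumptions for illustration.
p = np.array([0.5, 0.3, 0.2])   # transition row of the original process
q = np.array([0.4, 0.4, 0.2])   # transition row of the approximating process
w = np.array([1.0, 2.0, 4.0])   # weight function w >= 1 on the state space

d = float(np.sum(w * np.abs(p - q)))
print(round(d, 6))  # 0.3
```

With w ≡ 1 this reduces (up to a factor of 2) to the ordinary total variation distance; the weights let unbounded cost functions be dominated state by state.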
13.
In this paper, we introduce a new class of backward doubly stochastic differential equations (BDSDEs for short), called mean-field backward doubly stochastic differential equations (MFBDSDEs for short), driven by Itô-Lévy processes, and study partial-information optimal control problems for backward doubly stochastic systems of mean-field type driven by Itô-Lévy processes, in which the coefficients depend not only on the solution processes but also on their expected values. First, using the contraction mapping method, we prove the existence and uniqueness of solutions to this kind of MFBDSDE. Then, by the convex variation method and a duality technique, we establish a sufficient and necessary stochastic maximum principle for the stochastic system. Finally, we illustrate our theoretical results with an application to a stochastic linear-quadratic optimal control problem for a mean-field backward doubly stochastic system driven by Itô-Lévy processes.
14.
An optimal linear filtering problem based on Kalman-Bucy results is considered in the paper. The sequential linear regression method, a modification of fundamental Wiener results, is used.
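A minimal scalar Kalman filter, the discrete-time analogue of the Kalman-Bucy setting the abstract refers to, can be sketched as follows; all model parameters (a, c, q, r) are assumptions chosen for illustration.

```python
import numpy as np

# Minimal scalar Kalman filter sketch -- the discrete-time analogue of the
# Kalman-Bucy setting. All model parameters (a, c, q, r) are assumptions.
rng = np.random.default_rng(2)
a, c = 0.9, 1.0              # state transition and observation coefficients
q, r = 0.1, 0.5              # process and measurement noise variances

x, xhat, P = 0.0, 0.0, 1.0   # true state, estimate, estimate variance
for _ in range(200):
    x = a * x + rng.normal(scale=np.sqrt(q))   # propagate the true state
    y = c * x + rng.normal(scale=np.sqrt(r))   # noisy observation
    xhat, P = a * xhat, a * a * P + q          # predict
    K = P * c / (c * c * P + r)                # Kalman gain
    xhat = xhat + K * (y - c * xhat)           # update estimate
    P = (1.0 - K * c) * P                      # update variance

print(0.0 < P < 1.0)  # True: the posterior variance settles below the prior
```

The variance recursion converges to the fixed point of a scalar Riccati equation, so the gain K becomes constant in steady state.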
15.
We describe a quasi-Monte Carlo method for the simulation of discrete time Markov chains with continuous multi-dimensional state space. The method simulates copies of the chain in parallel. At each step the copies are reordered according to their successive coordinates. We prove the convergence of the method when the number of copies increases. We illustrate the method with numerical examples where the simulation accuracy is improved by large factors compared with Monte Carlo simulation.
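The sort-and-advance step can be caricatured for a one-dimensional chain as follows; the toy chain X_{k+1} = (X_k + U_k)/2 and the single-shift stratified point set are simplifications assumed for the sketch, not the randomized net construction of the paper.

```python
import numpy as np

# Caricature of the sort-and-advance step of array-(R)QMC for a 1-D Markov
# chain: at each step the copies are sorted by state and advanced with one
# stratified, randomly shifted point per copy instead of iid uniforms.
# The toy chain X_{k+1} = (X_k + U_k)/2 is an assumption for illustration.
rng = np.random.default_rng(3)
n, n_steps = 1024, 10
x = np.zeros(n)                              # n parallel copies of the chain

for _ in range(n_steps):
    x.sort()                                 # reorder copies by their state
    u = (np.arange(n) + rng.random()) / n    # stratified points, common shift
    x = (x + u) / 2.0                        # advance all copies at once

est = x.mean()                               # estimate of E[X_10]
print(round(est, 2))  # 0.5
```

The full method drives the transitions with a randomized low-discrepancy point set whose first coordinate enumerates the strata; the single-shift stratification above keeps the reorder-then-advance idea visible in a few lines.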
16.
This paper is concerned with a partially observed optimal control problem for a controlled forward-backward stochastic system with correlated noises between the system and the observation, which generalizes the result of a previous work to a jump-diffusion system. Under some convexity assumptions, necessary and sufficient optimality conditions for such an optimal control are established in the form of a Pontryagin-type maximum principle, in a unified way, by means of duality analysis and convex variational techniques.
17.
A stochastic finite-frequency consensus protocol for directed networks with Markov jump topologies and external disturbances is proposed in this paper. By introducing frequency-band information into the consensus control design, the disagreement dynamics of the interconnected networks asymptotically converge to zero with an improved level of disturbance attenuation in the specified frequency band. In addition, a new model transformation approach is presented that exploits certain features of the Laplacian matrix in real Jordan form, which makes the designed protocol more general. A numerical example validates the potential of the developed results.
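The underlying consensus mechanism can be shown in a toy form (without the Markov jump topologies or finite-frequency disturbance attenuation of the abstract): x' = -Lx on a strongly connected digraph, integrated with Euler steps. The adjacency matrix below is an assumption for the sketch.

```python
import numpy as np

# Toy continuous-time consensus sketch: x' = -L x on a strongly connected
# digraph, integrated with Euler steps. The adjacency matrix is an
# assumption; entry A[i, j] = 1 means agent i receives agent j's state.
A = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [1.0, 0.0, 0.0]])              # agents listening around a cycle
L_mat = np.diag(A.sum(axis=1)) - A           # directed graph Laplacian

x = np.array([1.0, 5.0, -2.0])               # initial disagreeing states
dt = 0.05
for _ in range(2000):
    x = x - dt * (L_mat @ x)                 # Euler step of x' = -L x

print(float(x.max() - x.min()) < 1e-6)  # True: the agents reach agreement
```

Because this cycle graph is balanced, the state average is preserved and all agents converge to the mean of the initial states.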