Similar Literature
20 similar documents found.
1.
This paper deals with discrete-time Markov control processes with Borel state space, allowing unbounded costs and noncompact control sets. For these models, the existence of average optimal stationary policies has recently been established under very general assumptions, using an optimality inequality. Here we give a condition, which is a strengthened version of a variant of the ‘vanishing discount factor’ approach, for the optimality equation to hold.
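For orientation, the optimality inequality and the average cost optimality equation (ACOE) that recur throughout these abstracts take the following standard form; the notation here (ρ* for the optimal average cost, h for a relative value function, c for the cost, Q for the transition kernel) is assumed for illustration rather than taken from the paper:

```latex
% Average cost optimality inequality (sufficient for average optimality):
\rho^* + h(x) \;\ge\; \min_{a \in A(x)} \Big[ c(x,a) + \int_X h(y)\, Q(dy \mid x,a) \Big],
\qquad x \in X .
% The strengthened condition aims at upgrading the inequality to the
% average cost optimality equation (ACOE):
\rho^* + h(x) \;=\; \min_{a \in A(x)} \Big[ c(x,a) + \int_X h(y)\, Q(dy \mid x,a) \Big].
```

Any stationary policy selecting a minimizer on the right-hand side is then average optimal.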

2.
Necessary conditions are given for the existence of a bounded solution to the optimality equation arising in Markov decision processes, under a long-run, expected average cost criterion. The relationships of some of our results to known sufficient conditions are also shown.

3.
We consider average reward Markov decision processes with discrete time parameter and denumerable state space. We are concerned with the following problem: Find necessary and sufficient conditions so that, for arbitrary bounded reward function, the corresponding average reward optimality equation has a bounded solution. This problem is solved for a class of systems including the case in which, under the action of any stationary policy, the state space is an irreducible positive recurrent class.

4.
This article deals with multiconstrained continuous-time Markov decision processes in a denumerable state space, with unbounded cost and transition rates. The criterion to be optimised is the long-run expected average cost, and several kinds of constraints are imposed on some associated costs. The existence of a constrained optimal policy is ensured under suitable conditions by using a martingale technique and introducing an occupation measure. Furthermore, for the unichain model, we transform this multiconstrained problem into an equivalent linear programming problem, and then construct a constrained optimal policy from an optimal solution of that linear program. Finally, we use an example of a controlled queueing system to illustrate an application of our results.
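A schematic version of the linear programming step, written for a denumerable unichain model with transition rates q(y|x,a), cost c, constraint costs d_k, and constraint bounds θ_k (all notation assumed here for illustration), is:

```latex
% Linear program over occupation measures \mu(x,a):
\min_{\mu \ge 0} \; \sum_{x,a} c(x,a)\,\mu(x,a)
\quad \text{subject to} \quad
\sum_{x,a} d_k(x,a)\,\mu(x,a) \le \theta_k, \quad k = 1,\dots,m,
\qquad
\sum_{x,a} q(y \mid x,a)\,\mu(x,a) = 0 \;\; \text{for all } y,
\qquad
\sum_{x,a} \mu(x,a) = 1 .
```

From an optimal μ*, a constrained optimal stationary policy can be constructed by randomising at each state x in proportion to μ*(x,·).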

5.
Based on the average criterion for continuous-time Markov decision processes, we give new conditions for the average optimality and constrained optimality of a special class of Markov decision processes, namely controlled queueing systems. The new conditions involve only the primitive data of the model, but exploit the ergodic theory of birth-death processes. It can be shown that controlled queueing systems admit average optimal stationary policies as well as constrained average optimal policies.
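The birth-death ergodicity ingredient is classical: for a queue that, under a stationary policy, evolves as a birth-death process with birth rates λ_n and death rates μ_n (policy dependence suppressed), positive recurrence is characterised by:

```latex
% Classical ergodicity criterion for a birth-death process
% (\pi_0 = 1 by convention):
\pi_n = \prod_{i=1}^{n} \frac{\lambda_{i-1}}{\mu_i}, \qquad
\sum_{n \ge 0} \pi_n < \infty
\quad \text{and} \quad
\sum_{n \ge 0} \frac{1}{\lambda_n \pi_n} = \infty .
```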

6.
We first discuss a class of semi-Markov decision problems under the discounted cost and average cost performance criteria, respectively. Based on the performance potential approach, we derive the optimality equations satisfied by optimal stationary policies. We then discuss the relationship between the two models, showing that the conclusions for the average model can be obtained by taking the limits of the corresponding conclusions for the discounted model as the discount factor tends to zero.
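For reference, in a semi-Markov decision process the average cost optimality equation carries the expected holding time τ(x,a) alongside the one-stage cost; in standard (assumed) notation, with η the optimal average cost and g the performance potential:

```latex
% Average cost optimality equation of a semi-Markov decision process:
\min_{a \in A(x)} \Big[ c(x,a) - \eta\,\tau(x,a)
  + \sum_{y} p(y \mid x,a)\, g(y) - g(x) \Big] = 0, \qquad x \in X .
```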

7.
This work considers denumerable state Markov decision processes with discrete time parameter. The performance of a control policy is measured by the (lim sup) expected average cost criterion, the action sets are compact metric spaces, and the cost function is continuous and bounded. Within this framework, necessary and sufficient conditions are given so that the vanishing interest rate (VIR) method — also known as the vanishing discount approach — yields an average optimal stationary policy.
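Schematically, the VIR method recovers the average cost quantities as limits of the α-discounted values V_α, relative to a fixed reference state x₀ (standard construction, notation assumed):

```latex
\rho^* = \lim_{\alpha \uparrow 1}\,(1-\alpha)\,V_\alpha(x),
\qquad
h(x) = \lim_{\alpha \uparrow 1}\,\big( V_\alpha(x) - V_\alpha(x_0) \big).
```

The paper's necessary and sufficient conditions identify exactly when these limits produce a solution yielding an average optimal stationary policy.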

8.
We show the existence of average cost optimal stationary policies for Markov control processes with Borel state space and unbounded costs per stage, under a set of assumptions recently introduced by L.I. Sennott (1989) for control processes with countable state space and finite control sets.

9.
This work is concerned with controlled Markov chains with bounded costs. Assuming that the transition probabilities satisfy a simultaneous Doeblin condition, it is shown that Schweitzer’s transformation on the transition law yields a strong ergodicity condition that implies that the solution to the average cost optimality equation can be approximated, at a geometric rate, via the value iteration scheme.
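A minimal numerical sketch of this scheme for a finite model (the finite-state restriction, function names, and array layout are assumptions for illustration; the paper works with general bounded-cost chains under a simultaneous Doeblin condition):

```python
import numpy as np

def schweitzer_rvi(P, c, tau=0.9, tol=1e-9, max_iter=100_000):
    """Relative value iteration on Schweitzer's transformed model.

    P: array of shape (A, S, S), P[a, x, y] = Pr(x -> y under action a).
    c: array of shape (A, S), one-stage costs.
    Schweitzer's transformation replaces each P[a] by (1-tau)*I + tau*P[a]
    and rescales the costs by tau; it preserves the average cost model
    while forcing aperiodicity, which underlies the geometric rate.
    """
    A, S, _ = P.shape
    Pt = (1.0 - tau) * np.eye(S) + tau * P     # transformed transition law
    ct = tau * c                               # matching cost rescaling
    v = np.zeros(S)
    for _ in range(max_iter):
        q = ct + Pt @ v                        # (A, S) state-action values
        Tv = q.min(axis=0)
        g = Tv[0]                              # gain estimate at reference state 0
        Tv = Tv - g                            # relative VI keeps iterates bounded
        if np.abs(Tv - v).max() < tol:
            break
        v = Tv
    policy = (ct + Pt @ v).argmin(axis=0)      # greedy stationary policy
    return g / tau, v, policy                  # undo the tau cost scaling
```

Under the strong ergodicity condition the residual Tv − v contracts geometrically, which is the approximation rate the paper establishes.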

10.
The existence of an average cost optimal stationary policy in a countable state Markov decision chain is shown under assumptions weaker than those given by Sennott (1989). This treatment is a modification of that given by Hu (1992), and is related to conditions of Hordijk (1977). An example is given for which the new axiom set holds whereas the axiom set of Sennott (1989) fails to hold.

11.
This work concerns average Markov decision chains with denumerable state space. Assuming that the Lyapunov function condition holds, it is shown that the value iteration scheme yields convergent approximations to the solution of the average cost optimality equation. This result is obtained using a particular implementation of the value iteration procedure involving an artificial control action under which the system remains static.

12.
Usual conditions for the existence of stationary average optimal policies in denumerable MDPs with general bounded rewards are shown to be also sufficient for strong 1-optimality. Moreover, we prove that all limit points of discounted optimal stationary policies, as the discount factor goes to 1, are strong 1-optimal.

13.
This paper considers a Markov decision process in Borel state and action spaces with the aggregated (or iterated) coherent risk measure to be minimised. For this problem, we establish the Bellman optimality equation as well as the value and policy iteration algorithms, and show the existence of a deterministic stationary optimal policy. The cost function, while being allowed to be unbounded from below (in the sense that its negative part needs to be bounded by some nonnegative real-valued, possibly unbounded weight function), can be arbitrarily unbounded from above and possibly infinitely valued.
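Schematically, the Bellman equation in this setting replaces the expectation of the classical equation by a coherent risk measure ρ_{x,a} of the next-state value; the notation below (α a discount factor, X' the next state) is an assumed illustration rather than the paper's exact formulation:

```latex
% Bellman equation with an iterated coherent risk measure:
V(x) = \inf_{a \in A(x)} \Big[ c(x,a) + \alpha\, \rho_{x,a}\big( V(X') \big) \Big],
\qquad X' \sim Q(\cdot \mid x,a).
```

Taking ρ_{x,a} to be the expectation E_{x,a} recovers the risk-neutral discounted model.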

14.
We consider Markov decision processes with a target set, where the criterion function is the expectation of a minimum function. We formulate the problem as an infinite horizon case with a recurrent class. We show, under some conditions, that the optimal value function is the unique solution to an optimality equation and that there exists a stationary optimal policy. We also give a policy improvement method.
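One plausible schematic form of such a criterion, with target set K, hitting time τ_K, and running reward r (purely an illustration under assumed notation, not necessarily the paper's exact model):

```latex
% Expected-minimum criterion up to the hitting time of the target set K:
J^{\pi}(x) = \mathbb{E}^{\pi}_{x}\Big[ \min_{0 \le t \le \tau_K} r(X_t) \Big],
% with a fixed-point equation in which min and expectation interleave:
v(x) = \max_{a \in A(x)} \sum_{y} p(y \mid x,a)\, \min\{\, r(x),\, v(y) \,\},
\qquad x \notin K .
```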

15.
This note deals with continuous-time Markov decision processes with a denumerable state space and the average cost criterion. The transition rates are allowed to be unbounded, and the action set is a Borel space. We give a new set of conditions under which the existence of optimal stationary policies is ensured by using the optimality inequality. Our results are illustrated with a controlled queueing model. Moreover, we use an example to show that our conditions do not imply the existence of a solution to the optimality equations in the previous literature.

17.
18.
The paper discusses the robustness of discrete-time Markov control processes whose transition probabilities are known only up to a certain degree of accuracy. Upper bounds on the increase of the discounted cost are derived when an optimal control policy of the approximating process is used to control the original one. The bounds are given in terms of the weighted total variation distance between transition probabilities. They hold for processes on Borel spaces with unbounded one-stage cost functions.
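Bounds of this kind generically take a "simulation lemma" shape; the version below, with discount factor α, weight function W, weighted total variation radius ε, and an unspecified constant C, is a schematic illustration rather than the paper's exact statement:

```latex
% If \sup_{x,a} \| P(\cdot \mid x,a) - \widehat{P}(\cdot \mid x,a) \|_{W} \le \varepsilon
% and \widehat{f} is optimal for the approximating kernel \widehat{P}, then
V_\alpha^{\widehat{f}}(x) - V_\alpha^{*}(x) \;\le\; \frac{C\,\alpha\,\varepsilon}{(1-\alpha)^2}\, W(x).
```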

19.
This paper studies the problem of the existence of stationary optimal policies for finite state controlled Markov chains, with compact action space and imperfect observations, under the long-run average cost criterion. It presents sufficient conditions for the existence of solutions to the associated dynamic programming equation that strengthen past results. There is a detailed discussion comparing the different assumptions commonly found in the literature.

20.
We study the optimal control problem for probabilistic discrete event systems using Markov decision processes. First, by introducing definitions of a cost function, an objective function, and an optimal function, we establish an optimality equation from which an optimal supervisor can be determined. From this optimality equation we then obtain the supremal controllable, ∈-containing closed sublanguage of a given language. Finally, we give algorithms for computing the optimal cost and an optimal supervisor.
