期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Adaptive control of constrained Markov chains

Altman E. Schwartz A. 《Automatic Control, IEEE Transactions on》1991,36(4):454-462

Adaptive control of finite-state Markov chains is discussed, The optimal performance is characterized through the minimization of a long-run average cost functional, subject to constraints on several other such functionals. Under mild structural and feasibility conditions, two explicit adaptive control policies are exhibited for the case where the transition probabilities are unknown. The policies are optimal under the constrained optimization criterion. They rely on a powerful estimation scheme which provides consistent estimators for the transition probabilities. This scheme is of independent interest, as it provides strong consistency under a large number of adaptive schemes and is independent of any identifiability conditions. As an application, an optimal adaptive policy is derived for a system of K competing queues with countable state space, for which the constrained criteria arise naturally in the context of communication networks 相似文献

2.

A counterexample on the optimality equation in Markov decision chains with the average cost criterion

Rolando Cavazos-Cadena 《Systems & Control Letters》1991,16(5)

We consider Markov decision processes with denumerable state space and finite control sets; the performance index of a control policy is a long-run expected average cost criterion and the cost function is bounded below. For these models, the existence of average optimal stationary policies was recently established in [11] under very general assumptions. Such a result was obtained via an optimality inequality. Here, we use a simple example to prove that the conditions in [11] do not imply the existence of a solution to the average cost optimality equation. 相似文献

3.

Adaptive control of Markov chains, I: Finite parameter set

Borkar V. Varaiya P. 《Automatic Control, IEEE Transactions on》1979,24(6):953-957

Consider a controlled Markov chain whose transition probabilities depend upon an unknown parameter α taking values in finite setA. To each α is associated a prespecified stationary control lawphi(alpha). The adaptive control law selects at each timetthe control action indicated byphi(alpha_{t})where αtis the maximum likelihood estimate of α. It is shown that αtconverges to a parameter α^*such that the "closed-loop" transition probabilities corresponding to α^*andphi(alpha^{ast})are the same as those corresponding to α⁰andphi(alpha)where α⁰is the true parameter. The situation when α⁰does not belong to the model setAis briefly discussed. 相似文献

4.

The average cost optimality equation for Markov control processes on Borel spaces

Raúl Montes-de-Oca 《Systems & Control Letters》1994,22(5)

This paper deals with discrete-time Markov control processes with Borel state space, allowing unbounded costs and noncompact control sets. For these models, the existence of average optimal stationary policies has been recently established under very general assumptions, using an optimality inequality. Here we give a condition, which is a strengtened version of a variant of the ‘vanishing discount factor’ approach, for the optimality equation to hold. 相似文献

5.

Multilayer control of large Markov chains

Forestier J.-P. Varaiya P. 《Automatic Control, IEEE Transactions on》1978,23(2):298-305

The computational burden associated with controlling a plant modeled as a Markov chain with a large number of states is addressed by proposing a two-layer feedback control structure. At the lower layer a regulator continuously monitors the plant. When the state of the plant reaches an extreme value, the supervisor at the higher layer intervenes to reset the regulator. It is shown that the plant dynamics and cost originally defined at the lower layer can be "lifted" to the supervisor layer and that the supervisor's control task can be defined in a way that permits wide flexibility in the design of the regulator. 相似文献

6.

Towards the optimal control of Markov chains with constraints

Boris Miller Author Vitae Gregory Miller^{Author Vitae} 《Automatica》2010,46(9):1495-1502

An optimal control problem with constraints is considered on a finite interval for a non-stationary Markov chain with a finite state space. The constraints are given as a set of inequalities. The optimal solution existence is proved under a natural assumption that the set of admissible controls is non-empty. The stochastic control problem is reduced to a deterministic one and it is shown that the optimal solution satisfies the maximum principle, moreover it can be chosen within a class of Markov controls. On the basis of this result an approach to the numerical solution is proposed and its implementation is illustrated by examples. 相似文献

7.

Adaptive control of linear systems with Markov perturbations

Dufour F. Elliott R.J. 《Automatic Control, IEEE Transactions on》1998,43(3):351-372

The stochastic model considered is a linear jump diffusion process X for which the coefficients and the jump processes depend on a Markov chain Z with finite state space. First, we study the optimal filtering and control problem for these systems with non-Gaussian initial conditions, given noisy observations of the state X and perfect measurements of Z. We derive a new sufficient condition which ensures the existence and the uniqueness of the solution of the nonlinear stochastic differential equations satisfied by the output of the filter. We study a quadratic control problem and show that the separation principle holds. Next, we investigate an adaptive control problem for a state process X defined by a linear diffusion for which the coefficients depend on a Markov chain, the processes X and Z being observed in independent white noises. Suboptimal estimates for the process X, Z and approximate control law are investigated for a large class of probability distributions of the initial state. Asymptotic properties of these filters and this control law are obtained. Upper bounds for the corresponding error are given 相似文献

8.

Drift and monotonicity conditions for continuous-time controlled Markov chains with an average criterion

Guo X. Hernandez-Lerma O. 《Automatic Control, IEEE Transactions on》2003,48(2):236-245

We give conditions for the existence of average optimal policies for continuous-time controlled Markov chains with a denumerable state-space and Borel action sets. The transition rates are allowed to be unbounded, and the reward/cost rates may have neither upper nor lower bounds. In the spirit of the "drift and monotonicity" conditions for continuous-time Markov processes, we propose a new set of conditions on the controlled process' primitive data under which the existence of optimal (deterministic) stationary policies in the class of randomized Markov policies is proved using the extended generator approach instead of Kolmogorov's forward equation used in the previous literature, and under which the convergence of a policy iteration method is also shown. Moreover, we use a controlled queueing system to show that all of our conditions are satisfied, whereas those in the previous literature fail to hold. 相似文献

9.

A note on optimality conditions for continuous-time Markov decisionprocesses with average cost criterion

Xianping Guo Ke Liu 《Automatic Control, IEEE Transactions on》2001,46(12):1984-1989

This note deals with continuous-time Markov decision processes with a denumerable state space and the average cost criterion. The transition rates are allowed to be unbounded, and the action set is a Borel space. We give a new set of conditions under which the existence of optimal stationary policies is ensured by using the optimality inequality. Our results are illustrated with a controlled queueing model. Moreover, we use an example to show that our conditions do not imply the existence of a solution to the optimality equations in the previous literature 相似文献

10.

Learning control of finite Markov chains with unknown transition probabilities

Sato M. Abe K. Takeda H. 《Automatic Control, IEEE Transactions on》1982,27(2):502-505

For a Markovian decision problem in which the transition probabilities are unknown, two learning algorithms are devised from the viewpoint of asymptotic optimality. Each time the algorithms select decisions to be used on the basis of not only the estimates of the unknown probabilities but also uncertainty of them. It is shown that the algorithms are asymptotically optimal in the sense that the probability of selecting an optimal policy converges to unity. 相似文献

11.

Causal coding and control for Markov chains

P. Varaiya J. Walrand 《Systems & Control Letters》1983,3(4):189-192

A Markov chain is controlled by a decision maker receiving his observations of the state via a noisy memoriless channel. That information is encoded causally. The encoder is assumed to have perfect channel feedback information.Separation results are derived and used to prove that encoding is useless for a class of symmetric channels.This paper extends the results of the authors (1983) by using methods similar to those of that paper. 相似文献

12.

Robust nonfragile guaranteed cost control for uncertain T-S fuzzy Markov jump systems with mode-dependent average dwell time and input constraint

Miao He 《International journal of systems science》2013,44(15):3146-3168

This paper aims to investigate the problem of robust nonfragile guaranteed cost control for uncertain Takagi-Sugeno fuzzy systems with Markov jump parameters, time-varying delay and input constraint. A nonfragile mode-dependent fuzzy controller with mode-dependent average dwell time (MDADT) is designed with input constraint. A sufficient condition is developed to ensure that the resulting closed-loop system is robust almost surely asymptotically stable with guaranteed cost index not exceeding the specified upper bound. Subsequently, the controller gain and upper bound of the guaranteed cost index can be obtained by solving a set of linear matrix inequalities. Finally, numerical and practical examples are provided to demonstrate the performance of the proposed approach. 相似文献

13.

Adaptive control of linear Markov jump systems

Daizhan Cheng 《International journal of systems science》2013,44(7):477-483

Stabilization of linear Markov jump systems via adaptive control is considered in this paper. The switching law is assumed to be unobservable Markov process. A sufficient condition is obtained for the stochastic stabilizability based on common quadratic Lyapunov functions (QLFs). The constructive proof provides a method to construct a sampling adaptive stabilizer. An example is used to describe the design of adaptive control, which stabilizes the system. 相似文献

14.

Adaptive guaranteed cost control of systems with uncertain parameters 总被引：10，自引：0，他引：10

Chang S. Peng T. 《Automatic Control, IEEE Transactions on》1972,17(4):474-483

Guaranteed cost control is a method of synthesizing a closed-loop system in which the controlled plant has large parameter uncertainty. This paper gives the basic theoretical development of guaranteed cost control, and shows how it can be incorporated into an adaptive system. The uncertainty in system parameters is reduced first by either: 1) on-line measurement and evaluation, or 2) prior knowledge on the parametric dependence of a certain easily measured situation parameter. Guaranteed cost control is then used to take up the residual uncertainty. It is shown that the uncertainty in system parameters can be taken care of by an additional term in the Riccati equation. A Fortran program for computing the guaranteed cost matrix and control law is developed and applied to an airframe control problem with large parameter variations. 相似文献

15.

Modeling genetic algorithms with Markov chains 总被引：12，自引：0，他引：12

Allen E. Nix Michael D. Vose 《Annals of Mathematics and Artificial Intelligence》1992,5(1):79-88

We model a simple genetic algorithm as a Markov chain. Our method is both complete (selection, mutation, and crossover are incorporated into an explicitly given transition matrix) and exact; no special assumptions are made which restrict populations or population trajectories. We also consider the asymptotics of the steady state distributions as population size increases.This research was supported by the National Science Foundation (IRI-8917545). 相似文献

16.

Pairwise Markov chains 总被引：1，自引：0，他引：1

Pieczynski W. 《IEEE transactions on pattern analysis and machine intelligence》2003,25(5):634-639

We propose a model called a pairwise Markov chain (PMC), which generalizes the classical hidden Markov chain (HMC) model. The generalization, which allows one to model more complex situations, in particular implies that in PMC the hidden process is not necessarily a Markov process. However, PMC allows one to use the classical Bayesian restoration methods like maximum a posteriori (MAP), or maximal posterior mode (MPM). So, akin to HMC, PMC allows one to restore hidden stochastic processes, with numerous applications to signal and image processing, such as speech recognition, image segmentation, and symbol detection or classification, among others. Furthermore, we propose an original method of parameter estimation, which generalizes the classical iterative conditional estimation (ICE) valid for a classical hidden Markov chain model, and whose extension to possibly non-Gaussian and correlated noise is briefly treated. Some preliminary experiments validate the interest of the new model. 相似文献

17.

Another set of conditions for average optimality in Markov control processes

Linn I. Sennott 《Systems & Control Letters》1995,24(2)

The existence of an average cost optimal stationary policy in a countable state Markov decision chain is shown under assumptions weaker than those given by Sennott (1989). This treatment is a modification of that given by Hu (1992), and is related to conditions of Hordijk (1977). An example is given for which the new axiom set holds whereas the axiom set of Sennott (1989) fails to hold. 相似文献

18.

An optimization-oriented approach to the adaptive control of Markov chains

Milito R. Cruz J. Jr. 《Automatic Control, IEEE Transactions on》1987,32(9):754-762

We consider the control of a dynamic system modeled as a Markov chain. The transition probability matrix of the Markov chain depends on the controluand also on an unknown parameter α⁰. The unknown parameter belongs to a given finite setA. The long run average cost depends on the control policy and the unknown parameter. Thus, a direct approach to the optimization of the performance is not feasible. A common procedure calls for an on-line estimation of the unknown parameter and the minimization of the cost functional using the estimate in lieu of the true parameter. It is well known that this "certainty equivalence" (CE) solution may fail to achieve optimal performance, even asymptotically. In this presentation of a new optimization-oriented approach to adaptive control, we consider a composite functional which simultaneously takes care of the estimation and control needs. The global minimum of this composite functional coincides with the minimum of the original cost functional. Thus, its joint minimization with respect to control and parameter estimates would yield the optimal control policy. This joint minimization is not feasible, but it suggests an algorithm that asymptotically achieves the desired goal. The transient behavior of the algorithm, as well as the situation whenalpha^{0}notin Aare also investigated. 相似文献

19.

Control of Markov chains with safety bounds

Arapostathis A. Kumar R. Hsu S.-P. 《Automation Science and Engineering, IEEE Transactions on》2005,2(4):333-343

相似文献

20.

A note on the vanishing interest rate approach in average Markov decision chains with continuous and bounded costs

Rolando Cavazos-Cadena 《Systems & Control Letters》1995,24(5):373-383

This work considers denumerable state Markov decision processes with discrete time parameter. The performance of a control policy is measured by the (lim sup) expected average cost criterion, the action sets are compact metric and the cost function is continuous and bounded. Within this framework, necessary and sufficient conditions are given so that the vanishing interest rate (VIR) method — also known as the vanishing discount effect approach — yields an average optimal stationary policy. 相似文献