Similar Documents
1.
We consider a linear-quadratic problem of minimax optimal control for stochastic uncertain control systems with output measurement. The uncertainty in the system satisfies a stochastic integral quadratic constraint. To convert the constrained optimization problem into an unconstrained one, a special S-procedure is applied. The resulting unconstrained game-type optimization problem is then converted into a risk-sensitive stochastic control problem with an exponential-of-integral cost functional. This is achieved via a certain duality relation between stochastic dynamic games and risk-sensitive stochastic control. The solution of the risk-sensitive stochastic control problem in terms of a pair of differential matrix Riccati equations is then used to establish a minimax optimal control law for the original uncertain system with uncertainty subject to the stochastic integral quadratic constraint. Date received: May 13, 1997. Date revised: March 18, 1998.

2.
We consider the zero-endpoint infinite-horizon LQ problem. We show that the existence of an optimal policy in the class of feedback controls is a sufficient condition for the existence of a stabilizing solution to the algebraic Riccati equation. This result is shown without assuming positive definiteness of the state weighting matrix. The feedback formulation of the optimization problem is natural in the context of differential games, and we provide a characterization of feedback Nash equilibria in both a deterministic and a stochastic context.

3.
This paper studies a class of linear-quadratic stochastic differential games driven by jump-diffusion processes with Poisson jumps, covering both Nash equilibrium strategies for nonzero-sum games and saddle-point equilibrium strategies for zero-sum games. Using the maximum principle of differential games, it is shown that the existence of Nash equilibrium strategies is equivalent to the solvability of two cross-coupled matrix Riccati equations, and that the existence of saddle-point equilibrium strategies is equivalent to the solvability of a single matrix Riccati equation; explicit expressions for the equilibrium strategies and the optimal performance values are given. Finally, the results are applied to the stochastic H2/H∞ and stochastic H∞ control problems of modern robust control, yielding existence conditions and explicit expressions for the robust control strategies, and their application to a portfolio optimization problem in financial markets is verified.

4.
We present new algorithms for determining optimal strategies for two-player games with probabilistic moves and reachability winning conditions. Such games, known as simple stochastic games, were extensively studied by A. Condon [Anne Condon. The complexity of stochastic games. Information and Computation, 96(2):203–224, 1992; Anne Condon. On algorithms for simple stochastic games. In Jin-Yi Cai, editor, Advances in Computational Complexity Theory, volume 13 of DIMACS Series in Discrete Mathematics and Theoretical Computer Science, pages 51–73. AMS, 1993]. Many interesting problems, including parity games and hence also mu-calculus model checking, can be reduced to simple stochastic games. It is an open problem whether simple stochastic games can be solved in polynomial time. Our algorithms determine the optimal expected payoffs in the game. We use a geometric interpretation of the search space as a subset of the hypercube [0,1]^N. The main idea is to divide this set into convex subregions in which linear optimization methods can be used. We show how one can proceed from one subregion to the next so that, eventually, a region containing the optimal payoffs is found. The total number of subregions is exponential in the size of the game but, in practice, the algorithms need to visit only a few of them to find a solution. We believe that our new algorithms could provide new insights into the difficult problem of determining the algorithmic complexity of simple stochastic games and other, equivalent problems.
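The optimal expected payoffs this abstract refers to can also be approximated by plain value iteration on the payoff vector in [0,1]^N. The sketch below is a toy illustration of that search space (illustrative names; not the authors' convex-subregion algorithm): each vertex of a simple stochastic game is a MAX, MIN, or AVG (coin-flip) vertex, and the operator is iterated to a fixed point.

```python
# Toy value iteration for a simple stochastic game.
# kind[v] in {'max', 'min', 'avg'}; succ[v] = (left, right) successors;
# sinks maps terminal vertices to fixed payoffs in [0, 1].
def value_iteration(kind, succ, n, sinks, iters=2000):
    v = [0.0] * n
    for s, p in sinks.items():
        v[s] = p
    for _ in range(iters):
        for i in range(n):
            if i in sinks:
                continue
            a, b = succ[i]
            if kind[i] == 'max':
                v[i] = max(v[a], v[b])
            elif kind[i] == 'min':
                v[i] = min(v[a], v[b])
            else:  # average vertex: fair coin flip
                v[i] = 0.5 * (v[a] + v[b])
    return v

# 4-vertex game: 0 is MAX over (1, 2); 1 is AVG over (2, 3); 2 and 3 are sinks.
vals = value_iteration(
    kind={0: 'max', 1: 'avg', 2: 'sink', 3: 'sink'},
    succ={0: (1, 2), 1: (2, 3)},
    n=4, sinks={2: 0.0, 3: 1.0},
)
```

Value iteration only converges in the limit for general games, which is one reason exact methods like the subregion decomposition described above are of interest.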

5.
This paper studies stochastic Nash differential games for linear Markov jump systems. First, using results on stochastic optimal control of linear Markov jump systems, it is shown that the existence of Nash equilibrium solutions over both finite and infinite horizons is equivalent to the solvability of the corresponding differential (algebraic) Riccati equations, and explicit forms of the optimal solutions are given. The differential-game results are then applied to the mixed H2/H∞ control problem for linear Markov jump systems. Finally, a numerical example verifies the feasibility of the proposed method.

6.
In this paper, we consider the feedback control on nonzero-sum linear quadratic (LQ) differential games in finite horizon for discrete-time stochastic systems with Markovian jump parameters and multiplicative noise. Four-coupled generalized difference Riccati equations (GDREs) are obtained, which are essential to find the optimal Nash equilibrium strategies and the optimal cost values of the LQ differential games. Furthermore, an iterative algorithm is given to solve the four-coupled GDREs. Finally, a suboptimal solution of the LQ differential games is proposed based on a convex optimization approach and a simplification of the suboptimal solution is given. Simulation examples are presented to illustrate the effectiveness of the iterative algorithm and the suboptimal solution.
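The coupled GDREs above are solved backward in time; the single-player scalar analogue of that recursion is the standard LQ Riccati difference equation. The sketch below (a deliberate simplification for illustration, not the paper's coupled Markov-jump equations) runs the backward recursion and, for A = B = Q = R = 1, converges to the stationary solution, which is the golden ratio.

```python
# Scalar backward Riccati difference recursion for a finite-horizon LQ problem:
#   P_N = QT,  P_k = Q + A^2 P_{k+1} - (A B P_{k+1})^2 / (R + B^2 P_{k+1}),
# with feedback law u_k = -K_k x_k, K_k = A B P_{k+1} / (R + B^2 P_{k+1}).
def lq_riccati_backward(A, B, Q, R, QT, N):
    P = QT
    gains = []
    for _ in range(N):
        K = (A * B * P) / (R + B * B * P)
        P = Q + A * A * P - (A * B * P) ** 2 / (R + B * B * P)
        gains.append(K)
    gains.reverse()          # gains[k] is the feedback gain at stage k
    return P, gains

# For A = B = Q = R = 1 the recursion converges to P = (1 + sqrt(5)) / 2.
P0, gains = lq_riccati_backward(A=1.0, B=1.0, Q=1.0, R=1.0, QT=0.0, N=50)
```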

7.
This paper considers a discrete-time stochastic optimal control problem in which the system is only partially observed through the measurement equation and contains unknown constant parameters taking values in a finite set. Because the cost-to-go function at each stage contains a variance term, and the non-separability of the variance prevents dynamic programming from being applied directly, the optimal solution had not previously been found. In this paper, a new approach to the optimal solution is proposed by embedding the original non-separable problem into a separable auxiliary problem. A theoretical condition is established under which the optimal solution of the original problem can be recovered from a set of solutions of the auxiliary problem. In addition, the optimality of the interchanging algorithm is proved and an analytical expression for the optimal control is obtained. The performance of this controller is illustrated with a simple example.

8.
Algorithms and mathematical models are developed for the technological process of a primary oil refinery operating under uncertainty; the solution of the optimal control problem is formulated as a stochastic program with probabilistic characteristics. For the optimization problem, the development of decomposition algorithms based on the Lagrange method is described, and a method based on transforming the original problem into a deterministic analogue is proposed. The construction of an optimal control system based on the developed models, the optimization algorithm, and the principles of automatic control of the regime parameters of the primary oil refinery installation is considered.

9.
The average cost optimal control problem is addressed for Markov decision processes with unbounded cost. It is found that the policy iteration algorithm generates a sequence of policies which are c-regular, where c is the cost function under consideration. This result only requires the existence of an initial c-regular policy and an irreducibility condition on the state space. Furthermore, under these conditions the sequence of relative value functions generated by the algorithm is bounded from below and "nearly" decreasing, from which it follows that the algorithm is always convergent. Under further conditions, it is shown that the algorithm does compute a solution to the optimality equations and hence an optimal average cost policy. These results provide elementary criteria for the existence of optimal policies for Markov decision processes with unbounded cost and recover known results for the standard linear-quadratic-Gaussian problem. In particular, in the control of multiclass queueing networks, it is found that there is a close connection between optimization of the network and optimal control of a far simpler fluid network model.
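A minimal sketch of the policy iteration algorithm on a tiny finite MDP. For simplicity it uses a discounted cost rather than the average-cost, unbounded-cost setting analyzed above, and all names are illustrative.

```python
def policy_iteration(P, c, gamma=0.9):
    """P[a][s] is the transition distribution over next states under action a;
    c[s][a] is the one-step cost. Returns an optimal policy and its value."""
    nS, nA = len(c), len(c[0])
    policy = [0] * nS
    while True:
        # policy evaluation: iterate the fixed-point equation for V^policy
        V = [0.0] * nS
        for _ in range(500):
            V = [c[s][policy[s]] +
                 gamma * sum(P[policy[s]][s][t] * V[t] for t in range(nS))
                 for s in range(nS)]
        # policy improvement: greedy one-step lookahead
        new = [min(range(nA),
                   key=lambda a, s=s: c[s][a] +
                   gamma * sum(P[a][s][t] * V[t] for t in range(nS)))
               for s in range(nS)]
        if new == policy:
            return policy, V
        policy = new

# two states, two actions: staying in state 1 costs 2 per step, so paying
# a one-time cost of 1 to jump back to the cheap state 0 is optimal
P = [[[1.0, 0.0], [0.0, 1.0]],   # action 0: stay put
     [[0.0, 1.0], [1.0, 0.0]]]   # action 1: switch states
c = [[0.0, 1.0], [2.0, 1.0]]
policy, V = policy_iteration(P, c)
```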

10.
Non-cooperative decision-making problems in a decentralized supply chain can be characterized and studied using a stochastic game model. In an earlier paper, the authors developed a methodology that uses machine learning for finding (near) optimal policies for non-zero sum stochastic games, and applied their methodology on an N-retailer and W-warehouse inventory-planning problem. The focus of this paper is on making the methodology more amenable to practical applications by making it completely simulation-based. It is also demonstrated, through numerical example problems, how this methodology can be used to find (near) equilibrium policies, and evaluate short-term rewards of stochastic games. Short-term rewards of stochastic games could be, in many instances, more critical than equilibrium rewards. To our knowledge, no methodology exists in the open literature that can capture the short-term behaviour of non-zero sum stochastic games as examined in this paper.

11.
This paper reports on applications of optimal control theory to the analysis of macroeconomic policies for Slovenia on its way into the Euro Area. For this purpose, the model SLOPOL4, a macroeconometric model for Slovenia, is used. Optimal policies are calculated using the OPTCON algorithm, an algorithm for determining (approximately) optimal solutions to deterministic and stochastic control problems. We determine optimal exchange rate and fiscal policies for Slovenia as solutions to optimum control problems with a quadratic objective function and the model SLOPOL4 as constraint. Several optimization experiments under different assumptions about the exchange rate regime are carried out. The sensitivity of the results with respect to several assumptions is investigated; in particular, the reaction of the optimal paths to variations in the stochastic character of the model parameters is examined. If the stochastic nature of more parameters is taken into consideration, the resulting policies are closer to the deterministic solution than with only a few stochastic parameters.

12.
We describe a framework for analyzing probabilistic reachability and safety problems for discrete time stochastic hybrid systems within a dynamic games setting. In particular, we consider finite horizon zero-sum stochastic games in which a control has the objective of reaching a target set while avoiding an unsafe set in the hybrid state space, and a rational adversary has the opposing objective. We derive an algorithm for computing the maximal probability of achieving the control objective, subject to the worst-case adversary behavior. From this algorithm, sufficient conditions of optimality are also derived for the synthesis of optimal control policies and worst-case disturbance strategies. These results are then specialized to the safety problem, in which the control objective is to remain within a safe set. We illustrate our modeling framework and computational approach using both a tutorial example with jump Markov dynamics and a practical application in the domain of air traffic management.
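On a finite abstraction, the max-min reach-avoid computation described above is a short backward dynamic program. The sketch below is a hypothetical finite-state stand-in (the paper's setting is a continuous hybrid state space): at each step the control maximizes, and the adversary minimizes, the probability of hitting the target set without entering the unsafe set.

```python
def reach_avoid_value(P, target, unsafe, horizon):
    """P[u][d][s] is the next-state distribution under control u, adversary d.
    Returns V[s]: the max-min probability of reaching `target` while
    avoiding `unsafe` within `horizon` steps."""
    nU, nD, nS = len(P), len(P[0]), len(P[0][0])
    V = [1.0 if s in target else 0.0 for s in range(nS)]
    for _ in range(horizon):
        W = []
        for s in range(nS):
            if s in target:
                W.append(1.0)            # already succeeded
            elif s in unsafe:
                W.append(0.0)            # failed: unsafe set is absorbing
            else:
                W.append(max(min(sum(P[u][d][s][t] * V[t] for t in range(nS))
                                 for d in range(nD))
                             for u in range(nU)))
        V = W
    return V

# 3 states: 0 transient, 1 = target, 2 = unsafe; one control, two adversary moves
P = [[
    [[0.2, 0.5, 0.3], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],  # adversary move 0
    [[0.2, 0.3, 0.5], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],  # adversary move 1
]]
V = reach_avoid_value(P, target={1}, unsafe={2}, horizon=2)
```

The worst-case adversary always picks move 1 here, which diverts more probability mass into the unsafe set.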

13.
We consider a class of finite time horizon optimal control problems for continuous time linear systems with a convex cost, convex state constraints and non-convex control constraints. We propose a convex relaxation of the non-convex control constraints, and prove that the optimal solution of the relaxed problem is also an optimal solution for the original problem, which is referred to as the lossless convexification of the optimal control problem. The lossless convexification enables the use of interior point methods of convex optimization to obtain globally optimal solutions of the original non-convex optimal control problem. The solution approach is demonstrated on a number of planetary soft landing optimal control problems.

14.
This paper solves a finite-horizon partially observed risk-sensitive stochastic optimal control problem for discrete-time nonlinear systems and obtains small-noise and small-risk limits. The small-noise limit is interpreted as a deterministic partially observed dynamic game, and new insights into the optimal solution of such game problems are obtained. Both the risk-sensitive stochastic control problem and the deterministic dynamic game problem are solved using information states, dynamic programming, and associated separated policies. A certainty equivalence principle is also discussed. The authors' results have implications for the nonlinear robust stabilization problem. The small-risk limit is a standard partially observed risk-neutral stochastic optimal control problem.
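The risk-sensitive criterion in this line of work is an exponential-of-integral cost, J(θ) = (1/θ) log E[exp(θ ∫ (x² + u²) dt)], which can be estimated directly by Monte Carlo for a fixed policy. The sketch below uses a scalar toy system with an assumed linear feedback gain (illustrative only; not the paper's information-state solution). By the power-mean inequality the estimate is non-decreasing in the risk parameter θ.

```python
import math
import random

def risk_sensitive_cost(theta, K, n_paths=5000, T=1.0, n=50, seed=1):
    """Monte Carlo estimate of (1/theta) * log E[exp(theta * integral cost)]
    for dx = (-x + u) dt + 0.3 dW with linear feedback u = -K x (Euler)."""
    rng = random.Random(seed)
    dt = T / n
    acc = 0.0
    for _ in range(n_paths):
        x, integral = 1.0, 0.0
        for _ in range(n):
            u = -K * x
            integral += (x * x + u * u) * dt
            x += (-x + u) * dt + 0.3 * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        acc += math.exp(theta * integral)
    return math.log(acc / n_paths) / theta

j_cautious = risk_sensitive_cost(theta=0.5, K=0.4)
j_averse = risk_sensitive_cost(theta=2.0, K=0.4)
```

As θ → 0 the criterion recovers the risk-neutral expected cost, which is the small-risk limit the abstract refers to.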

15.
In this paper, the preview control problem for a class of linear continuous-time stochastic systems with multiplicative noise is studied using the augmented error system method. First, a deterministic assistant system is introduced, and the original system is translated into the assistant system. An integrator is then employed to ensure that the output of the closed-loop system tracks the reference signal accurately. Second, the augmented error system, which includes the integrator vector, the control vector and the reference signal, is constructed from the translated system. As a result, the tracking problem is transformed into an optimal control problem for the augmented error system, and the optimal control input is obtained by dynamic programming. This control input serves as the preview controller of the original system. For a linear stochastic system with multiplicative noise, the difficulty that an augmented error system cannot be constructed by the derivation method is resolved in this paper, and the existence and uniqueness of the solution of the Riccati equation corresponding to the stochastic augmented error system are discussed. Numerical simulations show that the preview controller designed in this paper is very effective.

16.
This paper addresses an optimization problem for a class of large-scale systems and proposes a hierarchical optimization method. The original problem is first transformed into a multi-objective optimization problem, and it is proved that the optimal solution of the original problem lies in the non-inferior (Pareto) solution set of the multi-objective problem. An algorithm for selecting the original problem's optimal solution from this solution set is given, together with its theoretical foundation. Simulation results verify the effectiveness of the algorithm.

17.
In this paper, the parametric optimization method is used to find optimal control laws for fractional systems. The proposed approach is based on the use of the fractional variational iteration method to convert the original optimal control problem into a nonlinear optimization problem. The control variable is parameterized by unknown parameters to be determined, and its expression is substituted into the system state-space model. The resulting fractional ordinary differential equations are solved by the fractional variational iteration method, which provides an approximate analytical expression of the closed-form solution of the state equations. This solution is a function of time and of the unknown parameters of the control law. By substituting this solution into the performance index, the original fractional optimal control problem reduces to a nonlinear optimization problem in which the unknown parameters introduced in the parameterization procedure are the optimization variables. To solve this nonlinear optimization problem, the Alienor global optimization method is used to find the global optimal values of the control law parameters. The proposed approach is illustrated by two application examples taken from the literature.
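The control-parameterization step reduces the optimal control problem to a finite-dimensional optimization over the parameters. The sketch below mimics that reduction on an integer-order toy problem: a linear-feedback parameterization u = θx, explicit Euler integration, and a plain grid search stand in for the fractional dynamics, the variational iteration method, and the Alienor method of the paper; all names and values are illustrative.

```python
def cost(theta, x0=1.0, T=2.0, n=2000):
    """Integrate dx/dt = -x + u with u = theta * x (assumed parameterization)
    and accumulate the quadratic performance index J = integral of (x^2 + u^2)."""
    dt = T / n
    x, J = x0, 0.0
    for _ in range(n):
        u = theta * x
        J += (x * x + u * u) * dt
        x += (-x + u) * dt          # explicit Euler step
    return J

# the optimization stage: a coarse grid search over the single parameter
best_cost, best_theta = min((cost(th / 100.0), th / 100.0)
                            for th in range(-200, 201))
```

For this linear-quadratic toy the infinite-horizon optimum is θ = 1 − √2 ≈ −0.414, which the grid search approaches.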

18.
Sampled fictitious play (SFP) is a recently proposed iterative learning mechanism for computing Nash equilibria of non-cooperative games. For games of identical interests, every limit point of the sequence of mixed strategies induced by the empirical frequencies of best response actions that players in SFP play is a Nash equilibrium. Because discrete optimization problems can be viewed as games of identical interests wherein Nash equilibria define a type of local optimum, SFP has recently been employed as a heuristic optimization algorithm with promising empirical performance. However, there have been no guarantees of convergence to a globally optimal Nash equilibrium established for any of the problem classes considered to date. In this paper, we introduce a variant of SFP and show that it converges almost surely to optimal policies in model-free, finite-horizon stochastic dynamic programs. The key idea is to view the dynamic programming states as players, whose common interest is to maximize the total multi-period expected reward starting in a fixed initial state. We also offer empirical results suggesting that our SFP variant is effective in practice for small to moderate sized model-free problems.
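Plain sampled fictitious play can be sketched in a few lines for a two-player identical-interest matrix game. The toy below (illustrative, not the authors' variant) starts at the inferior joint action and locks into the pure equilibrium (0, 0) even though (1, 1) earns the higher common payoff, which is exactly the local-optimum behaviour that motivates the global convergence guarantee developed in the paper.

```python
import random

def sampled_fictitious_play(U, iters=500, seed=0):
    """U[i][j] is the common payoff when player 1 plays i and player 2 plays j.
    Each round, every player best-responds to one action sampled from the
    opponent's empirical history of best responses."""
    rng = random.Random(seed)
    h1, h2 = [0], [0]                 # action histories, arbitrary start
    for _ in range(iters):
        j = rng.choice(h2)            # sample from player 2's history
        i = rng.choice(h1)            # sample from player 1's history
        a1 = max(range(len(U)), key=lambda a: U[a][j])
        a2 = max(range(len(U[0])), key=lambda b: U[i][b])
        h1.append(a1)
        h2.append(a2)
    return h1[-1], h2[-1]

# identical-interest coordination game: (0, 0) pays 1, (1, 1) pays 2
U = [[1, 0], [0, 2]]
final = sampled_fictitious_play(U)
```

Since (0, 0) is itself a Nash equilibrium of the identical-interest game, plain SFP never escapes it from this start, while the paper's variant is shown to reach the globally optimal policy almost surely.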

19.
In this paper, we consider a two-player stochastic differential game problem over an infinite time horizon where the players invoke controller and stopper strategies on a nonlinear stochastic differential game problem driven by Brownian motion. The optimal strategies for the two players are given explicitly by exploiting connections between stochastic Lyapunov stability theory and stochastic Hamilton–Jacobi–Isaacs theory. In particular, we show that asymptotic stability in probability of the differential game problem is guaranteed by means of a Lyapunov function which can clearly be seen to be the solution to the steady-state form of the stochastic Hamilton–Jacobi–Isaacs equation, and hence, guaranteeing both stochastic stability and optimality of the closed-loop control and stopper policies. In addition, we develop optimal feedback controller and stopper policies for affine nonlinear systems using an inverse optimality framework tailored to the stochastic differential game problem. These results are then used to provide extensions of the linear feedback controller and stopper policies obtained in the literature to nonlinear feedback controllers and stoppers that minimise and maximise general polynomial and multilinear performance criteria.

20.
Optimal control of stochastic systems with Markovian jumps and multiplicative noise
孔淑兰, 张召生. 《自动化学报》, 2012, 38(7): 1113–1118
This paper discusses the optimal control problem for stochastic systems with N players. A Pareto-optimal controller is designed for infinite-horizon stochastic systems with Markovian jumps and multiplicative noise. Using a generalized Lyapunov method and the solution of stochastic algebraic Riccati equations, the Pareto-optimal solution of the system is obtained; it is proved that the optimal controller is a stabilizing feedback controller, and that the solution of the stochastic algebraic Riccati equation appearing in the feedback gain of the optimal controller is the minimal solution.

