Similar literature
 19 similar documents found (search time: 109 ms)
1.
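Optimal policies for continuous-time MCPs on compact action sets   Total citations: 10 (self-citations: 2, citations by others: 8)
This paper studies the optimal control decision problem for a class of continuous-time Markov control processes (CTMCPs) under the infinite-horizon average-cost criterion. Using the basic properties of the infinitesimal generator and of performance potentials, it directly derives the optimality equation of the average-cost model on a compact action set together with an existence theorem for its solution. It proposes a value iteration algorithm for computing ε-optimal stationary control policies and proves that the algorithm converges. A numerical example illustrates the application of the method.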
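The kind of iteration this abstract describes can be sketched as follows. This is a minimal illustration, not the paper's algorithm: it assumes a finite state space, a finite action set (for example, a discretization of the compact action set), and a unichain model, and it applies standard relative value iteration to the uniformized chain. The function name, the data layout (`rates[i][a][j]`, `costs[i][a]`), and the example numbers are all hypothetical.

```python
def relative_value_iteration(rates, costs, tol=1e-10, max_iter=100_000):
    """Average-cost relative value iteration on a uniformized CTMDP.

    rates[i][a][j] -- transition rate from state i to state j under action a
                      (the entry for j == i is ignored)
    costs[i][a]    -- cost rate in state i under action a
    Returns (eta, policy): the estimated optimal average cost rate and a
    greedy stationary policy. Assumes a finite, unichain model.
    """
    n = len(rates)

    def exit_rate(i, a):
        return sum(r for j, r in enumerate(rates[i][a]) if j != i)

    # Uniformization constant chosen strictly above every exit rate, so each
    # state keeps a self-loop and the uniformized chain is aperiodic.
    lam = 1.0 + max(exit_rate(i, a)
                    for i in range(n) for a in range(len(rates[i])))

    h = [0.0] * n  # relative values, normalized so that h[0] == 0
    for _ in range(max_iter):
        # q[i][a] = f(i,a) + sum_j P_a(i,j) h(j), with P_a = I + A_a / lam
        q = [[costs[i][a] + h[i]
              + sum(rates[i][a][j] / lam * (h[j] - h[i])
                    for j in range(n) if j != i)
              for a in range(len(rates[i]))]
             for i in range(n)]
        t = [min(row) for row in q]
        eta = t[0] - h[0]                # current average-cost estimate
        new_h = [v - t[0] for v in t]    # subtract reference-state value
        if max(abs(x - y) for x, y in zip(new_h, h)) < tol:
            policy = [row.index(min(row)) for row in q]
            return eta, policy
        h = new_h
    raise RuntimeError("relative value iteration did not converge")


# Hypothetical two-state example: in state 1, action 1 returns to state 0
# faster (rate 4 instead of 2) at a higher cost rate (4 instead of 3).
example_rates = [[[0.0, 1.0], [0.0, 1.0]],
                 [[2.0, 0.0], [4.0, 0.0]]]
example_costs = [[0.0, 0.0], [3.0, 4.0]]
eta, policy = relative_value_iteration(example_rates, example_costs)
print(eta, policy)
```

In this toy model the faster action in state 1 is optimal and the average cost rate is approximately 0.8; with the slower action it would be 1.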

2.
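Iterative optimization algorithms for Markov control processes on compact action sets   Total citations: 5 (self-citations: 0, citations by others: 5)
This paper studies optimization algorithms for a class of continuous-time Markov control processes (CTMCPs) on compact action sets under the average-cost performance criterion. From the performance potential formula of a CTMCP and the average-cost optimality equation, it derives a policy iteration algorithm and a value iteration algorithm for computing optimal or suboptimal stationary control policies, and it proves the convergence of both algorithms without assuming that the iteration operator is an sp-contraction (span contraction). An example of a controlled queueing network illustrates the advantages of the approach.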
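A potential-based policy iteration of the general kind mentioned here might look like the sketch below. It is an assumption-laden illustration rather than the authors' method: it assumes finite states and actions and a unichain model, evaluates a policy by solving the Poisson equation f + A g = η e with the normalization g(0) = 0, and improves the policy by minimizing f(i,a) + Σ_j a_ij(a) g(j). All names and the example data are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def evaluate(rates, costs, policy):
    """Return (eta, g): average cost and potentials with g[0] = 0."""
    n = len(rates)
    matrix, rhs = [], []
    for i in range(n):
        row_rates = rates[i][policy[i]]
        gen = [row_rates[j] if j != i else 0.0 for j in range(n)]
        gen[i] = -sum(gen)                  # diagonal of the generator
        # Unknowns are (eta, g[1], ..., g[n-1]); row: -eta + sum_j A_ij g_j = -f_i
        matrix.append([-1.0] + gen[1:])
        rhs.append(-costs[i][policy[i]])
    sol = solve_linear(matrix, rhs)
    return sol[0], [0.0] + sol[1:]


def policy_iteration(rates, costs):
    n = len(rates)
    policy = [0] * n
    while True:
        eta, g = evaluate(rates, costs, policy)
        new_policy = []
        for i in range(n):
            q = [costs[i][a] + sum(rates[i][a][j] * (g[j] - g[i])
                                   for j in range(n) if j != i)
                 for a in range(len(rates[i]))]
            best = min(range(len(q)), key=q.__getitem__)
            # Keep the current action on ties so the iteration cannot cycle.
            keep = q[policy[i]] <= q[best] + 1e-12
            new_policy.append(policy[i] if keep else best)
        if new_policy == policy:
            return eta, policy, g
        policy = new_policy


# Hypothetical two-state example (same layout: rates[i][a][j], costs[i][a]).
example_rates = [[[0.0, 1.0], [0.0, 1.0]],
                 [[2.0, 0.0], [4.0, 0.0]]]
example_costs = [[0.0, 0.0], [3.0, 4.0]]
print(policy_iteration(example_rates, example_costs))
```

On this toy model the iteration starts from the policy that always uses action 0 (average cost 1), switches state 1 to action 1, and stops with an average cost of approximately 0.8.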

3.
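Simulation optimization algorithm for CTMDPs based on randomized stationary policies   Total citations: 4 (self-citations: 2, citations by others: 2)
Based on Markov performance potential theory and neuro-dynamic programming (NDP), this paper studies simulation optimization for a class of continuous-time Markov decision processes (MDPs) under randomized stationary policies. The proposed algorithm converts the continuous-time process into its uniformized Markov chain and then uses a single sample path of that chain to estimate the gradient of the average-cost performance measure with respect to the policy parameters, in order to find a suboptimal policy. The method is suited to performance optimization of systems with large state spaces. A numerical example of a controlled Markov process is given.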

4.
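Based on performance potential theory and the equivalent Markov process method, this paper studies a simulation optimization algorithm for a class of semi-Markov decision processes (SMDPs) under parameterized randomized stationary policies and briefly analyzes its convergence. Through the equivalent Markov process of the SMDP, a uniformized Markov chain is defined; the gradient of the SMDP's average-cost performance measure with respect to the policy parameters is then estimated from a single sample path of this uniformized chain in order to find an optimal (or suboptimal) policy. The algorithm uses a neural network to approximate the parameterized randomized stationary policy, which saves computer memory, avoids the "curse of dimensionality", and makes the method suitable for performance optimization of systems with large state spaces. A simulation example illustrates the application of the algorithm.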

5.
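Performance-potential-based optimality equations for a class of controlled closed queueing networks   Total citations: 1 (self-citations: 0, citations by others: 1)
This paper studies performance optimization for a class of controlled closed queueing network systems. It introduces two basic concepts, the discounted-cost α-performance potential and the average-cost performance potential, and discusses a relation between them. Under general assumptions, the basic properties of performance potentials are used to establish directly the optimality equation of the infinite-horizon average-cost model, and the existence of an optimal solution on a compact set is proved. Finally, an iterative policy optimization algorithm is given, and a practical numerical example illustrates its effectiveness.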

6.
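This paper studies the optimal stationary policy problem, under the infinite-horizon average-cost criterion, for a class of Markov control processes with a countable state space. For such processes, a discounted Poisson equation is introduced; using the basic properties of the infinitesimal matrix and of performance potentials, the optimality equation of the average-cost model on a compact action set is derived, and an existence theorem for its solution is proved.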
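For orientation, an average-cost optimality equation of this type is commonly written in the form below. The notation is an assumption, since the abstract gives none: S is the state space, D(i) the compact action set in state i, f(i,a) the cost rate, a_{ij}(a) the entries of the infinitesimal matrix under action a, g the performance potential, and η* the optimal average cost.

```latex
\eta^{*} \;=\; \min_{a \in D(i)} \Big\{ f(i,a) + \sum_{j \in S} a_{ij}(a)\, g(j) \Big\},
\qquad i \in S .
```

A stationary policy that attains the minimum in every state is then average-cost optimal, under the conditions the paper establishes.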

7.
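Path-node-driven low-cost shortest path tree algorithm   Total citations: 2 (self-citations: 0, citations by others: 2)
Dijkstra's algorithm is an excellent shortest-path algorithm that also produces a shortest path tree (SPT), and it is widely used in network computation and optimization. To optimize the cost of the shortest path tree, this paper proposes the idea of path-node driving and, based on it, designs a path-node-driven least-cost shortest path tree algorithm (LCSPT). With LCSPT, a node being computed maximizes the paths it shares with the current shortest path tree, which further improves the cost performance of the SPT and yields a high-performance SPT. The correctness of the algorithm is proved by mathematical induction. The cost performance of LCSPT is analyzed theoretically, including how it attains the minimum cost compared with similar algorithms, and its time and space complexity are analyzed. Finally, three simulation experiments verify the correctness of the algorithm in constructing SPTs and its least-cost shortest-path-tree property.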
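As background for this abstract, the sketch below shows plain Dijkstra producing a shortest path tree and the total edge cost of that tree, which is the quantity LCSPT aims to reduce. It does not reproduce LCSPT itself, whose path-sharing rule the abstract does not specify. The function names and the example graph are hypothetical.

```python
import heapq


def dijkstra_spt(graph, source):
    """Return (dist, parent) for a graph given as {u: {v: weight}}.

    parent encodes the shortest path tree: parent[v] is the predecessor
    of v on a shortest path from source (None for the source itself).
    Edge weights are assumed to be non-negative.
    """
    dist = {source: 0.0}
    parent = {source: None}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue  # stale heap entry
        done.add(u)
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if v not in dist or nd < dist[v]:
                dist[v] = nd
                parent[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, parent


def tree_cost(graph, parent):
    """Sum of the weights of the edges that make up the tree."""
    return sum(graph[p][v] for v, p in parent.items() if p is not None)


example_graph = {
    "s": {"a": 1, "b": 4},
    "a": {"b": 2, "c": 5},
    "b": {"c": 1},
    "c": {},
}
dist, parent = dijkstra_spt(example_graph, "s")
print(dist, parent, tree_cost(example_graph, parent))
```

Among several valid shortest path trees (when ties exist), different tie-breaking rules give different total tree costs; that is the degree of freedom a least-cost SPT algorithm exploits.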

8.
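Admission control for service demands with elastic response times in smart grids   Total citations: 1 (self-citations: 0, citations by others: 1)
This paper considers admission control for multiple types of service demands in a smart grid, exploiting the elasticity of response times to smooth fluctuations in the service load so that the long-run average cost of grid operation is minimized. To capture the random distribution of service demands and user behavior, a system analysis model based on a continuous-time Markov control process is established. Combined with sample-path-based estimation of performance potentials, a simulation-based policy iteration optimization algorithm is proposed; it effectively alleviates the curse of dimensionality caused by the large state space, converges quickly, and performs well in application. Simulation results verify the effectiveness of the proposed method.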

9.
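Using a performance-potential-based approach, this paper studies performance sensitivity analysis and average-cost performance optimization for a class of semi-Markov processes (SMPs). The SMP is transformed into an equivalent discrete-time Markov chain (DTMC), and the performance potentials of the DTMC are used to carry out sensitivity analysis and performance optimization of the SMP, yielding sensitivity analysis formulas and an optimality equation for the SMP based on DTMC performance potentials. A numerical example illustrates the application of the method.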

10.
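This paper discusses discounted-cost performance optimization for a class of semi-Markov control processes (SMCPs). By introducing a matrix that can serve as the infinitesimal matrix of a Markov process, a discounted Poisson equation is defined for an SMCP, and the α-potential is defined through this equation. Based on the α-potential, the optimality equation satisfied by optimal stationary policies is given. Finally, an iterative algorithm for computing an optimal stationary policy is given, and a numerical example illustrates its application.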

11.
The average cost optimal control problem is addressed for Markov decision processes with unbounded cost. It is found that the policy iteration algorithm generates a sequence of policies which are c-regular, where c is the cost function under consideration. This result only requires the existence of an initial c-regular policy and an irreducibility condition on the state space. Furthermore, under these conditions the sequence of relative value functions generated by the algorithm is bounded from below and “nearly” decreasing, from which it follows that the algorithm is always convergent. Under further conditions, it is shown that the algorithm does compute a solution to the optimality equations and hence an optimal average cost policy. These results provide elementary criteria for the existence of optimal policies for Markov decision processes with unbounded cost and recover known results for the standard linear-quadratic-Gaussian problem. In particular, in the control of multiclass queueing networks, it is found that there is a close connection between optimization of the network and optimal control of a far simpler fluid network model.

12.
The current study examines the dynamic vehicle allocation problems of the automated material handling system (AMHS) in semiconductor manufacturing. With the uncertainty involved in wafer lot movement, dynamically allocating vehicles to each intrabay is very difficult. The cycle time and overall tool productivity of the wafer lots are affected when a vehicle takes too long to arrive. In the current study, a Markov decision model is developed to study the vehicle allocation control problem in the AMHS. The objective is to minimize the sum of the expected long-run average transport job waiting cost. An interesting exhaustive structure in the optimal vehicle allocation control is found in accordance with the Markov decision model. Based on this exhaustive structure, an efficient algorithm is then developed to solve the vehicle allocation control problem numerically. The performance of the proposed method is verified by a simulation study. Compared with other methods, the proposed method can significantly reduce the waiting cost of wafer lots for AMHS vehicle transportation.

13.
Consider a hidden Markov model in which the realization of a single Markov chain is observed by a number of noisy sensors. The sensor scheduling problem for the resulting hidden Markov model is as follows: design an optimal algorithm for selecting, at each time instant, one of the many sensors to provide the next measurement. Each measurement has an associated measurement cost. The problem is to select an optimal measurement scheduling policy so as to minimize a cost function of estimation errors and measurement costs. The problem of determining the optimal measurement policy is solved via stochastic dynamic programming. Numerical results are presented.

14.
The ergodic or long-run average cost control problem for a partially observed finite-state Markov chain is studied via the associated fully observed separated control problem for the nonlinear filter. Dynamic programming equations for the latter are derived, leading to existence and characterization of optimal stationary policies.

15.
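Markov decision processes are used to study the optimal control problem for probabilistic discrete event systems. First, by introducing definitions of the cost function, the objective function, and the optimal function, an optimality equation that determines the optimal supervisor is established. This optimality equation is then used to obtain the maximal controllable, ∈-containment closed language of a given language. Finally, an algorithm for obtaining the optimal cost and the optimal supervisor is given.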

16.
We consider an infinite-horizon minimax optimal control problem for stochastic uncertain systems governed by a discrete-state uncertain continuous-time chain. Using existing risk-sensitive control results, a robust suboptimal absolutely stabilizing guaranteed cost controller is constructed. Conditions are presented under which this suboptimal controller is minimax optimal. We then present a numeric algorithm for calculating a robust (sub)optimal controller using a Markov chain approximation technique.

17.
The problem of access and service rate control in queuing systems is considered as a general optimization problem for a controlled Markov process with finite state space. Using the dynamic programming approach, we obtain the explicit form of the optimal control for minimizing a cost given as a mixture of the average queue length, the number of lost jobs, and service resources. The problem is considered on a finite time interval with a nonstationary input flow. For this case we suggest a general numerical solution procedure that can also be applied to problems with constraints.

18.
This paper applies the Dantzig-Wolfe decomposition technique to control systems governed by Markov chains, and the three usual types of costs: 1) the average cost attained until a target state is reached, 2) discounted cost, 3) average cost per unit time. Additional system constraints are allowed. A technique for subdividing or "essentially" decomposing the problem is developed, and a Markov interpretation is given to each subsystem. The special significance, for this problem, of the extreme points and rays of the subproblem is discussed.

19.
An algorithm which solves, numerically, the simultaneous stabilization problem using a constant gain decentralized control law is presented. The algorithm is determined from the necessary conditions for minimizing an optimal control problem. The optimal control problem consists of n plant models, each with its own cost function. The costs are summed to create an average cost function, and equality constraints are added to yield the decentralized control structure. The algorithm can start with any stabilizing full state feedback gain for each model and will converge to the optimal constant feedback gain for all models, assuming a solution exists. Examples of the algorithm are given for Kharitonov synthesis and optimal gain scheduled control law synthesis using output feedback.


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号