Similar literature
1.
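A simulation optimization algorithm for CTMDPs based on randomized stationary policies   Total citations: 2 (self-citations: 2; citations by others: 2)
Based on Markov performance potential theory and the neuro-dynamic programming (NDP) method, this paper studies simulation optimization for a class of continuous-time Markov decision processes (MDPs) under randomized stationary policies. The proposed algorithm converts the continuous-time process into its uniformized Markov chain and then uses a single sample path of that chain to estimate the gradient of the average-cost performance measure with respect to the policy parameters, in order to find a suboptimal policy. The method is suited to performance optimization of systems with large state spaces. A numerical example of a controlled Markov process is given.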
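
The uniformization step mentioned in this abstract can be illustrated with a minimal sketch. It assumes a finite state space and a generator matrix `Q` whose rows sum to zero; the function name `uniformize` and the two-state example are hypothetical and are not taken from the paper.

```python
def uniformize(Q, rate=None):
    """Return P = I + Q / rate, the transition matrix of the uniformized chain.

    Q is a generator matrix (list of rows, each row summing to zero).
    rate must be at least the largest exit rate max_i(-Q[i][i]).
    """
    n = len(Q)
    max_exit_rate = max(-Q[i][i] for i in range(n))
    if rate is None:
        rate = max_exit_rate  # smallest admissible uniformization rate
    if rate <= 0 or rate < max_exit_rate:
        raise ValueError("rate must be positive and at least the largest exit rate")
    return [
        [(1.0 if i == j else 0.0) + Q[i][j] / rate for j in range(n)]
        for i in range(n)
    ]


# Hypothetical two-state generator, used only for illustration.
Q = [[-2.0, 2.0], [1.0, -1.0]]
P = uniformize(Q)
for row in P:
    print(row)
```

The resulting discrete-time chain has the same stationary distribution as the continuous-time process, which is why sample paths of the uniformized chain can be used to estimate average-cost quantities.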

2.
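An online optimization algorithm for Markov control processes based on a single sample path   Total citations: 3 (self-citations: 1; citations by others: 3)
Building on Markov performance potential theory, this paper studies performance optimization algorithms for Markov control processes. Unlike traditional computation-based methods, the algorithms estimate the gradient of the performance measure with respect to the policy parameters from the simulation of a single sample path, in order to find an optimal (or suboptimal) randomized stationary policy. Because suitable algorithm parameters can be chosen according to the characteristics of the actual system, the approach can meet the online optimization needs of different engineering systems. Finally, the convergence of these algorithms with probability 1 on an infinitely long sample path is briefly analyzed, and a numerical example of a three-state controlled Markov process is given.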
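
As a rough illustration of the single-sample-path idea, the sketch below estimates the average cost and the performance potentials of a small discrete-time chain from one simulated path. It is not the algorithm of the paper: the function name `estimate_from_path`, the two-state chain, the cost vector, and the choice of a reference state are all assumptions made for the example. The potential of state i is taken to be the expected accumulated value of (cost minus average cost) from a visit to i until the path next reaches the reference state.

```python
import random


def estimate_from_path(P, f, steps, ref=0, seed=0):
    """Estimate the average cost and potentials from one simulated sample path."""
    rng = random.Random(seed)
    n = len(P)
    state = ref
    path = []
    total_cost = 0.0
    for _ in range(steps):
        path.append(state)
        total_cost += f[state]
        # Draw the next state according to row `state` of P.
        state = rng.choices(range(n), weights=P[state])[0]
    eta = total_cost / steps  # time-average cost along the path

    sums = [0.0] * n
    counts = [0] * n
    acc = None  # unknown until a visit to the reference state is seen (scanning backward)
    for s in reversed(path):
        if s == ref:
            acc = 0.0  # the potential of the reference state is fixed at 0
        elif acc is not None:
            acc += f[s] - eta  # accumulated centered cost until the next visit to ref
            sums[s] += acc
            counts[s] += 1
    g = [sums[i] / counts[i] if counts[i] else 0.0 for i in range(n)]
    return eta, g


# Hypothetical two-state chain and cost vector, used only for illustration.
P = [[0.0, 1.0], [0.5, 0.5]]
f = [1.0, 3.0]
eta_hat, g_hat = estimate_from_path(P, f, steps=200_000)
print(eta_hat, g_hat)
```

For this example the exact values are an average cost of 7/3 and potentials (0, 4/3), so the printed estimates should be close to those numbers. In potential-based methods, estimates of this kind are the building blocks of the gradient of the average cost with respect to policy parameters.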

3.
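For the optimization of continuous-time partially observable Markov decision processes (CTPOMDPs), this paper proposes a policy gradient estimation method. Using uniformization, the gradient estimation algorithm for discrete-time partially observable Markov decision processes (DTPOMDPs) is extended to the continuous-time model. The convergence of the algorithm and its error estimates are studied, and a numerical example illustrates its application.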

4.
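This paper studies optimal stationary policies for a class of Markov control processes with countable state space under the infinite-horizon average-cost criterion. A discounted Poisson equation is introduced for such processes. Using basic properties of the infinitesimal matrix and of performance potentials, the optimality equation of the average-cost model on a compact action set is derived, and an existence theorem for its solution is proved.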

5.
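In intelligent planning, finding a plan is NP or even NP-complete. When the effects of actions are uncertain, as in planning for Markov decision processes, solving the problem becomes harder still. Existing planning algorithms for Markov decision processes usually describe the actual effect of an action with a single aggregate state node, in an attempt to avoid the internal complexity of states, whereas many real actions produce several propositional effects that correspond to several proposition nodes. To handle this, the concepts of image actions, image path segments, and image planning graphs are introduced, and an ant colony planning algorithm for Markov decision processes is built on them. It is also proved that the solutions found by the algorithm are reliable with at least a certain probability, even in an uncertain execution environment.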

6.
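A class of semi-Markov decision problems is first discussed under the discounted-cost and the average-cost performance criteria. Based on the performance potential method, the optimality equations satisfied by optimal stationary policies are derived. The relationship between the two models is then discussed, showing that the results for the average-cost model can be obtained as the limit of the corresponding results for the discounted model as the discount factor tends to zero.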
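
The vanishing-discount relationship can be checked numerically in a much simpler setting than the paper's. The sketch below assumes an uncontrolled two-state continuous-time Markov chain rather than a semi-Markov decision problem; the function name `discounted_values`, the generator `Q`, and the cost rates `f` are hypothetical. The discounted value V solves (alpha*I - Q) V = f, and alpha * V(i) should approach the average cost as alpha tends to zero.

```python
def discounted_values(Q, f, alpha):
    """Solve (alpha*I - Q) V = f for a two-state generator Q by Cramer's rule."""
    a, b = alpha - Q[0][0], -Q[0][1]
    c, d = -Q[1][0], alpha - Q[1][1]
    det = a * d - b * c
    return [(d * f[0] - b * f[1]) / det, (a * f[1] - c * f[0]) / det]


# Hypothetical generator and cost rates, used only for illustration.
Q = [[-2.0, 2.0], [1.0, -1.0]]
f = [1.0, 3.0]

# Stationary distribution of Q is (1/3, 2/3), so the average cost is 7/3.
average_cost = (1.0 / 3.0) * f[0] + (2.0 / 3.0) * f[1]

for alpha in (1.0, 0.1, 0.01, 0.001):
    V = discounted_values(Q, f, alpha)
    print(alpha, [alpha * v for v in V], average_cost)
```

The printed values of alpha * V(i) move toward the same constant for both initial states as alpha shrinks, which is the sense in which average-cost results arise as a limit of discounted ones.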

7.
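This paper discusses discounted-cost performance optimization for a class of semi-Markov control processes (SMCPs). By introducing a matrix that can serve as the infinitesimal matrix of a Markov process, a discounted Poisson equation is defined for an SMCP, and the α-potential is defined through this equation. Based on the α-potential, the optimality equation satisfied by optimal stationary policies is given. Finally, an iterative algorithm for finding an optimal stationary policy is presented, with a numerical example to illustrate its application.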

8.
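Stochastic switching model and policy optimization for dynamic power management   Total citations: 2 (self-citations: 0; citations by others: 2)
A policy optimization method for dynamic power management based on continuous-time Markov decision processes is proposed. By building a stochastic switching model of the dynamic power management system, the power management problem is converted into a constrained policy optimization problem, and a policy gradient optimization algorithm based on vector composition is given. The stochastic switching model describes the dynamic power management system accurately, and the policy optimization algorithm is simple and effective; it can be computed offline and is also suitable for online optimization. Simulation experiments verify the effectiveness of the method.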

9.
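Using fuzzy mathematics, this paper studies interval fuzzy stable sets of dynamic cooperative games with transferable utility. A Markov stochastic process is first used to describe the structural transitions of the dynamic cooperative coalition. For the case in which the payoff function is a triangular fuzzy number, the cut-set value regions of the cooperative game are constructed for different confidence levels α, and the interval fuzzy stable sets at different time points are then computed by combining them with the state transition matrix of the dynamic coalition. Because coalition members need to divide the coalition's payoff once the cooperation ends, the constructed interval fuzzy stable sets are used to give intervals of feasible payoff-allocation values for the members. Finally, an example illustrates the effectiveness and feasibility of the method.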

10.
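Zhao Xiren (赵希人), Liu Sheng (刘胜). Acta Automatica Sinica (《自动化学报》), 1990, 16(2): 161-165
Based on the spectral decomposition theory of stationary stochastic processes, this paper derives a mathematical model for simulating stationary stochastic processes with non-rational spectra and gives a formula for the simulation error. Its application to ocean wave simulation is then described.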

11.
Generating pseudorandom objects is one of the key issues in computer simulation of complex systems. Most earlier simulation systems include procedures for generating independent and identically distributed random variables or some classical random processes, such as white noise. In this paper we propose a new approach to generating a wide range of processes characterized by their marginal distribution and autocorrelation function, which are significant in many cases. The proposed algorithm is based on the use of truncated distributions, which makes it simpler and more efficient than the previous one. The effectiveness of the proposed algorithm is verified by computer simulation of various real examples.

12.
We show the existence of average cost optimal stationary policies for Markov control processes with Borel state space and unbounded costs per stage, under a set of assumptions recently introduced by L.I. Sennott (1989) for control processes with countable state space and finite control sets.

13.
A genetic algorithm approach is used to solve a multi-objective discrete reliability optimization problem in a k dissimilar-unit non-repairable cold-standby redundant system. Each unit is composed of a number of independent components with generalized Erlang distributions arranged in a series–parallel configuration. Multiple component choices with different distribution parameters are available to replace each component of the system. The objective of the reliability optimization problem is to select the best components, from the set of available components, to be placed in the standby system in order to minimize the initial purchase cost of the system, maximize the system MTTF (mean time to failure), minimize the system VTTF (variance of time to failure), and maximize the system reliability at the mission time. Finally, we apply a genetic algorithm with double strings using continuous relaxation based on reference solution updating (GADSCRRSU) to solve this multi-objective problem, using a goal attainment formulation. The results are also compared against those of a discrete-time approximation technique to show the efficiency of the proposed GA approach.

14.
This paper studies the control policies of an M/G/1 queueing system with startup and an unreliable server, in which the length of the vacation period is controlled either by the number of arrivals during the idle period or by a timer. After all the customers in the queue have been served exhaustively, the server immediately takes a vacation and operates one of two policies: (i) the server reactivates as soon as the number of arrivals in the queue reaches a predetermined threshold N or the waiting time of the leading customer reaches T units; or (ii) the server reactivates as soon as the number of arrivals in the queue reaches a predetermined threshold N or T time units have elapsed since the end of the completion period. If the timer expires or the number of arrivals exceeds the threshold N, the server reactivates and requires a startup time before providing service until the system is empty. Furthermore, it is assumed that the server breaks down according to a Poisson process and that its repair time has a general distribution. We analyze the system characteristics for each scheme. The total expected cost function per unit time is developed to determine the optimal thresholds N and T at minimum cost.

15.
16.
17.
The firefly algorithm (FA) is a newer member of the bio-inspired meta-heuristics, originally proposed to find solutions to continuous optimization problems. The popularity of FA has increased recently due to its effectiveness in handling various optimization problems. To enhance the performance of FA even further, an adaptive FA (AFA) is proposed in this paper to solve mechanical design optimization problems; the adaptivity focuses on the search mechanism and on adaptive parameter settings. Moreover, chaotic maps are embedded into AFA for performance improvement. Experimental tests show that the proposed algorithm improves some of the best known results.
