Found 17 similar documents (search time: 78 ms)
1.
2.
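Online optimization algorithms for Markov control processes based on a single sample path  Total citations: 3, self-citations: 1, citations by others: 3
Building on the theory of Markov performance potentials, this paper studies performance optimization algorithms for Markov control processes. Unlike traditional computation-based methods, the algorithms estimate the gradient of the performance measure with respect to the policy parameters from the simulation of a single sample path, in order to find an optimal (or suboptimal) randomized stationary policy. Because suitable algorithm parameters can be chosen according to the characteristics of the actual system, the approach can meet the online optimization needs of different practical engineering systems. Finally, the paper briefly analyzes the convergence with probability 1 of these algorithms on an infinitely long sample path and gives a numerical example of a three-state controlled Markov process.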
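For orientation, the quantities named in this abstract are commonly written as follows in the performance-potential literature for a finite ergodic chain. The notation below (transition matrix $P_\theta$, cost vector $f_\theta$, stationary distribution $\pi_\theta$, potential vector $g_\theta$, average performance $\eta_\theta$, all-ones vector $e$) is an assumption made here and is not taken from the paper itself.

```latex
\begin{aligned}
\eta_\theta &= \pi_\theta f_\theta, \qquad \pi_\theta P_\theta = \pi_\theta, \quad \pi_\theta e = 1, \\
(I - P_\theta + e\,\pi_\theta)\, g_\theta &= f_\theta, \\
\frac{d\eta_\theta}{d\theta} &= \pi_\theta \left( \frac{dP_\theta}{d\theta}\, g_\theta + \frac{df_\theta}{d\theta} \right).
\end{aligned}
```

A sample-path algorithm of the kind described would replace $\pi_\theta$ and $g_\theta$ with estimates accumulated along one trajectory, rather than solving the linear system directly.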
3.
4.
5.
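In intelligent planning, finding a plan is an NP or even NP-complete problem. When the effects of actions are uncertain, as in planning problems for Markov decision processes, solving becomes harder still. Existing planning algorithms for Markov decision processes usually describe the actual outcome of an action with a single whole-state node, trying to sidestep the internal complexity of states, whereas many real-world actions produce multiple propositional effects that correspond to multiple proposition nodes. To handle this, the paper introduces the concepts of image actions, image path nodes, and image planning graphs, and on that basis proposes an ant colony planning algorithm for Markov decision processes. It also proves that the solutions obtained by the algorithm have a reliability no lower than a certain probability, even in an uncertain execution environment.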
6.
7.
8.
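Stochastic switching model and policy optimization for dynamic power management  Total citations: 2, self-citations: 0, citations by others: 2
This paper proposes a policy optimization method for dynamic power management based on continuous-time Markov decision processes. By building a stochastic switching model of the dynamic power management system, the power management problem is converted into a constrained policy optimization problem, and a policy gradient optimization algorithm based on vector composition is given. The stochastic switching model describes the dynamic power management system accurately, and the policy optimization algorithm is simple and effective: it can be computed offline and is also suitable for online optimization. Simulation experiments verify the effectiveness of the method.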
9.
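Using the theory of fuzzy mathematics, this paper studies interval fuzzy stable sets of dynamic cooperative games with transferable utility. First, a Markov stochastic process is used to describe the structural transitions of dynamic cooperative coalitions. For the case where the payoff function is a triangular fuzzy number, the cut-set value regions of the cooperative game are constructed at different confidence levels α, and the interval fuzzy stable sets at different time points are then computed by combining these regions with the state transition matrix of the dynamic coalition. Because coalition members must allocate the actual coalition payoff once cooperation ends, the constructed interval fuzzy stable sets are used to give feasible intervals of potential values for payoff allocation. Finally, an example illustrates the effectiveness and feasibility of the method.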
10.
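Based on the spectral decomposition theory of stationary random processes, this paper derives a mathematical model for simulating stationary random processes with non-rational spectra, gives a formula for the simulation error, and finally describes its application to ocean wave simulation.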
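As a rough illustration of spectrum-based simulation, the sketch below uses the textbook spectral-representation approach: a superposition of cosines with random phases whose amplitudes are set by the spectral density. This is not necessarily the model derived in the paper. The function name, the discretization, and the example spectrum (a Pierson-Moskowitz-like shape with made-up constants) are all assumptions introduced here.

```python
import math
import random


def simulate_stationary_process(spectrum, w_max, n_freq, times, rng):
    """Sum-of-cosines sketch for a zero-mean stationary process.

    spectrum: one-sided spectral density S(w), evaluated for 0 < w <= w_max
    w_max:    cutoff frequency (the spectrum above it is ignored)
    n_freq:   number of equal-width frequency bands
    times:    time instants at which to evaluate the sample path
    rng:      a random.Random instance supplying the phases
    """
    dw = w_max / n_freq
    # Midpoint frequency of each band (strictly positive).
    freqs = [(k + 0.5) * dw for k in range(n_freq)]
    # Each harmonic has variance amp**2 / 2 = S(w_k) * dw.
    amps = [math.sqrt(2.0 * spectrum(w) * dw) for w in freqs]
    # Independent phases, uniform on [0, 2*pi).
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in freqs]
    return [
        sum(a * math.cos(w * t + p) for a, w, p in zip(amps, freqs, phases))
        for t in times
    ]


def example_wave_spectrum(w):
    """Hypothetical non-rational spectrum; the constants are illustrative only."""
    return (0.8 / w ** 5) * math.exp(-0.7 / w ** 4)


path = simulate_stationary_process(
    example_wave_spectrum, w_max=3.0, n_freq=200,
    times=[0.1 * i for i in range(50)], rng=random.Random(0),
)
```

In this construction the variance of the simulated process equals the sum of S(w_k)·Δw over the bands, so the truncation at `w_max` and the band width are the two obvious sources of simulation error.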
11.
Generating pseudo-random objects is one of the key issues in computer simulation of complex systems. Most earlier simulation systems include procedures for generating independent and identically distributed random variables or some classical random processes, such as white noise. In this paper we propose a new approach to generating a wide range of processes characterized by their marginal distribution and autocorrelation function, which are significant in many cases. The proposed algorithm is based on the use of a truncated distribution, which makes it simpler and more efficient than the previous one. The effectiveness of the proposed algorithm is verified using computer simulation of various real examples.
12.
We show the existence of average cost optimal stationary policies for Markov control processes with Borel state space and unbounded costs per stage, under a set of assumptions recently introduced by L.I. Sennott (1989) for control processes with countable state space and finite control sets.
13.
Amir Azaron, Cahit Perkgoz, Hideki Katagiri, Kosuke Kato, Masatoshi Sakawa 《Computers & Operations Research》2009
A genetic algorithm approach is used to solve a multi-objective discrete reliability optimization problem in a k dissimilar-unit non-repairable cold-standby redundant system. Each unit is composed of a number of independent components with generalized Erlang distributions arranged in a series-parallel configuration. For each component of the system, multiple component choices with different distribution parameters are available as replacements. The objective of the reliability optimization problem is to select the best components from the set of available components to be placed in the standby system, in order to minimize the initial purchase cost of the system, maximize the system MTTF (mean time to failure), minimize the system VTTF (variance of time to failure), and maximize the system reliability at the mission time. Finally, we apply a genetic algorithm with double strings using continuous relaxation based on reference solution updating (GADSCRRSU) to solve this multi-objective problem, using a goal attainment formulation. The results are also compared against those of a discrete-time approximation technique to show the efficiency of the proposed GA approach.
14.
This paper studies the control policies of an M/G/1 queueing system with a startup and an unreliable server, in which the length of the vacation period is controlled either by the number of arrivals during the idle period or by a timer. After all the customers in the queue are served exhaustively, the server immediately takes a vacation and operates under one of two policies: (i) the server reactivates as soon as the number of arrivals in the queue reaches a predetermined threshold N or the waiting time of the leading customer reaches T units; and (ii) the server reactivates as soon as the number of arrivals in the queue reaches a predetermined threshold N or T time units have elapsed since the end of the completion period. If the timer expires or the number of arrivals reaches the threshold N, the server reactivates and requires a startup time before providing service until the system is empty. Furthermore, it is assumed that the server breaks down according to a Poisson process and that its repair time has a general distribution. We analyze the system characteristics for each scheme. The total expected cost function per unit time is developed to determine the optimal thresholds N and T that minimize cost.
15.
16.
17.
The firefly algorithm (FA) is a newer member of the bio-inspired meta-heuristics, originally proposed to find solutions to continuous optimization problems. The popularity of FA has increased recently due to its effectiveness in handling various optimization problems. To enhance the performance of FA even further, an adaptive FA (AFA) is proposed in this paper to solve mechanical design optimization problems; the adaptivity is focused on the search mechanism and adaptive parameter settings. Moreover, chaotic maps are embedded into AFA for performance improvement. Experimental tests show that the proposed algorithm improves some of the best known results.
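For readers unfamiliar with the baseline, the following is a minimal sketch of the standard (non-adaptive) firefly algorithm for minimization. It does not implement the adaptive search mechanism, adaptive parameters, or chaotic maps proposed in the paper. The function name, the default parameter values, the bound clipping, and the sphere test function are assumptions chosen for illustration.

```python
import math
import random


def firefly_minimize(f, dim, bounds, n_fireflies=15, n_iter=100,
                     alpha=0.2, beta0=1.0, gamma=1.0, seed=0):
    """Basic firefly algorithm: dimmer fireflies move toward brighter ones.

    f:      objective to minimize (lower cost = brighter firefly)
    bounds: (lo, hi) box applied to every coordinate
    alpha:  scale of the random step; beta0, gamma: attractiveness parameters
    """
    rng = random.Random(seed)
    lo, hi = bounds
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_fireflies)]
    cost = [f(x) for x in pop]
    best_idx = min(range(n_fireflies), key=cost.__getitem__)
    best_x, best_cost = list(pop[best_idx]), cost[best_idx]

    for _ in range(n_iter):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if cost[j] < cost[i]:  # firefly j is brighter than i
                    r2 = sum((a - b) ** 2 for a, b in zip(pop[i], pop[j]))
                    # Attractiveness decays with squared distance.
                    beta = beta0 * math.exp(-gamma * r2)
                    # Move i toward j plus a small random step, clipped to the box.
                    pop[i] = [
                        min(hi, max(lo, a + beta * (b - a) + alpha * (rng.random() - 0.5)))
                        for a, b in zip(pop[i], pop[j])
                    ]
                    cost[i] = f(pop[i])
                    # Remember the best point ever visited.
                    if cost[i] < best_cost:
                        best_x, best_cost = list(pop[i]), cost[i]
    return best_x, best_cost


def sphere(x):
    """Simple convex test function with its minimum at the origin."""
    return sum(v * v for v in x)


best_x, best_cost = firefly_minimize(sphere, dim=2, bounds=(-5.0, 5.0), seed=1)
```

Keeping `alpha`, `beta0`, and `gamma` fixed, as done here, is exactly the kind of static parameter setting that adaptive variants aim to replace.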