Similar literature
Found 18 similar documents (search time: 125 ms)
1.
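Iterative optimization algorithms for Markov control processes on compact action sets   Total citations: 5, self-citations: 0, citations by others: 5
This paper studies optimization algorithms for a class of continuous-time Markov control processes (CTMCPs) on compact action sets under the average-cost performance criterion. From the performance-potential formula for CTMCPs and the average-cost optimality equation, a policy iteration algorithm and a value iteration algorithm are derived for computing optimal or suboptimal stationary control policies. Convergence of both algorithms is proved without assuming that the iteration operator is an sp-contraction. Finally, an example of a controlled queueing network illustrates the advantages of the approach.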

2.
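Average-cost optimal policies for Markov control processes based on performance potentials   Total citations: 2, self-citations: 1, citations by others: 2
This paper studies the average-cost optimal control decision problem for a class of discrete-time Markov control processes. Using the basic properties of Markov performance potentials, and under very general assumptions, the optimality equation for the infinite-horizon average-cost model on compact action sets is derived directly, together with an existence theorem for its solution. An iterative algorithm for computing the optimal stationary control policy is proposed, and its convergence is discussed. Finally, a worked example illustrates the application of the algorithm.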
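As a rough illustration of the kind of iteration this abstract describes, the following is a minimal sketch of average-cost policy iteration driven by performance potentials. It is not the paper's algorithm: it assumes a finite state space, a finite action set (the paper treats compact action sets), and that every stationary policy induces an ergodic chain. The names `evaluate` and `policy_iteration`, and the array layout of `P` and `f`, are hypothetical choices made for this example.

```python
import numpy as np

def evaluate(P_d, f_d):
    """Return (eta, g) for a fixed policy, assuming the chain P_d is ergodic.

    g is a performance potential solving the Poisson equation
    (I - P_d) g + eta * 1 = f_d, with the normalization pi @ g = eta.
    """
    n = P_d.shape[0]
    I, E, ones = np.eye(n), np.ones((n, n)), np.ones(n)
    # Stationary distribution: pi (I - P_d + E) = 1^T
    pi = np.linalg.solve((I - P_d + E).T, ones)
    eta = pi @ f_d
    g = np.linalg.solve(I - P_d + np.outer(ones, pi), f_d)
    return eta, g

def policy_iteration(P, f, max_iter=100, tol=1e-10):
    """P has shape (A, S, S); f has shape (S, A) and holds one-step costs."""
    S, A = f.shape
    states = np.arange(S)
    policy = np.zeros(S, dtype=int)
    for _ in range(max_iter):
        P_d = P[policy, states, :]       # row s is P[policy[s], s, :]
        f_d = f[states, policy]
        eta, g = evaluate(P_d, f_d)
        # q[s, a] = f(s, a) + sum_t P[a, s, t] * g[t]
        q = f + np.einsum('ast,t->sa', P, g)
        improved = policy.copy()
        # Switch actions only on strict improvement, to avoid cycling on ties
        better = q.min(axis=1) < q[states, policy] - tol
        improved[better] = q.argmin(axis=1)[better]
        if np.array_equal(improved, policy):
            break
        policy = improved
    return policy, eta, g

# Toy problem (invented for illustration): state 0 costs 1, state 1 costs 3.
# Action 0 steers toward state 1, action 1 steers toward state 0.
P = np.array([[[0.1, 0.9], [0.1, 0.9]],
              [[0.9, 0.1], [0.9, 0.1]]])
f = np.array([[1.0, 1.0], [3.0, 3.0]])
policy, eta, g = policy_iteration(P, f)
```

In this toy problem the iteration settles on action 1 in both states, since that action keeps the chain mostly in the cheaper state.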

3.
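Optimality equations based on performance potentials for a class of controlled closed queueing networks   Total citations: 1, self-citations: 0, citations by others: 1
This paper studies performance optimization for a class of controlled closed queueing network systems. Two basic concepts are introduced, the discounted-cost α-potential and the average-cost potential, and a relation between them is discussed. Under general assumptions, the basic properties of performance potentials are used to establish directly the optimality equation for the infinite-horizon average-cost model, and the existence of an optimal solution on a compact set is proved. Finally, an iterative policy optimization algorithm is given, and a practical numerical example illustrates its effectiveness.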

4.
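林小峰, 张衡, 宋绍剑, 宋春宁. 《控制与决策》 (Control and Decision), 2011, 26(10): 1586-1590
To obtain optimal control policies for nonlinear discrete-time systems, an adaptive dynamic programming method with an error bound is proposed, based on the principle of adaptive dynamic programming. For an arbitrary state, a finite-length control sequence approximates the optimal control sequence so that the gap between the resulting performance index and the optimal performance index stays within a small range. Numerical experiments on a nonlinear discrete-time system verify the effectiveness of the algorithm, which obtains a near-optimal control policy at low computational cost.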

5.
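程连贞, 刘凯, 张军. 《计算机学报》 (Chinese Journal of Computers), 2007, 30(7): 1064-1073
To address the waste of channel resources by existing typical source-based multicast algorithms in low-earth-orbit satellite networks, a set of single-core shared-tree multicast algorithms is proposed: the core-cluster merging shared tree (CCST) algorithm and the weighted CCST (w-CCST) algorithm. CCST comprises a dynamic approximate center (DAC) core selection method and a core-cluster merging method for building multicast paths. DAC adaptively selects the optimal core according to how group members are distributed in the network. The core-cluster merging method takes the core node as the initial core cluster and extends it step by step along shortest paths between the core cluster and the remaining group members until the whole multicast tree is built. This minimizes the tree cost and greatly improves the network's transmission bandwidth utilization and transmission efficiency. In w-CCST, a weighting factor can be adjusted to raise the tree cost moderately and lower the end-to-end propagation delay, meeting the needs of real-time multicast services with strict end-to-end delay requirements. Finally, the algorithms are compared with others by simulation. The results show that the average tree cost of CCST multicast trees is significantly lower than that of other multicast trees, while the average end-to-end propagation delay is slightly higher. w-CCST achieves better average end-to-end propagation delay than CCST at a slightly worse tree cost, which shows that the weighting factor allows a trade-off between tree cost and end-to-end propagation delay.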

6.
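Most processors use a multi-level inclusive cache hierarchy, and existing last-level cache (LLC) block replacement algorithms incur a large performance overhead. To address this, an optimized LLC block replacement algorithm, PLI, is proposed. When selecting a block to evict, it takes into account the block's access frequency in the upper-level cache, so that the best LLC replacement block is chosen at small cost. Evaluation on a cycle-accurate simulator shows that the algorithm improves performance by 7% on average over the original algorithm.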

7.
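A class of semi-Markov decision problems is first discussed under the discounted-cost and the average-cost performance criteria. Based on the performance-potential method, the optimality equations satisfied by optimal stationary policies are derived. The relation between the two models is then discussed, showing that the results for the average model can be obtained as the limit of the corresponding results for the discounted model as the discount factor tends to zero.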

8.
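To address the optimization of workflow task scheduling, a genetic algorithm for cloud workflow task scheduling is proposed. A genetic scheduling model is built to find solutions that are jointly optimal in workflow execution time and execution cost. For individual encoding, a two-dimensional permutation encoding is adopted, which better represents the execution order among workflow tasks. A balanced fitness function is designed that takes both task execution cost and earliest finish time into account. To enrich population diversity, three crossover operations and two mutation operations are introduced to generate new individuals, which raises the probability of finding the optimal solution. The algorithm is analyzed on several performance metrics through numerical simulation. The results show that it balances execution cost and scheduling efficiency better and outperforms comparable algorithms.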

9.
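A path-node-driven low-cost shortest path tree algorithm   Total citations: 2, self-citations: 0, citations by others: 2
Dijkstra's algorithm is an excellent algorithm for finding shortest paths, and it also produces a shortest path tree (SPT). It is widely used in network computation and optimization. To optimize the cost of the shortest path tree, the idea of path-node driving is proposed, and on this basis the path-node-driven least-cost shortest path tree algorithm (LCSPT) is designed. With LCSPT, a node being computed can maximize the paths it shares with the current shortest path tree, which further optimizes the cost of the SPT and yields a high-performance SPT. As an important part of the work, the correctness of the algorithm is proved by mathematical induction. The cost performance of LCSPT is analyzed theoretically, including how it attains the minimum cost compared with similar algorithms, and its time and space complexity are also analyzed. Finally, three simulation experiments verify the correctness of the algorithm in building an SPT and the least-cost property of the resulting tree.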
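For orientation, here is a minimal sketch of the baseline the abstract starts from: plain Dijkstra returning both distances and the parent pointers that define a shortest path tree. This is not the LCSPT algorithm, and it makes no attempt to reduce tree cost. The function name `dijkstra_spt`, the adjacency-list format, and the example graph are assumptions made for this illustration.

```python
import heapq

def dijkstra_spt(graph, source):
    """Plain Dijkstra on a dict {node: [(neighbor, weight), ...]}.

    Returns (dist, parent); the parent pointers define a shortest path tree.
    Edge weights are assumed to be non-negative.
    """
    dist = {source: 0}
    parent = {source: None}
    heap = [(0, source)]
    settled = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in settled:
            continue  # stale heap entry
        settled.add(u)
        for v, w in graph.get(u, []):
            nd = d + w
            if v not in dist or nd < dist[v]:
                dist[v] = nd
                parent[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, parent

def tree_cost(graph, parent):
    """Sum of the weights of the tree edges (parent[v], v)."""
    weight = {(u, v): w for u, edges in graph.items() for v, w in edges}
    return sum(weight[(p, v)] for v, p in parent.items() if p is not None)

# Small directed example graph, invented for illustration
graph = {
    's': [('a', 1), ('b', 4)],
    'a': [('b', 2), ('c', 5)],
    'b': [('c', 1)],
    'c': [],
}
dist, parent = dijkstra_spt(graph, 's')
cost = tree_cost(graph, parent)
```

When several shortest paths to a node exist, plain Dijkstra keeps whichever it finds first; the abstract's point is that choosing among such ties so as to share more tree edges can lower the total tree cost.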

10.
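A mean-based cloud adaptive particle swarm optimization algorithm   Total citations: 1, self-citations: 1, citations by others: 0
Based on cloud theory, this paper divides the particle swarm into three sub-populations and uses a cloud method to modify the inertia weight in particle swarm optimization. It also modifies the "cognitive part" and the "social part" of the velocity update formula by introducing the concept of a "mean", giving a mean-based cloud adaptive particle swarm optimization algorithm. The main advantage of the method is that it overcomes a weakness of particle swarm optimization in late iterations: when the fitness values at some particles' personal bests differ markedly from the fitness value at the global best, the swarm fails to converge to the optimal solution. Numerical experiments show that the algorithm finds the optimal solution in fewer iterations and with a shorter average running time, which lowers the algorithm's average time cost.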

11.
The paper is concerned with robust control problems for exponential controlled closed queuing networks (CCQNs) under uncertain routing probabilities. As the rows of some parameter matrices such as infinitesimal generators may be dependent, we first transform the objective vector under discounted-cost criteria into a weighted-average cost. Through the solution to the Poisson equation, i.e., Markov performance potentials, we then treat the discounted-cost and average-cost problems in a unified way, and derive the gradient formula of the new objective function with respect to the routing probabilities. Some solution techniques for searching for the optimal robust control policy are discussed. Finally, a numerical example is presented and analyzed.

12.
Robust control policies for closed queueing networks with uncertain routing probabilities   Total citations: 1, self-citations: 0, citations by others: 1
The paper is concerned with robust control problems for exponential controlled closed queuing networks (CCQNs) under uncertain routing probabilities. As the rows of some parameter matrices such as infinitesimal generators may be dependent, we first transform the objective vector under discounted-cost criteria into a weighted-average cost. Through the solution to the Poisson equation, i.e., Markov performance potentials, we then treat the discounted-cost and average-cost problems in a unified way, and derive the gradient formula of the new objective function with respect to the routing probabilities. Some solution techniques for searching for the optimal robust control policy are discussed. Finally, a numerical example is presented and analyzed.

13.
Efficient use of available resources is a substantial requirement for the successful design of networked control systems. Recent results indicate major benefits of event-based control compared to conventional designs when resources such as communication, energy, and computation are scarce. This paper considers multiple entities of heterogeneous control systems whose feedback loops are coupled through a common communication medium. The design of the decentralized event-triggering control system is formulated as an average-cost problem that aims at the minimization of a social cost criterion. A state aggregation technique is used to develop a bi-level design method, which divides into a local average-cost problem within every subsystem and a global resource allocation problem assigning optimal transmission rates to every subsystem. Stability conditions are derived that guarantee stochastic stability of the aggregate system. Under these conditions, it is shown that the design approach is asymptotically optimal as the number of subsystems increases.

14.
Reinforcement learning (RL) is concerned with the identification of optimal controls in Markov decision processes (MDPs) where no explicit model of the transition probabilities is available. We propose a class of RL algorithms which always produces stable estimates of the value function. In detail, we use "local averaging" methods to construct an approximate dynamic programming (ADP) algorithm. Nearest-neighbor regression, grid-based approximations, and trees can all be used as the basis of this approximation. We provide a thorough theoretical analysis of this approach and we demonstrate that ADP converges to a unique approximation in continuous-state average-cost MDPs. In addition, we prove that our method is consistent in the sense that an optimal approximate strategy is identified asymptotically. With regard to a practical implementation, we suggest a reduction of ADP to standard dynamic programming in an artificial finite-state MDP.

15.
We propose a unified framework for Markov decision problems and performance sensitivity analysis for multichain Markov processes with both discounted and average-cost performance criteria. With the fundamental concept of performance potentials, we derive both performance-gradient and performance-difference formulas, which play the central role in performance optimization. The standard policy iteration algorithms for both discounted- and average-reward MDPs can be established using the performance-difference formulas in a simple and intuitive way; and the performance-gradient formulas together with stochastic approximation may lead to new optimization schemes. This sensitivity-based point of view of performance optimization provides some insights that link perturbation analysis, Markov decision processes, and reinforcement learning together. The research is an extension of the previous work on ergodic Markov chains (Cao, Automatica 36 (2000) 771).
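To make the two kinds of formula concrete, the block below states them for the simplest setting only: an ergodic discrete-time chain under the average-cost criterion. The notation is assumed here and may differ from the paper's. $P$ and $P'$ are transition matrices with stationary distributions $\pi$ and $\pi'$, $f$ and $f'$ are cost vectors, $e$ is the all-ones vector, $\eta = \pi f$, and $g$ is a potential of the chain $(P, f)$. The multichain and discounted cases treated in the paper are not covered.

```latex
% Poisson equation defining the performance potential g of (P, f)
(I - P)\, g + \eta\, e = f, \qquad \eta = \pi f .

% Performance-difference formula: left-multiply the Poisson equation
% by \pi' and use \pi' P' = \pi' and \pi' e = 1
\eta' - \eta = \pi' \bigl[ (P' - P)\, g + (f' - f) \bigr] .

% Performance-gradient formula along the line
% P_\delta = P + \delta (P' - P), \quad f_\delta = f + \delta (f' - f)
\left. \frac{d \eta_\delta}{d \delta} \right|_{\delta = 0}
  = \pi \bigl[ (P' - P)\, g + (f' - f) \bigr] .
```

Because $\pi'$ is componentwise non-negative, choosing $P'$ and $f'$ so that $(P' - P)\, g + (f' - f) \le 0$ componentwise guarantees $\eta' \le \eta$, which is the policy-improvement step of policy iteration for cost minimization.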

16.

In this paper, we present a novel approach to attain a fourth-order approximate solution of 2D quasi-linear elliptic partial differential equations on an irrational domain. In this approach, we use nine grid points with dissimilar mesh in a single compact cell. We also discuss appropriate fourth-order numerical methods for the solution of the normal derivatives on a dissimilar mesh. The method has been extended to solve systems of quasi-linear elliptic equations. The convergence analysis is discussed to validate the proposed numerical approximation. As engineering applications, we solve various test problems, such as the linear convection–diffusion equation, Burgers' equation, the Poisson equation in singular form, NS equations, bi- and tri-harmonic equations and quasi-linear elliptic equations, to show the efficiency and accuracy of the proposed methods. A comprehensive comparative computational experiment shows the accuracy, reliability and credibility of the proposed computational approach.


17.
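Variational design of surfaces has shown clear advantages in constructing high-quality surfaces. By taking the variation of the third-order energy functional proposed by Greiner, the corresponding Euler-Lagrange equation is obtained and a new sixth-order gradient flow is constructed. The geometric flow is solved numerically by a finite-difference-like method and applied to various problems in geometric design, including surface processing, N-sided hole filling, and surface restoration. Experimental results show that the constructed geometric flow does produce high-quality surfaces.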

18.
In this paper, we study the simulation of the nonlinear Schrödinger equation in one, two and three dimensions. The proposed method is based on a time-splitting method that decomposes the original problem into two parts, a linear equation and a nonlinear equation. The linear equation in one dimension is approximated with the Chebyshev pseudo-spectral collocation method in the space variable and the Crank–Nicolson method in time, while the nonlinear equation with constant coefficients can be solved exactly. As the goal of the present paper is to study the nonlinear Schrödinger equation on a large finite domain, we propose a domain decomposition method. Compared with a single-domain approach, multi-domain methods produce a sparse differentiation matrix that requires less memory and fewer computations. In this study, we choose an overlapping multi-domain scheme. By applying the alternating direction implicit technique, we extend this efficient method to solve the nonlinear Schrödinger equation in two and three dimensions, while at each time step it only needs to solve a sequence of linear partial differential equations in one dimension. Several examples for one- and multi-dimensional nonlinear Schrödinger equations are presented to demonstrate the high accuracy and capability of the proposed method. Some numerical experiments are reported which show that this scheme preserves the conservation laws of charge and energy.
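The statement that the nonlinear sub-problem can be solved exactly is worth a short worked derivation. The abstract does not give the equation's normalization, so the form below, with a constant real coefficient $\beta$ and time step $\tau$, is an assumption made for illustration.

```latex
% Assumed model form (the abstract does not state the normalization)
i\, \partial_t u = -\tfrac{1}{2} \Delta u + \beta\, |u|^2 u .

% Nonlinear sub-problem of the splitting
i\, \partial_t u = \beta\, |u|^2 u .

% The modulus is constant in time along this sub-problem:
\partial_t |u|^2 = 2\, \mathrm{Re}\bigl( \bar{u}\, \partial_t u \bigr)
                 = 2\, \mathrm{Re}\bigl( -i \beta\, |u|^4 \bigr) = 0 ,

% so the sub-problem is a linear ODE at each point, with exact solution
u(x, t + \tau) = \exp\bigl( -i \beta\, |u(x, t)|^2\, \tau \bigr)\, u(x, t) .
```

Only the linear sub-problem then needs a spatial discretization, which is where the abstract's Chebyshev collocation and Crank–Nicolson steps come in.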
