Similar literature
17 similar documents found.
1.
Wang Tao, Zhang Huaguang. Control and Decision (《控制与决策》), 2015, 30(9): 1674-1678

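For stochastic linear continuous-time systems with partially unknown model parameters, the infinite-horizon stochastic linear-quadratic (LQ) optimal control problem is solved by a policy iteration algorithm. Solving the stochastic LQ optimal control problem is equivalent to solving the stochastic algebraic Riccati equation (SARE). First, Itô's formula is used to convert the stochastic differential equation into a deterministic equation, and the policy iteration algorithm yields a sequence of approximate solutions to the SARE. It is then proved that this sequence converges to the solution of the SARE and that the system remains mean-square stabilizable throughout the iteration. Finally, a simulation example demonstrates the feasibility of the policy iteration algorithm.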
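The policy iteration idea in the abstract above can be illustrated on a scalar example. The following is a minimal sketch, not the authors' algorithm: it assumes a fully known scalar system dx = (a x + b u) dt + (c x + d u) dw with cost weights q and r, whereas the paper treats partially unknown parameters. The function names and the system form are assumptions made for illustration. Each pass evaluates the current gain k by solving a scalar Lyapunov-type equation for p, then improves the gain from p.

```python
def policy_iteration_scalar_sare(a, b, c, d, q, r, k0, max_iters=100, tol=1e-12):
    """Policy iteration for the scalar SARE (illustrative, model-based sketch).

    Assumed system: dx = (a*x + b*u) dt + (c*x + d*u) dw, with feedback u = k*x.
    k0 must be mean-square stabilizing: 2*(a + b*k0) + (c + d*k0)**2 < 0.
    """
    k, p = k0, None
    for _ in range(max_iters):
        # Growth rate of E[x^2] under the current gain; must stay negative.
        drift = 2.0 * (a + b * k) + (c + d * k) ** 2
        if drift >= 0.0:
            raise ValueError("gain k is not mean-square stabilizing")
        # Policy evaluation: solve drift * p + q + r * k^2 = 0 for p.
        p_new = (q + r * k * k) / (-drift)
        # Policy improvement: minimize the Hamiltonian with respect to the gain.
        k = -(b * p_new + d * p_new * c) / (r + d * d * p_new)
        if p is not None and abs(p_new - p) < tol:
            p = p_new
            break
        p = p_new
    return p, k


def sare_residual(a, b, c, d, q, r, p):
    """Left-hand side of the scalar SARE; zero at the exact solution."""
    s = b * p + d * p * c
    return 2.0 * a * p + c * c * p + q - s * s / (r + d * d * p)


if __name__ == "__main__":
    p, k = policy_iteration_scalar_sare(1.0, 1.0, 0.5, 0.0, 1.0, 1.0, k0=-2.0)
    print("p =", p, "k =", k, "residual =", sare_residual(1.0, 1.0, 0.5, 0.0, 1.0, 1.0, p))
```

The stabilizing check inside the loop mirrors the abstract's claim that the system stays mean-square stabilizable during the iteration; in this sketch it is enforced as a runtime guard rather than proved.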


2.
This paper studies a continuous-time stochastic linear-quadratic (SLQ) optimal control problem over an infinite horizon. Combining Kronecker product theory with an existing policy iteration algorithm, a data-driven policy iteration algorithm is proposed to solve the problem. In contrast to most existing methods, which require full knowledge of the system coefficients, the proposed algorithm eliminates the requirement of three system matrices by utilizing data from a stochastic system. More specifically, the algorithm uses the collected data to iteratively approximate the optimal control and a solution of the stochastic algebraic Riccati equation (SARE) corresponding to the SLQ optimal control problem. A rigorous convergence analysis of the algorithm is given, and a simulation example is provided to illustrate its effectiveness and applicability.

3.
In this paper, we consider risk‐sensitive optimal control and differential games for stochastic differential delayed equations driven by Brownian motion. The problems are related to robust stochastic optimization with delay due to the inherent feature of the risk‐sensitive objective functional. For both problems, by using the logarithmic transformation of the associated risk‐neutral problem, the necessary and sufficient conditions for the risk‐sensitive maximum principle are obtained. We show that these conditions are characterized in terms of the variational inequality and the coupled anticipated backward stochastic differential equations (ABSDEs). The coupled ABSDEs consist of the first‐order adjoint equation and an additional scalar ABSDE, where the latter is induced due to the nonsmooth nonlinear transformation of the adjoint process of the associated risk‐neutral problem. For applications, we consider the risk‐sensitive linear‐quadratic control and game problems with delay, and the optimal consumption and production game, for which we obtain explicit optimal solutions.

4.
International Journal of Computer Mathematics, 2012, 89(16): 2259-2273
In this paper, a novel hybrid method combining two approaches, evolutionary algorithms and an iterative scheme, is presented for obtaining approximate solutions of optimal control problems governed by nonlinear Fredholm integral equations. By converting the problem to a discretized form, it is treated as a quasi-assignment problem, and an iterative method is then applied to find an approximate solution of the discretized integral equation. A convergence analysis of the proposed iterative method and its implementation on numerical examples are also given.

5.
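Research on the optimal path selection problem based on the ant colony algorithm   Total citations: 3 (self-citations: 0, citations by others: 3)
Xia Limin, Wang Hua, Dou Qian, Chen Ling. Computer Engineering and Design (《计算机工程与设计》), 2007, 28(16): 3957-3959, 4058
Selecting the optimal path in a traffic network is particularly important, and researchers in many countries have carried out extensive research and made improvements in this area. A new method for the optimal path selection problem based on the ant colony algorithm is proposed. Applying and modeling the ant colony algorithm in the path selection process exploits the algorithm's parallelism, positive feedback, and cooperation, so that individual ants cooperate with one another and find good solutions in a relatively short time. The research and simulation experiments show that the ant colony algorithm is a fairly robust new bionic algorithm with good prospects for development.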
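As a rough illustration of the ant colony approach described in the abstract above, the sketch below searches for a short path in a small directed graph. It is a generic textbook-style ant colony loop, not the authors' model: the graph encoding (a dict of dicts of edge lengths), the function name, and the parameters alpha, beta, and rho are all assumptions made for this example. Ants pick the next node with probability proportional to pheromone and inverse edge length, pheromone evaporates each round, and completed paths deposit pheromone in proportion to their quality, which gives the positive feedback mentioned in the abstract.

```python
import random


def ant_colony_shortest_path(graph, source, target, n_ants=10, n_iters=30,
                             alpha=1.0, beta=2.0, rho=0.5, seed=0):
    """Return (best_path, best_cost) found by a simple ant colony search.

    graph: dict mapping each node to a dict {neighbor: positive edge length}.
    Every node, including the target, must appear as a key.
    """
    rng = random.Random(seed)
    # Start with uniform pheromone on every directed edge.
    tau = {(u, v): 1.0 for u in graph for v in graph[u]}
    best_path, best_cost = None, float("inf")
    for _ in range(n_iters):
        tours = []
        for _ in range(n_ants):
            path, cost = [source], 0.0
            while path[-1] != target:
                u = path[-1]
                choices = [v for v in graph[u] if v not in path]
                if not choices:
                    break  # dead end: this ant abandons its walk
                # Attractiveness = pheromone^alpha * (1 / length)^beta.
                weights = [tau[(u, v)] ** alpha * (1.0 / graph[u][v]) ** beta
                           for v in choices]
                v = rng.choices(choices, weights=weights)[0]
                cost += graph[u][v]
                path.append(v)
            if path[-1] == target:
                tours.append((path, cost))
                if cost < best_cost:
                    best_path, best_cost = path, cost
        # Evaporation, then deposit along every completed path.
        for edge in tau:
            tau[edge] *= 1.0 - rho
        for path, cost in tours:
            for u, v in zip(path, path[1:]):
                tau[(u, v)] += 1.0 / cost
    return best_path, best_cost


if __name__ == "__main__":
    roads = {"A": {"B": 1, "C": 4}, "B": {"C": 1, "D": 5}, "C": {"D": 1}, "D": {}}
    print(ant_colony_shortest_path(roads, "A", "D"))
```

The search is randomized, so it returns the best path seen rather than a guaranteed optimum; on larger networks the parameter values would need tuning.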

6.
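To address the mining dilemma caused by block withholding attacks under the proof-of-work (PoW) consensus mechanism in blockchains, the game between mining pools is treated as an iterated prisoner's dilemma (IPD) model, and the policy gradient algorithm of deep reinforcement learning is used to study strategy selection in the IPD. Each mining pool is regarded as an independent agent, and the infiltration rate of miners is quantified as the action distribution in reinforcement learning. The policy network of the policy gradient algorithm predicts and optimizes each agent's actions so as to maximize the per-capita revenue of miners, and simulation experiments verify the effectiveness of the policy gradient algorithm. The experiments show that, in the early stage, the pools attack one another and the average revenue is below 1, which is the Nash equilibrium problem; after self-adjustment by the policy gradient algorithm, the pools shift from mutual attack to mutual cooperation, the infiltration rate of each pool tends to 0, and the per-capita revenue tends to 1. The results indicate that the policy gradient algorithm can resolve the Nash equilibrium problem of the mining dilemma and maximize the per-capita revenue of the mining pools.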

7.
We consider a parallel machine scheduling problem with the objective of minimizing two types of costs: the cost related to production operations and the cost related to due date performances. The former could be reduced by reasonable settings of the operational variables (e.g., the number of workers, the frequency of maintenance), while the latter could be reduced by appropriate scheduling of the production process. However, the optimization of both targets is significantly complicated by the influence of human factors that play a dominant role in real‐world manufacturing systems. To cope with this issue, a simulation‐based optimization framework is adopted in this paper for obtaining high‐quality robust solutions to the integrated scheduling problem. Meanwhile, differential evolution, a metaheuristic algorithm based on swarm intelligence, is applied for a systematic search of the huge solution space. Finally, numerical computations are conducted to verify the effectiveness of the proposed approach. Sensitivity analysis and practical implications are also presented.

8.
In this paper, we present a modified gradient‐based algorithm for solving extended Sylvester‐conjugate matrix equations. The idea comes from the gradient‐based method introduced in [14] and the relaxed gradient‐based algorithm proposed in [16]. The convergence of the algorithm is analyzed. We show that, under appropriate assumptions, the iterative solution converges to the exact solution for any initial value. A numerical example is given to illustrate the effectiveness of the proposed method and to compare its efficiency and accuracy with those of the methods presented in [14] and [16].

9.
Data communication service has an important influence on e-commerce. The key challenge for users is, ultimately, to select a suitable provider. In this article, however, we focus not on this aspect but on the viewpoint and decision-making of providers regarding order allocation and pricing policy when orders exceed service capacity. This is a multiple criteria decision-making problem involving criteria such as profit and cancellation ratio. Moreover, in realistic situations much of the input information is uncertain, so the problem becomes very complex in a real-life environment. In this situation, fuzzy set theory is the best tool for solving this problem. Our fuzzy model is formulated so as to simultaneously consider the imprecision of information, price-sensitive demand, stochastic variables, the cancellation fee and the general membership function. To solve the problem, a new fuzzy programming method is developed. Finally, a numerical example is presented to illustrate the proposed method. The results show that it is effective for determining a suitable order set and pricing policy for a provider of data communication service with different quality of service (QoS) levels.

10.
In this study, which is both analytical and numerical, we compute the effective information horizon (EIH), i.e., the minimal time interval over which future information is relevant for optimal control and for measuring the performance of a single part‐type production system. Optimal control modeling and process solving, which consider aspects of decision making with limited forecast, are exemplified by a single part‐type production system. Specifically, the analysis reveals practical situations in which there is both a performance loss as well as feasibility violation when only information expected within the planning horizon is considered. The analysis is carried out by developing a pseudo‐stochastic model. We follow previous “pseudo‐stochastic” approaches that solve stochastic control problems by using deterministic, optimal control methods. However, we model the expected influences of all future events, including those that are beyond the planning horizon, as encapsulated by their density functions and not only by their mean values.

11.
This paper discusses the impact of a trade credit policy on alleviating conflicts arising in a dual‐channel supply chain that includes one manufacturer and one value‐added retailer. We use the Stackelberg game to model the problem and characterize optimal pricing strategies for each supply chain partner, examining different circumstances in terms of retail price and trade credit contracts. When a consistent price strategy is applied in the dual channels under conditions of an exogenous credit period, trade credit can help both partners to achieve win‐win situations in the following circumstances: (1) when the retail channel's market share is small and the retailer's interest rate is high; or (2) when the retail channel's market share is large and the retailer's interest rate is lower than the manufacturer's. The study also concludes that when an inconsistent price strategy is applied, a trade credit contract can alleviate channel conflicts when the retailer's interest rate is higher than the manufacturer's. Otherwise, the partners may terminate cooperation. However, when the manufacturer has the power to determine and set the credit period, trade credit cannot alleviate channel conflicts under either the consistent or the inconsistent price scenario.

12.
13.
In this paper, a new class of two‐dimensional nonlinear variable‐order fractional optimal control problems (V‐OFOCPs) is introduced, where the variable‐order fractional derivative is defined in the Caputo sense. The general procedure for solving these systems is to expand the state variable and the control variable in terms of the Legendre cardinal functions in matrix form. Hence, we derive their operational matrix of derivative (OMD) and operational matrix of variable‐order fractional derivative (OMV‐OFD). More significantly, some properties of these basis functions are proved and exploited in our approach. Using these results, we expand the matrix form of the nonlinear performance index in terms of the Legendre cardinal functions and subsequently convert it to an algebraic equation. We emphasize that this is a valuable advantage of applying cardinal functions in approximation theory. Then, we use the OMD and the OMV‐OFD of the Legendre cardinal functions to transform the variable‐order fractional dynamical system into a system of algebraic equations. Next, the method of constrained extremum is applied to adjoin the constraint equations, including the given dynamical system and the initial‐boundary conditions, to the performance index by a set of undetermined Lagrange multipliers. Finally, the necessary conditions of optimality are derived as a system of nonlinear algebraic equations in the unknown coefficients of the state variable, the control variable and the Lagrange multipliers. The applicability and efficiency of the proposed approach are investigated through various types of test problems.

14.
In this paper, we are interested in the problem of optimal control where the system is given by a fully coupled forward‐backward stochastic differential equation with a risk‐sensitive performance functional. As a preliminary step, we use the risk‐neutral problem, which is an extension of the initial control system where the admissible controls are convex, and an optimal solution exists. Then, we study the necessary as well as sufficient optimality conditions for risk‐sensitive performance. At the end of this work, we illustrate our main result by giving an example that deals with an optimal portfolio choice problem in a financial market, specifically the model of control cash flow of a firm or project where, for instance, we can set the model of pricing and managing an insurance contract.

15.
In this paper, a design problem of low dimensional disturbance observer‐based control (DOBC) is considered for a class of nonlinear parabolic partial differential equation (PDE) systems with the spatio‐temporal disturbance modeled by an infinite dimensional exosystem of parabolic PDE. Motivated by the fact that the dominant structure of the parabolic PDE is usually characterized by a finite number of degrees of freedom, the modal decomposition method is initially applied to both the PDE system and the PDE exosystem to derive a low dimensional slow system and a low dimensional slow exosystem, which accurately capture the dominant dynamics of the PDE system and the PDE exosystem, respectively. Then, the definition of input‐to‐state stability for the PDE system with the spatio‐temporal disturbance is given to formulate the design objective. Subsequently, based on the derived slow system and slow exosystem, a low dimensional disturbance observer (DO) is constructed to estimate the state of the slow exosystem, and then a low dimensional DOBC is given to compensate the effect of the slow exosystem in order to reject approximately the spatio‐temporal disturbance. Then, a design method of low dimensional DOBC is developed in terms of linear matrix inequality to guarantee that not only the closed‐loop slow system is exponentially stable in the presence of the slow exosystem but also the closed‐loop PDE system is input‐to‐state stable in the presence of the spatio‐temporal disturbance. Finally, simulation results on the control of temperature profile for catalytic rod demonstrate the effectiveness of the proposed method. Copyright © 2015 John Wiley & Sons, Ltd.

16.
A new online iterative algorithm for solving the H∞ control problem of continuous‐time Markovian jumping linear systems is developed. For comparison, an available offline iterative algorithm for converging to the solution of the H∞ control problem is first proposed. Based on the offline iterative algorithm and a new online decoupling technique named the subsystems transformation method, a set of linear subsystems, which can be implemented in parallel, is obtained. By means of the adaptive dynamic programming technique, the two‐player zero‐sum game with the coupled game algebraic Riccati equation is then solved online. The convergence of the novel policy iteration algorithm is also established. Finally, simulation results illustrate the effectiveness and applicability of these two methods. Copyright © 2016 John Wiley & Sons, Ltd.

17.
This paper investigates the problem of distributed reliable H∞ consensus control for high‐order networked agent systems with actuator faults and switching undirected topologies. Lipschitz nonlinearities, several types of actuator faults, and exogenous disturbances are considered in the subsystems. The communication network of the multi‐agent system is assumed to switch among a finite set of connected graphs. By utilizing the relative state information of neighbors, a new distributed adaptive reliable consensus protocol is presented for actuator failure compensation in individual nodes. Note that the Lyapunov function for the error systems may not decrease because the communication network is time‐varying; as a result, the existing distributed adaptive control technique cannot be applied directly. To overcome this difficulty, the topology‐based average dwell time approach is introduced to deal with switching jumps. By applying the topology‐based average dwell time approach and Lyapunov theory, the distributed controller design condition is given in terms of LMIs. It is shown that the proposed scheme can guarantee that the reliable H∞ consensus problem is solvable in the presence of actuator faults and external disturbances. Finally, two numerical examples are given to illustrate the effectiveness of the proposed theoretical results. Copyright © 2015 John Wiley & Sons, Ltd.
