Similar literature
1.
We consider the use of quadratic approximate value functions for stochastic control problems with input‐affine dynamics and convex stage cost and constraints. Evaluating the approximate dynamic programming policy in such cases requires the solution of an explicit convex optimization problem, such as a quadratic program, which can be carried out efficiently. We describe a simple and general method for approximate value iteration that also relies on our ability to solve convex optimization problems, in this case, typically a semidefinite program. Although we have no theoretical guarantee on the performance attained using our method, we observe that very good performance can be obtained in practice. Copyright © 2012 John Wiley & Sons, Ltd.

2.
In the dynamic programming paradigm the value of an optimal solution is recursively defined in terms of optimal solutions to subproblems. Such dynamic programming definitions can be tricky and error‐prone to specify. This paper presents an elegant method based on tabled logic programming (TLP) that simplifies the specification of such dynamic programming solutions. Our method introduces a new mode declaration for tabled predicates. The arguments of each tabled predicate are divided into indexed and non‐indexed arguments so that tabled predicates can be regarded as functions: indexed arguments represent input values and non‐indexed arguments represent output values. The non‐indexed arguments in a tabled predicate can be further declared to be aggregated, for example, the minimum, so that while generating answers, the global table will dynamically maintain the smallest value for that argument. This mode‐declaration scheme, coupled with recursion, provides an easy‐to‐use method for dynamic programming: there is no need to define the value of an optimal solution recursively, as the definition of a general solution suffices. The optimal value as well as its corresponding concrete solution can be derived implicitly and automatically using tabled logic programming systems. Our experimental results show that mode declarations improve performance in solving dynamic programming problems on TLP systems. Copyright © 2007 John Wiley & Sons, Ltd.
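As a rough analogy only, the idea of a table keyed by the input (indexed) arguments that keeps the minimum of an output argument can be sketched in Python. Python has no tabling or mode declarations, so memoization plus an explicit `min` stands in for the aggregated table here; the graph `EDGES` and the function `shortest` are hypothetical names chosen for illustration and do not come from the paper.

```python
from functools import lru_cache

# Hypothetical weighted DAG: EDGES[u] lists (successor, edge weight) pairs.
EDGES = {
    "a": [("b", 1), ("c", 4)],
    "b": [("c", 2), ("d", 6)],
    "c": [("d", 3)],
    "d": [],
}


@lru_cache(maxsize=None)  # the cache plays the role of the answer table
def shortest(src, dst):
    """Minimum path cost from src to dst, or infinity if dst is unreachable.

    (src, dst) act as the indexed (input) arguments; the returned cost is
    the aggregated (output) argument, for which only the minimum is kept.
    """
    if src == dst:
        return 0
    # Every outgoing edge yields one candidate answer; keep the smallest.
    candidates = [w + shortest(nxt, dst) for nxt, w in EDGES[src]]
    return min(candidates, default=float("inf"))
```

Calling `shortest("a", "d")` explores each subproblem once. Unlike the tabled formulation described in the abstract, this sketch still writes the recursion over optimal values explicitly.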

3.
We present a technique for approximate robust dynamic programming that is suitable for linearly constrained polytopic systems with piecewise affine cost functions. The approximation method uses polyhedral representations of the cost-to-go function and feasible set, and can considerably reduce the computational burden compared to recently proposed methods for exact robust dynamic programming [Bemporad, A., Borrelli, F., & Morari, M. (2003). Min-max control of constrained uncertain discrete-time linear systems. IEEE Transactions on Automatic Control, 48(9), 1600-1606; Diehl, M., & Björnberg, J. (2004). Robust dynamic programming for min-max model predictive control of constrained uncertain systems. IEEE Transactions on Automatic Control, 49(12), 2253-2257]. We show how to apply the method to robust MPC, and give conditions that guarantee closed-loop stability. We finish by applying the method to a state constrained tutorial example, a parking car with uncertain mass.

4.
In this article, we develop a semi-definite programming-based receding horizon control approach to the problem of dynamic hedging of European basket call options under proportional transaction costs. The hedging problem for a European call option is formulated as a finite horizon constrained stochastic control problem. This allows us to develop a receding horizon control approach that repeatedly solves semi-definite programmes on-line in order to dynamically hedge. This approach is competitive with Black–Scholes delta hedging in the one-dimensional case with no transaction costs, but it also applies to multi-dimensional options such as basket options, and can include transaction costs. We illustrate its effectiveness through a numerical example involving an option on a basket of five stocks.

5.
An efficient numerical solution scheme entitled adaptive differential dynamic programming is developed in this paper for multiobjective optimal control problems with a general separable structure. For a multiobjective control problem with a general separable structure, the “optimal” weighting coefficients for various performance indices are time-varying as the system evolves along any noninferior trajectory. Recognizing this prominent feature in multiobjective control, the proposed adaptive differential dynamic programming methodology combines a search process to identify an optimal time-varying weighting sequence with the solution concept in the conventional differential dynamic programming. Convergence of the proposed adaptive differential dynamic programming methodology is addressed.

6.
In this article, a novel iterative algorithm named two-stage approximate dynamic programming (TSADP) is proposed to solve the nonlinear switched optimal control problem. At each iteration of TSADP, a multivariate optimal control problem is transformed into a number of univariate optimal control problems. It is shown that the value function at each iteration can be characterised pointwise by a set of smooth functions obtained recursively from TSADP, and the associated control policy, comprising both the continuous control law and the switching control law, is given explicitly in state-feedback form. Moreover, the convergence and optimality of TSADP are rigorously proven. To implement the algorithm efficiently, two kinds of neural networks, critic networks and action networks, are used to approximate the value function and the continuous control law, respectively. The value function is thus expressed pointwise by the weights of the critic networks. In addition, redundant weights are discarded at each iteration to curb the exponentially growing computational burden. Finally, a simulation example is provided to demonstrate the effectiveness of the algorithm.

7.
In this paper, we aim to solve the finite horizon optimal control problem for a class of discrete-time nonlinear systems with unfixed initial state using the adaptive dynamic programming (ADP) approach. A new ε-optimal control algorithm based on the iterative ADP approach is proposed, which makes the performance index function converge iteratively, in finite time, to within an error ε of the greatest lower bound of all performance indices. The optimal number of control steps can also be obtained by the proposed ε-optimal control algorithm when the initial state of the system is unfixed. To facilitate implementation of the ε-optimal control algorithm, neural networks are used to approximate the performance index function and to compute the optimal control policy, respectively. Finally, a simulation example is given to show the results of the proposed method.

8.
In this paper, we introduce new methods for finding functions that lower bound the value function of a stochastic control problem, using an iterated form of the Bellman inequality. Our method is based on solving linear or semidefinite programs, and produces both a bound on the optimal objective and a suboptimal policy that appears to work very well. These results extend and improve bounds obtained in a previous paper using a single Bellman inequality condition. We describe the methods in a general setting and show how they can be applied in specific cases including the finite state case, constrained linear quadratic control, switched affine control, and multi‐period portfolio investment. Copyright © 2014 John Wiley & Sons, Ltd.

9.
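This paper addresses the behavioral decision-making problem of intelligent vehicles and designs an integrated lateral-longitudinal receding-horizon optimization decision method based on mixed-integer programming. The method first represents the longitudinal speed as a non-integer variable and the desired lane as an integer control variable, yielding a simplified mixed-integer decision model for the intelligent vehicle. An integrated lateral-longitudinal receding-horizon decision method is then designed to decide the longitudinal speed and the lane-change action; recursive feasibility of the optimization problem is proved from the time-domain relationship between the system output and the nonlinear constraints, and the nonlinear mixed-integer programming problem is solved with a genetic algorithm. Co-simulation was carried out with the vehicle dynamics simulation software veDYNA and Simulink, and real-vehicle tests were conducted on a Hongqi E-HS3 intelligent vehicle. The results show that the proposed method can make behavioral decisions in overtaking, obstacle-avoidance, car-following, stopping, and curved-road scenarios.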

10.
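To study the real-time optimal control of traffic signals at an intersection, a multi-stage decision model is proposed whose objective is to minimize waiting time. The model exploits the structure imposed by minimum green and red times and compresses the model size through a suitable choice of system state and control variables; a forward dynamic programming algorithm is then proposed to obtain the optimal solution efficiently. Numerical experiments show that, compared with fixed-duration periodic control, the method reduces vehicle waiting time at the intersection, and compared with a solution method based on mixed-integer programming, it improves solution efficiency and meets the requirements of real-time control.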

11.
Optimal control of batch reactors by iterative dynamic programming
Four batch reactor systems are chosen to examine the viability of using iterative dynamic programming (IDP) for highly nonlinear systems encountered by chemical engineers. The first system is mildly nonlinear, and rapid convergence resulted with the use of only a single state grid point. The use of piecewise linear continuous control with 40 stages yielded better results than the use of 80 stages with piecewise constant control. The need for more than a single grid point for the other three systems led to a systematic study of the effects of the number of grid points, of the number of allowable values for control, and of the region contraction factor on the convergence of IDP. In every case the global optimum could be obtained with reasonable computational effort, and no difficulties were encountered even with systems exhibiting several local optima. The use of stages of different length allowed a refined solution to be obtained with a reasonably small number of stages in the last example.

12.
We model a multiperiod, single resource capacity reservation problem as a dynamic, stochastic, multiple knapsack problem with stochastic dynamic programming. As the state space grows exponentially in the number of knapsacks and the decision set grows exponentially in the number of order arrivals per period, the recursion is computationally intractable for large-scale problems, including those with long horizons. Our goal is to ensure optimal, or near optimal, decisions at time zero when maximizing the net present value of returns from accepted orders, but solving problems with short horizons introduces end-of-study effects which may prohibit finding good solutions at time zero. Thus, we propose an approximation approach which utilizes simulation and deterministic dynamic programming in order to allow for the solution of longer horizon problems and ensure good time zero decisions. Our computational results illustrate the effectiveness of the approximation scheme.

13.
This technical note concerns the predictive control of discrete‐time linear models subject to state, input and avoidance polyhedral constraints. Owing to the presence of avoidance constraints, the optimization associated with the predictive control law is non‐convex, even though the constraints themselves are convex. The inclusion of the avoidance constraints in the predictive control law is achieved by the use of a modified version of a mixed‐integer programming approach previously derived in the literature. The proposed modification consists of adding constraints to ensure that linear segments of the system trajectories between consecutive sampling times do not cross existing obstacles. This avoids the significant extra computation that would be incurred if the sampling time were reduced to prevent these crossings. Simulation results show that the inclusion of these additional constraints successfully prevents obstacle collisions that would otherwise occur. Copyright © 2008 John Wiley & Sons, Ltd.

14.
In this paper, a novel optimal control design scheme is proposed for continuous-time nonaffine nonlinear dynamic systems with unknown dynamics by adaptive dynamic programming (ADP). The proposed methodology iteratively updates the control policy online by using the state and input information without identifying the system dynamics. An ADP algorithm is developed, and can be applied to a general class of nonlinear control design problems. The convergence analysis for the designed control scheme is presented, along with rigorous stability analysis for the closed-loop system. The effectiveness of this new algorithm is illustrated by two simulation examples.

15.
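Qi Chi (齐驰), Wang Yi (王轶). Control and Decision (《控制与决策》), 2011, 26(7): 1091-1095
To address the strong nonlinearity and uncertainty of traffic flow models, a parameter identification algorithm for traffic flow models based on approximate dynamic programming is proposed. The algorithm is self-learning and adaptive and does not depend on an analytical model of the controlled plant. A rigorous theoretical derivation proves the convergence of this parameter identification scheme, and simulation results verify the effectiveness of the proposed algorithm.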

16.
We investigate the optimization of linear impulse systems with the reinforcement learning based adaptive dynamic programming (ADP) method. For linear impulse systems, the optimal objective function is shown to be a quadratic form of the pre-impulse states. The ADP method provides solutions that iteratively converge to the optimal objective function. If an initial guess of the pre-impulse objective function is selected as a quadratic form of the pre-impulse states, the objective function iteratively converges to the optimal one through ADP. Though direct use of the quadratic objective function of the states within the ADP method is theoretically possible, a numerical singularity problem may occur due to the matrix inversion therein when the system dimensionality increases. A neural network based ADP method can circumvent this problem. A neural network with polynomial activation functions is selected to approximate the pre-impulse objective function and trained iteratively using the ADP method to achieve optimal control. After successful training, the optimal impulse control can be derived. Simulations are presented for illustrative purposes.
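The convergence of a quadratic value-function guess under repeated Bellman updates can be illustrated on a much simpler problem than the one in the abstract. The sketch below assumes an ordinary scalar discrete-time linear-quadratic problem (dynamics x⁺ = a·x + b·u, stage cost q·x² + r·u²), not an impulse system, and it does not use a neural network; the function names and parameters are hypothetical.

```python
def riccati_value_iteration(a, b, q, r, p0=0.0, tol=1e-12, max_iter=10_000):
    """Value iteration on V_k(x) = p_k * x**2 for a scalar LQ problem.

    Assumed model (illustrative only): x_next = a*x + b*u with stage cost
    q*x**2 + r*u**2. Minimizing q*x**2 + r*u**2 + p*(a*x + b*u)**2 over u
    in closed form gives the scalar Riccati update used below.
    """
    p = p0
    for _ in range(max_iter):
        p_next = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
        if abs(p_next - p) < tol:
            return p_next
        p = p_next
    return p


def feedback_gain(a, b, r, p):
    """Gain k of the greedy policy u = -k * x for the value function p * x**2."""
    return a * b * p / (r + b * b * p)
```

With a = b = q = r = 1 the fixed point satisfies p² − p − 1 = 0, so the iteration started from p0 = 0 approaches (1 + √5)/2. In the scalar case the division is harmless; the abstract's concern is the matrix inverse that replaces it in higher dimensions.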

17.
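Predictive control of hybrid systems based on the mixed logical dynamical model
To address the hybrid characteristics of plants in process-industry control, a predictive control strategy based on the mixed logical dynamical (MLD) model is adopted. A modeling method for hybrid systems is given and its stability is analyzed. Simulation results show that MLD-based predictive control enables the hybrid system to track the setpoint while satisfying operating constraints, offering a new approach to the study of next-generation complex industrial control systems.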

18.
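Research on a dynamic programming zoning control method for elevators in up-peak traffic
Zong Qun (宗群), Luo Xinyu (罗欣宇), Wang Zhenshi (王振世). Control and Decision (《控制与决策》), 2002, 17(Z1): 781-784
This paper studies an elevator group control method that uses dynamic programming for dynamic zoning in up-peak mode. With passengers' riding time and waiting time as the objective function, the elevators in the group are optimally allocated in up-peak mode, with the aim of saving energy, speeding up service, and improving elevator utilization. A simulation comparison of several group control algorithms verifies the good performance of the dynamic-programming-based dynamic zoning method.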

19.
A new approach to the problem of matching two waveforms by dynamic warping based on consideration of the waveforms at several different resolutions is presented, and its advantages over single level dynamic programming are explained. An application to binocular stereo vision is discussed.
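For orientation, the single-level baseline that the abstract compares against can be sketched as textbook dynamic time warping. This is not the multiresolution method of the paper; the function name `dtw_distance` and the absolute-difference local cost are assumptions made for illustration.

```python
def dtw_distance(a, b):
    """Classic single-level dynamic time warping cost between two sequences.

    cost[i][j] holds the cheapest alignment of a[:i] with b[:j], using the
    absolute difference as the local cost (an assumed choice).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the best of: stretch a, stretch b, or advance both.
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]
```

The table has (n + 1) × (m + 1) entries, so the work grows with the product of the waveform lengths, which is the cost a coarse-to-fine scheme would aim to reduce.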

20.
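Hybrid electric vehicles are usually driven by two different power sources, an internal combustion engine and a battery. For a given power demand, how to split the output power between the two sources so that fuel consumption over the whole driving cycle is minimized is the problem that the control method of the hybrid powertrain must solve. This paper uses an improved dynamic programming method to optimize the output power of the two sources, and system simulation is carried out with PSATv6.1. The simulation results show that, compared with the on/off strategy, the method can effectively reduce the series hybrid electric vehicle's...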
