Similar Articles
20 similar articles found.
1.
The limit case of high-frequency optimal processes is analyzed. An upper bound for the number of different constant control vectors constituting the optimal bang-bang control can be derived a priori from the structure of the problem. Therefore, the optimal solution can be easily determined via mathematical programming.

2.
Quadratic programming (QP) has previously been applied to the computation of optimal controls for linear systems with quadratic cost criteria. This paper extends the application of QP to non-linear problems through quasi-linearization and the solution of a sequence of linear-quadratic sub-problems whose solutions converge to the solution of the original non-linear problem. The method is called quasi-linearization-quadratic programming or Q-QP.

The principal advantage of the Q-QP method lies in the ease with which optimal controls can be computed when saturation constraints are imposed on the control signals and terminal constraints are imposed on the state vector. Use of a bounded-variable QP algorithm permits solution of constrained problems with a negligible increase in computing time over the corresponding unconstrained problems. Numerical examples show how the method can be applied to certain problems with non-analytic objective functions and illustrate the facility of the method on problems with constraints. The Q-QP method is shown to be competitive with other methods in computation time for unconstrained problems and to be essentially unaffected in speed for problems having saturation and terminal constraints.
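The core idea, computing a saturated optimal control by minimizing a quadratic cost subject to simple bounds, can be sketched as follows. All plant parameters are illustrative assumptions, and a general-purpose bounded solver stands in for a dedicated bounded-variable QP routine:

```python
import numpy as np
from scipy.optimize import minimize

# Illustrative discrete-time plant and horizon (not from the paper).
a, b = 0.95, 0.5          # scalar linear dynamics x[k+1] = a*x[k] + b*u[k]
q, r = 1.0, 0.1           # quadratic stage-cost weights
N, x0 = 20, 5.0           # horizon and initial state
u_max = 1.0               # saturation constraint |u[k]| <= u_max

def cost(u):
    """Quadratic cost of a control sequence, simulated through the dynamics."""
    x, J = x0, 0.0
    for uk in u:
        J += q * x**2 + r * uk**2
        x = a * x + b * uk
    return J + q * x**2   # terminal cost

# The cost is quadratic in u, so this is a QP with box constraints.
res = minimize(cost, np.zeros(N), method="L-BFGS-B",
               bounds=[(-u_max, u_max)] * N)
u_opt = res.x
```

The bounds add essentially nothing to the solve, which mirrors the paper's observation that saturation constraints cost negligible extra computing time.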

3.
This paper first proposes a neural network with a mixed structure, combining multilayer and recurrent parts, for learning the system dynamics of a nonlinear plant. Since a neural network with a mixed structure can learn time series, it can learn the dynamics of a plant without knowing the plant order. Secondly, a novel method of synthesizing the optimal control is developed using the proposed neural network. The procedure is as follows: (1) let a neural network with a mixed structure learn the unknown dynamics of a nonlinear plant of arbitrary order; (2) after the learning is completed, expand the network into an equivalent feedforward multilayer network; (3) the gradient of the criterion functional to be optimized can then be easily obtained from this multilayer network; and (4) generate the optimal control by applying any existing non-linear programming algorithm based on this gradient information. The proposed method is successfully applied to the optimal control synthesis problem of a nonlinear coupled vibratory plant with a linear quadratic criterion functional.
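Steps (3) and (4) can be sketched in miniature: once the model is unrolled into a feedforward structure, the gradient of the criterion with respect to the controls comes from one backward (chain-rule) pass, and plain gradient descent stands in for the non-linear programming step. The plant map `f` here is an assumed stand-in for the trained network, not the paper's model:

```python
import numpy as np

# Assumed stand-in for a learned one-step model x[k+1] = f(x[k], u[k]).
def f(x, u):
    return 0.8 * np.tanh(x) + 0.5 * u

def df_dx(x):
    return 0.8 * (1.0 - np.tanh(x)**2)

DF_DU = 0.5                      # df/du is constant for this map
N, x0, rho = 10, 2.0, 0.01       # horizon, initial state, control weight

def J(u):
    """Criterion: terminal error plus a small control penalty."""
    x = x0
    for uk in u:
        x = f(x, uk)
    return x**2 + rho * np.sum(u**2)

u = np.zeros(N)
for _ in range(300):             # gradient descent, standing in for step (4)
    xs = [x0]
    for k in range(N):           # forward pass through the unrolled network
        xs.append(f(xs[-1], u[k]))
    lam = 2.0 * xs[-1]           # dJ/dx_N
    grad = np.zeros(N)
    for k in reversed(range(N)): # backward pass: backprop/adjoint recursion
        grad[k] = lam * DF_DU + 2.0 * rho * u[k]
        lam *= df_dx(xs[k])
    u -= 0.1 * grad
```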

4.
This paper is presented, not so much as an end in itself concerning the solution of a definite engineering problem, but as an exploratory step toward the solution of the problem of optimal control of continuous time stochastic non-linear systems when only noisy observations of the state are available. In this paper, the problem of determining the optimal open loop control, when no observations at all are available, is treated by dynamic programming. The main results of the paper are to obtain the functional equation of dynamic programming and to present a quasi-linearization type of algorithm for its solution. The author's intention was to illustrate by these results that infinite dimensional function space is the most natural setting for stochastic non-linear problems. The results obtained give some insight into what can be expected in the more general case of noisy observations. In the Appendices an argument is presented to justify the proposed algorithm and an example is given for which an exact solution to the functional equation of dynamic programming can be obtained.

5.
Li Jinna, Yin Zixuan. Control and Decision (控制与决策), 2019, 34(11): 2343-2349
For the tracking control problem of networked control systems subject to packet dropout, an off-policy Q-learning method is proposed that uses only measurable data, enabling the system to track the target in a near-optimal manner when the system model parameters are unknown and the network communication suffers packet loss. First, the networked control system with packet dropout is characterized and the tracking control problem for linear discrete-time networked control systems is formulated. Then, a Smith predictor is designed to compensate for the effect of packet loss on the performance of the networked control system, and the optimal tracking control problem with packet-loss compensation is constructed. Finally, combining dynamic programming and reinforcement learning, an off-policy Q-learning algorithm is proposed. The advantages of the algorithm are that it does not require the system model parameters to be known, it learns the optimal tracking control policy based on predictor state feedback from measurable data of the networked control system, and it guarantees the unbiasedness of the solution of the Q-function-based iterative Bellman equation. Simulations verify the effectiveness of the proposed method.
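The model-free core of such methods can be sketched on a scalar LQ regulator. This is a minimal off-policy Q-learning (least-squares policy iteration) sketch under assumed dynamics; the paper's Smith-predictor compensation and tracking terms are omitted:

```python
import numpy as np

# Assumed scalar plant x+ = a*x + b*u; (a, b) are unknown to the learner.
a, b, q, r = 0.9, 1.0, 1.0, 1.0
K = -0.5                          # initial stabilizing policy u = K*x

rng = np.random.default_rng(0)
for _ in range(10):               # policy iteration on the Q-function
    # Off-policy data: exploratory states and inputs, not drawn from K.
    X = rng.uniform(-1, 1, 200)
    U = rng.uniform(-1, 1, 200)
    Xn = a * X + b * U
    Un = K * Xn                   # evaluated policy acting at the next state
    # Q(x,u) = theta . [x^2, 2xu, u^2]; Bellman: Q(x,u) - Q(x',Kx') = cost.
    phi  = np.stack([X**2, 2 * X * U, U**2], axis=1)
    phin = np.stack([Xn**2, 2 * Xn * Un, Un**2], axis=1)
    c = q * X**2 + r * U**2
    theta, *_ = np.linalg.lstsq(phi - phin, c, rcond=None)
    K = -theta[1] / theta[2]      # greedy policy improvement

# Model-based Riccati solution, for comparison only.
P = 1.0
for _ in range(500):
    P = q + a * a * P - (a * b * P)**2 / (r + b * b * P)
K_opt = -a * b * P / (r + b * b * P)
```

The learner never uses `a` or `b` directly, only sampled transitions, yet recovers the Riccati-optimal gain.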

6.
Demand-paging systems are characterized as stochastic control processes, and optimal page replacement decisions are determined by means of dynamic programming. This approach is distinguished from others by its utilization of page structure information, which may be either supplied a priori or else dynamically learned. The main result is an optimal realizable solution for a general class of replacement problems. The resulting algorithm subsumes others (including “A0”) as special cases.
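The deterministic offline benchmark such replacement policies are measured against can be sketched; this is Belady's OPT rule (evict the page used farthest in the future), a special case rather than the paper's stochastic formulation:

```python
def opt_faults(trace, frames):
    """Offline optimal (Belady) replacement: on a fault with full frames,
    evict the resident page whose next use lies farthest in the future."""
    resident, faults = set(), 0
    for i, page in enumerate(trace):
        if page in resident:
            continue
        faults += 1
        if len(resident) < frames:
            resident.add(page)
            continue
        def next_use(p):
            # Distance to next reference; pages never used again rank last.
            for j in range(i + 1, len(trace)):
                if trace[j] == p:
                    return j
            return float("inf")
        resident.remove(max(resident, key=next_use))
        resident.add(page)
    return faults
```

For the classic reference string `1,2,3,4,1,2,5,1,2,3,4,5` with 3 frames, OPT incurs 7 faults.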

8.
This paper presents an approach to the constrained non-linear predictive control problem based on input-output feedback linearization (IOFL) of a general non-linear system modelled by a discrete-time affine neural network model. Using the resulting linear system in the formulation of the original non-linear predictive control problem makes it possible to restate the optimization problem as the minimization of a quadratic function, whose solution can be found using reliable and fast quadratic programming (QP) routines. However, the presence of a non-linear feedback linearizing controller maps the original linear input constraints onto non-linear, state-dependent constraints on the controller output, which invalidates a direct application of QP routines. In order to cope with this problem and still be able to use QP routines, an approximate method is proposed which guarantees a feasible solution without constraint violation over the complete prediction horizon within a finite number of steps, while allowing only a small performance degradation.
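The constraint-mapping issue can be made concrete in the scalar case. For an assumed affine plant x⁺ = f(x) + g(x)·u with g(x) > 0 (an illustrative stand-in for the paper's neural-network model), the linearizing controller turns a fixed input bound into state-dependent bounds on the new input v:

```python
import numpy as np

# Assumed scalar affine plant (illustrative, not the paper's model).
def f(x): return 0.5 * np.sin(x)
def g(x): return 1.0 + 0.2 * x**2     # strictly positive input gain

u_max = 1.0                            # original linear constraint |u| <= u_max

def v_bounds(x):
    """The linearizing law u = (v - f(x)) / g(x) maps |u| <= u_max into
    state-dependent bounds on v."""
    return f(x) - u_max * g(x), f(x) + u_max * g(x)

def apply(x, v):
    """Clip v to the mapped constraint set, then realize the physical input."""
    lo, hi = v_bounds(x)
    v = min(max(v, lo), hi)
    return (v - f(x)) / g(x)
```

Because the bounds on v move with the state, a plain QP over v with fixed bounds is no longer valid, which is exactly the difficulty the paper's approximate method addresses.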

9.
In this paper, optimal control for a stochastic linear quadratic singular neuro Takagi–Sugeno (T-S) fuzzy system with singular cost is obtained using genetic programming (GP). To obtain the optimal control, the solution of the matrix Riccati differential equation (MRDE) is computed by solving the differential algebraic equation (DAE) using a novel and nontraditional GP approach. The solution obtained by this method is equal or very close to the exact solution of the problem, and its accuracy is qualitatively better. The solution of this novel method is compared with the traditional Runge–Kutta (RK) method. A numerical example is presented to illustrate the proposed method.
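The RK baseline the paper compares against can be sketched on a scalar instance of the Riccati differential equation, integrated backward from the terminal condition; all coefficients are illustrative:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Scalar stand-in for the MRDE: -dP/dt = q + 2*a*P - (b**2/r)*P**2, P(T) = 0.
a, b, q, r, T = -1.0, 1.0, 1.0, 1.0, 10.0

# Substituting s = T - t turns the backward integration into a forward one,
# handled here by an adaptive Runge-Kutta solver.
sol = solve_ivp(lambda s, P: q + 2 * a * P - (b**2 / r) * P**2,
                (0.0, T), [0.0], rtol=1e-8, atol=1e-10)
P0 = sol.y[0, -1]

# For a long horizon, P(0) approaches the algebraic Riccati solution.
P_inf = (a + np.sqrt(a**2 + b**2 * q / r)) * r / b**2
```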

10.
It is shown that a symmetry in an optimization problem induces a decomposition of the optimal feedback control law into factors. One factor can be calculated algebraically and depends only on the symmetry; the other factor corresponds to a lower dimensional optimization problem. This gives a priori information about the structure of the optimal feedback control law and indicates a possibly more efficient method for optimizing such systems.

11.
A Hierarchical Optimization Method for a Class of Nonlinear Bilevel Programming Problems
A new method for solving a class of nonlinear bilevel programming problems is proposed. By introducing a decoupling vector, the nonlinear bilevel programming problem is decomposed into independent, easily solvable subproblems. A two-level hierarchical structure is used: the first level solves a number of optimization subproblems, while the second level adjusts the decoupling vector based on the first-level results. Built on the decomposition-coordination principle, the method iteratively converges to the optimal solution of the problem. Programming problems containing integer variables can also be handled conveniently by this method after a continuous relaxation. Numerical examples show that the proposed algorithm is simple and effective.

12.
In this journal, Pantoja has described a deterministic optimal control problem in which his stagewise Newton procedure yields an exact optimal solution whereas differential dynamic programming (DDP) does not. This problem is also quoted by Coleman and Liao (in another journal) as a correct instance with some emphasis on the advantage of Pantoja's procedure over DDP. Pantoja argues that the problem involves nonlinear dynamics in his terminal-cost problem formulation, and therefore DDP and stagewise Newton methods are different. The purpose of this paper is to show that, while for a general nonlinear optimal control problem DDP and Pantoja's method differ, his problem has a special structure such that it is a false example of this claim; more specifically, the reason is twofold. First, he made an obvious algebraic error in his computation. Second, his example is equivalent to a problem of linear dynamics and quadratic criterion (LQ in short). It is true that when a general LQ that involves quadratic stage costs is transformed to a terminal-cost problem, the nonlinear (quadratic) state dynamics would result from each quadratic stage cost of the LQ. Yet the LQ-solution procedure remains the same, i.e., with the same discrete (Riccati) recurrence equations that can be derived by classical dynamic programming. This means that DDP obtains the exact minimum point of the transformed terminal-cost criterion just as does the Newton method. Using a standard LQ of general type, we formally prove this equivalence in its terminal-cost version even with nonlinear state dynamics.
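The agreement of DDP and an exact Newton-type solve on LQ problems can be checked numerically on a small illustrative instance (not Pantoja's exact example): the backward Riccati recurrence and a single linear solve of the stacked quadratic criterion produce the same control sequence:

```python
import numpy as np

# Illustrative scalar LQ: minimize sum_{k<N}(q*x_k^2 + r*u_k^2) + q*x_N^2,
# subject to x_{k+1} = a*x_k + b*u_k.
a, b, q, r, N, x0 = 0.9, 1.0, 1.0, 1.0, 5, 1.0

# Backward Riccati recurrence (classical dynamic programming / DDP on LQ).
P, gains = q, []
for _ in range(N):
    K = -a * b * P / (r + b * b * P)
    P = q + a * a * P - (a * b * P)**2 / (r + b * b * P)
    gains.append(K)
gains = gains[::-1]                    # gains[k] applies at stage k

x, u_dp = x0, []
for k in range(N):                     # roll out the feedback law
    u_dp.append(gains[k] * x)
    x = a * x + b * u_dp[-1]
u_dp = np.array(u_dp)

# Stacked form: x_k = a**k * x0 + sum_{j<k} a**(k-1-j) * b * u_j, so the
# criterion is an explicit quadratic in u, minimized by one linear solve --
# the sense in which an exact Newton step coincides with DDP here.
T = np.zeros((N + 1, N))
for k in range(1, N + 1):
    for j in range(k):
        T[k, j] = a**(k - 1 - j) * b
d = a ** np.arange(N + 1) * x0
u_qp = -np.linalg.solve(q * T.T @ T + r * np.eye(N), q * T.T @ d)
```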

13.
An adaptive optimal scheduling and controller design is presented that attempts to improve the performance of beer membrane filtration over that currently obtained by operators. The research was performed as part of a large European research project called EU Cafe, with the aim of investigating the potential of advanced modelling and control to improve the production and quality of food. Significant improvements are demonstrated in this paper through simulation experiments. Optimal scheduling and control comprises a mixed-integer non-linear programming problem (MINLP). By making some suitable assumptions that are approximately satisfied in practice, we manage to simplify the problem significantly by turning it into an ordinary non-linear programming problem (NLP), for which solution methods are readily available. The adaptive part of our scheduler and controller performs model parameter adaptations. These are also obtained by solving associated NLP problems. During cleaning stages in between membrane filtrations, enough time is available to solve the NLP problems. This allows for real-time implementation.

14.
In this paper, a new formulation for the optimal tracking control problem (OTCP) of continuous-time nonlinear systems is presented. This formulation extends the integral reinforcement learning (IRL) technique, a method for solving optimal regulation problems, to learn the solution to the OTCP. Unlike existing solutions to the OTCP, the proposed method does not need to have or to identify knowledge of the system drift dynamics, and it also takes into account the input constraints a priori. An augmented system composed of the error system dynamics and the command generator dynamics is used to introduce a new nonquadratic discounted performance function for the OTCP. This encodes the input constraints into the optimization problem. A tracking Hamilton–Jacobi–Bellman (HJB) equation associated with this nonquadratic performance function is derived which gives the optimal control solution. An online IRL algorithm is presented to learn the solution to the tracking HJB equation without knowing the system drift dynamics. Convergence to a near-optimal control solution and stability of the whole system are shown under a persistence of excitation condition. Simulation examples are provided to show the effectiveness of the proposed method.

15.
In this paper, the problem of intercepting a manoeuvring target within a fixed final time is posed in a non-linear constrained zero-sum differential game framework. The Nash equilibrium solution is found by solving the finite-horizon constrained differential game problem via adaptive dynamic programming technique. Besides, a suitable non-quadratic functional is utilised to encode the control constraints into a differential game problem. The single critic network with constant weights and time-varying activation functions is constructed to approximate the solution of associated time-varying Hamilton–Jacobi–Isaacs equation online. To properly satisfy the terminal constraint, an additional error term is incorporated in a novel weight-updating law such that the terminal constraint error is also minimised over time. By utilising Lyapunov's direct method, the closed-loop differential game system and the estimation weight error of the critic network are proved to be uniformly ultimately bounded. Finally, the effectiveness of the proposed method is demonstrated by using a simple non-linear system and a non-linear missile–target interception system, assuming first-order dynamics for the interceptor and target.

16.
A necessary and sufficient condition for the existence of a linear state feedback controller which will simultaneously stabilize a finite collection of single-input linear systems is presented. This condition can be checked numerically through the solution of a non-linear programming problem presented here and can be used to determine whether or not a particular simultaneous stabilization problem has a solution. If the stabilization problem has a solution, the method provides a means of constructing a suitable controller. The proposed method is very versatile. It is a simple matter to limit the size of elements of the feedback gain vector and thus ensure that the control action is not excessive. In addition, for simultaneous stabilization problems which have a range of possible solutions, the proposed method makes it possible to select one which yields good performance.
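A toy instance of the idea, with assumed plants rather than the paper's formulation: find one static gain that stabilizes several scalar discrete-time systems by minimizing the worst closed-loop pole magnitude, a one-variable non-linear programming problem:

```python
from scipy.optimize import minimize_scalar

# Assumed scalar plants x+ = a_i*x + b_i*u, to share one gain u = k*x.
plants = [(1.2, 1.0), (0.5, 1.0)]

# Minimize the worst-case closed-loop pole magnitude over k; a value below 1
# certifies simultaneous stabilization of all plants.
res = minimize_scalar(lambda k: max(abs(a + b * k) for a, b in plants),
                      bounds=(-3.0, 3.0), method="bounded")
k = res.x
```

The bounds on k play the role of the gain-size limits the paper mentions for keeping the control action moderate.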

17.
Barr and Gilbert (1966, 1969 b) have presented computing algorithms for converting a broad class of optimal control problems (including minimum time, and fixed-time minimum fuel, energy and effort problems) to a sequence of optimal regulator problems, using a one-dimensional search of the cost variable. These Barr and Gilbert algorithms, which use quadratic programming algorithms by the same authors (1969 a) to solve the resulting optimal regulator problems, are restricted to dynamic equations linear in the state, by virtue of using the convexity and compactness (Neustadt 1963) and contact function (Gilbert 1966) of the reachable set.

This paper extends the above approach to a class of terminal cost optimal control problems similar to those considered by Barr and Gilbert (including quite general control constraints, but only allowing initial and final state constraints), having differential equations non-linear in state and control (where the convexity-compactness results do not hold), by converting each such problem to a sequence of optimal regulator problems with non-linear differential equations. These, in turn, are solved by one of the author's earlier algorithms (Katz 1974) that makes use of the above convexity, compactness, and contact function results by repeatedly linearizing the regulator problems. The approach of this paper differs from that of Halkin (1964 b), in that Halkin directly linearizes the original problem (e.g. converting a non-linear minimum fuel problem to a linear minimum fuel problem) and then solves the linearized version by a doubly iterative procedure.

The computing algorithm presented here is based on the definition of an appropriate approximate solution of the terminal cost problem. A local-minimum convergence proof is given, which is weak in the sense that it assumes convergence of the substep algorithm (Katz 1974) for non-linear optimal regulator problems, whose convergence has not been proved. A subsequent paper (Katz and Wachtor, to appear) shows good convergence of the (overall) terminal cost problem algorithm in examples having singular arcs, with no prior knowledge of the solution or its singular nature, other than an initial upper bound on the cost.

18.
Computing a numerical solution to a periodic optimal control problem can be difficult, especially when the period is unknown. A method of approximating a solution to a stochastic optimal control problem using Markov chains was developed in [Krawczyk, J. B. (2001). A Markovian approximated solution to a portfolio management problem. Information Technology for Economics and Management, 1, http://www.item.woiz.polsl.pl/issue/journal1.htm]. This paper describes the application of that method to a periodic optimal control problem formulated in [Gaitsgory, V. & Rossomakhine, S. (2006). Linear programming approach to deterministic long run average problems of optimal control. SIAM Journal on Control and Optimization, 44(6), 2006-2037]. As a result, approximately optimal feedback rules are computed that can control the system both on and off the optimal orbit.

19.
The characteristics of discrete-time optimal control are described, and three methods for solving discrete-time optimal control problems are compared, namely: 1) solving discrete-time optimal control by nonlinear programming; 2) solving discrete-time optimal control by unconstrained optimization; and 3) dynamic programming and its numerical solution. Methods 1) and 2) are both multidimensional static optimization; they are computationally efficient, advanced methods. Nominally, 3) is dynamic optimization; in fact, it is one-dimensional piecewise unconstrained static optimization, computationally inefficient, and an elementary method. A numerical example further shows that dynamic programming and its numerical solution perform poorly, so that, in the authors' view, they have lost practical value: for solving discrete-time optimal control problems, they cannot compete with nonlinear programming.
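The routes being compared can be illustrated on a toy problem: backward dynamic programming versus direct minimization over the whole control sequence (exhaustive enumeration stands in for the static-optimization routes here, since the control set is finite; all values are illustrative):

```python
import itertools

# Tiny discrete-time optimal control problem with a finite control set.
controls = (-1.0, 0.0, 1.0)
N, x0 = 4, 3.0

def step(x, u):  return 0.7 * x + u            # dynamics
def stage(x, u): return x * x + 0.5 * u * u    # stage cost

def dp(x, k):
    """Optimal cost-to-go via the backward DP recursion."""
    if k == N:
        return x * x                            # terminal cost
    return min(stage(x, u) + dp(step(x, u), k + 1) for u in controls)

def seq_cost(seq):
    """Cost of one full control sequence (the direct, 'static' view)."""
    x, J = x0, 0.0
    for u in seq:
        J += stage(x, u)
        x = step(x, u)
    return J + x * x

best = min(seq_cost(s) for s in itertools.product(controls, repeat=N))
```

Both routes reach the same optimal cost; which one scales better is exactly the question the abstract debates.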

20.
A Dynamic-Programming-Based Multi-Parametric Programming Method for Constrained Optimization Problems and Its Application
Combining dynamic programming with single-step multi-parametric quadratic programming, a new multi-parametric programming method for constrained optimal control problems is proposed. On the one hand, it yields an explicit functional relationship between the optimal control sequence and the state for constrained linear-quadratic optimal control problems, reducing the workload of solving the multi-parametric programming problem; on the other hand, it simultaneously yields the state-feedback optimal control law. Using the proposed multi-parametric quadratic programming method, an explicit state-feedback optimal control law is established for the infinite-horizon constrained optimization problem. Numerical simulations are carried out on a vibration control model of an elevator mechanical system.
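The explicit (multi-parametric) character of such control laws can be sketched in the simplest case, a one-step constrained LQ with illustrative values: the solution of the QP is piecewise affine in the parameter x, namely the unconstrained feedback gain clipped at the input constraint:

```python
import numpy as np
from scipy.optimize import minimize

# One-step constrained LQ: min_u  q*(a*x + b*u)**2 + r*u**2,  |u| <= 1.
a, b, q, r = 1.1, 1.0, 1.0, 0.1
K = -q * a * b / (r + q * b * b)       # unconstrained minimizer gain

def u_explicit(x):
    """Explicit multi-parametric solution: affine law clipped at the bound."""
    return float(np.clip(K * x, -1.0, 1.0))

def u_numeric(x):
    """Solve the same QP numerically for a given parameter x."""
    res = minimize(lambda u: q * (a * x + b * u[0])**2 + r * u[0]**2,
                   [0.0], bounds=[(-1.0, 1.0)])
    return float(res.x[0])
```

In higher dimensions the same structure appears as a collection of affine laws over polyhedral regions of the state space, which is what explicit MPC precomputes offline.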
