Similar Documents
20 similar documents found.
1.
For the optimal production control problem of unreliable stochastic manufacturing systems with a diffusion term, a numerical method is adopted to solve the mode-coupled nonlinear partial differential HJB equation satisfied by the optimal control. First, a Markov chain is constructed to approximate the evolution of the production system state; based on the local consistency principle, the continuous-time stochastic control problem is converted into a discrete-time Markov decision process. Value iteration and policy iteration algorithms are then applied to compute the optimal control numerically. Simulation results verify the correctness and effectiveness of the method.
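The discretize-then-iterate idea in the abstract above can be sketched as follows. This is a toy illustration only: the states, actions, costs, and discount factor are arbitrary choices, not the production system studied in the paper.

```python
import numpy as np

# Markov-chain-approximation sketch: the continuous-time control problem is
# replaced by a small discrete MDP, which is then solved by value iteration.
n_states, n_actions = 5, 2
rng = np.random.default_rng(0)

# P[a] is a row-stochastic transition matrix for action a.
P = rng.random((n_actions, n_states, n_states))
P /= P.sum(axis=2, keepdims=True)
cost = rng.random((n_actions, n_states))   # stage cost c(x, a)
gamma = 0.9                                # discount factor

V = np.zeros(n_states)
for _ in range(500):
    # Bellman backup: V(x) = min_a [ c(x,a) + gamma * sum_y P(y|x,a) V(y) ]
    Q = cost + gamma * (P @ V)             # shape (n_actions, n_states)
    V_new = Q.min(axis=0)
    if np.max(np.abs(V_new - V)) < 1e-10:
        V = V_new
        break
    V = V_new

policy = (cost + gamma * (P @ V)).argmin(axis=0)
print(V)
print(policy)
```

Policy iteration, which the abstract also mentions, alternates a full policy-evaluation solve with a greedy improvement step instead of a single backup, and typically converges in far fewer iterations.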

2.
An optimal control model for the search problem of a randomly moving target
An optimal control model is proposed for the search problem of a randomly moving target undergoing Brownian motion in Rn. The optimal search problem is studied analytically: the original problem is transformed into an equivalent problem for a deterministic distributed-parameter system described by a second-order partial differential equation (the HJB equation). The HJB equation for the optimal search problem is derived, and it is proven that its solution is the sought optimal search strategy. An algorithm for computing the optimal search strategy and a worked example are given.

3.
Optimal portfolios with regime switching and value-at-risk constraint
We consider the optimal portfolio selection problem subject to a maximum value-at-risk (MVaR) constraint when the price dynamics of the risky asset are governed by a Markov-modulated geometric Brownian motion (GBM). Here, the market parameters, including the market interest rate of a bank account and the appreciation rate and volatility of the risky asset, switch over time according to a continuous-time Markov chain, whose states are interpreted as the states of an economy. The MVaR is defined as the maximum value of the VaRs of the portfolio over a short time duration across the different states of the chain. We formulate the problem as a constrained utility maximization problem over a finite time horizon. Utilizing the dynamic programming principle, we first derive a regime-switching Hamilton-Jacobi-Bellman (HJB) equation and then a system of coupled HJB equations. We employ an efficient numerical method to solve the system of coupled HJB equations for the optimal constrained portfolio. We provide numerical results for the sensitivity analysis of the optimal portfolio, the optimal consumption and the VaR level with respect to model parameters. These results are also used to investigate the effect of the switching regimes.

4.
A sufficient condition to solve an optimal control problem is to solve the Hamilton–Jacobi–Bellman (HJB) equation. However, finding a value function that satisfies the HJB equation for a nonlinear system is challenging. For an optimal control problem when a cost function is provided a priori, previous efforts have utilized feedback linearization methods which assume exact model knowledge, or have developed neural network (NN) approximations of the HJB value function. The result in this paper uses the implicit learning capabilities of the RISE control structure to learn the dynamics asymptotically. Specifically, a Lyapunov stability analysis is performed to show that the RISE feedback term asymptotically identifies the unknown dynamics, yielding semi-global asymptotic tracking. In addition, it is shown that the system converges to a state space system that has a quadratic performance index which has been optimized by an additional control element. An extension is included to illustrate how a NN can be combined with the previous results. Experimental results are given to demonstrate the proposed controllers.

5.
To address the stability of wireless sensor network systems in industrial environments subject to large external disturbances, a method combining the Hamilton-Jacobi-Bellman (HJB) equation with minimax control is proposed. First, for wireless sensor networks with bounded network delay and bounded consecutive packet loss under complex operating conditions, a system model incorporating delay and packet loss is established. Then, under a minimax performance index, a minimax optimal controller is designed via the HJB equation; a test function yields an expression for the worst-case disturbance, from which sufficient conditions for system stability are derived. Finally, numerical examples and simulations verify the feasibility and effectiveness of the proposed method under sudden large disturbances.

6.
This paper considers mobile-to-base-station power control for lognormal fading channels in wireless communication systems within a centralized information stochastic optimal control framework. Under a bounded power rate of change constraint, the stochastic control problem and its associated Hamilton-Jacobi-Bellman (HJB) equation are analyzed by the viscosity solution method; then the degenerate HJB equation is perturbed to admit a classical solution and a suboptimal control law is designed based on the perturbed HJB equation. When a quadratic type cost is used without a bound constraint on the control, the value function is a classical solution to the degenerate HJB equation and the feedback control is affine in the system power. In addition, in this case we develop approximate, but highly scalable, solutions to the HJB equation in terms of a local polynomial expansion of the exact solution. When the channel parameters are not known a priori, one can obtain on-line estimates of the parameters and get adaptive versions of the control laws. In numerical experiments with both of the above cost functions, the following phenomenon is observed: whenever the users have different initial conditions, there is an initial convergence of the power levels to a common level and then subsequent approximately equal behavior which converges toward a stochastically varying optimum.

7.
An investment problem is considered under a dynamic mean–variance (M-V) portfolio criterion with discontinuous prices that follow jump–diffusion processes, in line with actual stock prices and the normality and stability of the financial market. Short-selling of stocks is prohibited in this mathematical model. The corresponding stochastic Hamilton–Jacobi–Bellman (HJB) equation of the problem is presented, and its solution is obtained based on the theory of stochastic LQ control and viscosity solutions. The efficient frontier and optimal strategies of the original dynamic M-V portfolio selection problem are also provided. Then, the effects on the efficient frontier under the value-at-risk constraint are illustrated. Finally, an example illustrating discontinuous prices under M-V portfolio selection is presented.

8.
An approach to solve finite time horizon suboptimal feedback control problems for partial differential equations is proposed by solving dynamic programming equations on adaptive sparse grids. A semi-discrete optimal control problem is introduced and the feedback control is derived from the corresponding value function. The value function can be characterized as the solution of an evolutionary Hamilton–Jacobi Bellman (HJB) equation which is defined over a state space whose dimension is equal to the dimension of the underlying semi-discrete system. Besides a low dimensional semi-discretization it is important to solve the HJB equation efficiently to address the curse of dimensionality. We propose to apply a semi-Lagrangian scheme using spatially adaptive sparse grids. Sparse grids allow the discretization of the value functions in (higher) space dimensions since the curse of dimensionality of full grid methods arises to a much smaller extent. For additional efficiency an adaptive grid refinement procedure is explored. The approach is illustrated for the wave equation and an extension to equations of Schrödinger type is indicated. We present several numerical examples studying the effect the parameters characterizing the sparse grid have on the accuracy of the value function and the optimal trajectory.
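The semi-Lagrangian backup in the abstract above can be illustrated in one dimension on a full grid (the paper's contribution is doing this on adaptive sparse grids in higher dimensions). The dynamics x' = a, running cost x² + a², horizon, and control set below are all illustrative assumptions.

```python
import numpy as np

# 1-D semi-Lagrangian sketch for a finite-horizon HJB equation: step backward
# in time, and at each grid point minimize (stage cost + interpolated value
# at the foot of the characteristic) over a discretized control set.
x = np.linspace(-2.0, 2.0, 81)
controls = np.linspace(-1.0, 1.0, 21)
dt, steps = 0.05, 40

V = np.zeros_like(x)                       # terminal condition V_T = 0
for _ in range(steps):
    # For each control a, follow x' = a for one step and interpolate V there.
    candidates = [
        dt * (x**2 + a**2) + np.interp(x + dt * a, x, V)
        for a in controls
    ]
    V = np.min(candidates, axis=0)

# The value function inherits the symmetry of the cost and is smallest at 0.
print(V[len(x) // 2])
```

Note that `np.interp` clamps queries outside the grid to the boundary values, a crude stand-in for the boundary treatment a real scheme would need; on sparse grids the interpolation step is the part that changes.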

9.
The purpose of this paper is to describe the application of the notion of viscosity solutions to solve the Hamilton-Jacobi-Bellman (HJB) equation associated with an important class of optimal control problems for quantum spin systems. The HJB equation that arises in the control problems of interest is a first-order nonlinear partial differential equation defined on a Lie group. Hence we employ recent extensions of the theory of viscosity solutions to Riemannian manifolds in order to interpret possibly non-differentiable solutions to this equation. Results from differential topology on the triangulation of manifolds are then used to develop a finite difference approximation method for numerically computing the solution to such problems. The convergence of these approximations is proven using viscosity solution methods. In order to illustrate the techniques developed, these methods are applied to an example problem.

10.
The Hamilton-Jacobi-Bellman (HJB) equation corresponding to constrained control is formulated using a suitable nonquadratic functional. It is shown that the constrained optimal control law has the largest region of asymptotic stability (RAS). The value function of this HJB equation is obtained by solving a sequence of cost functions satisfying a sequence of Lyapunov equations (LEs). A neural network is used to approximate the cost function associated with each LE using the method of least-squares on a well-defined region of attraction of an initial stabilizing controller. As the order of the neural network is increased, the least-squares solution of the HJB equation converges uniformly to the exact solution of the inherently nonlinear HJB equation associated with the saturating control inputs. The result is a nearly optimal constrained state feedback controller that has been tuned a priori off-line.
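The successive-Lyapunov iteration mentioned above is easiest to see in the scalar unconstrained LQR case, where each Lyapunov equation can be solved in closed form and the iterates converge to the Riccati solution. The numbers below are illustrative; the paper instead uses a neural network to approximate each cost function and handles input saturation via a nonquadratic functional.

```python
# Scalar continuous-time LQR sketch of policy iteration via Lyapunov equations:
# starting from a stabilizing gain, each iteration evaluates the current
# policy's cost p (a Lyapunov solve) and then improves the policy.
a, b, q, r = 1.0, 1.0, 1.0, 1.0    # dynamics x' = a x + b u, cost q x^2 + r u^2
k = 2.0                             # initial stabilizing gain (a - b k < 0)

for _ in range(20):
    # Lyapunov equation 2 (a - b k) p + q + r k^2 = 0 for the policy u = -k x.
    p = (q + r * k**2) / (2.0 * (b * k - a))
    k = b * p / r                   # policy improvement step

# p converges to the stabilizing root of the Riccati equation
# 2 a p - b^2 p^2 / r + q = 0, which is p = 1 + sqrt(2) for these numbers.
print(p)
```

Each improvement step is a Newton step on the Riccati equation, which is why convergence is fast once a stabilizing initial gain is available, mirroring the paper's requirement of an initial stabilizing controller.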

11.
In this paper, an observer design is proposed for nonlinear systems, based on a Hamilton–Jacobi–Bellman (HJB) equation formulation. The HJB equation is formulated using a suitable non-quadratic term in the performance functional to tackle magnitude constraints on the observer gain. Utilizing Lyapunov's direct method, the observer is proved to be optimal with respect to a meaningful cost. In the present algorithm, a neural network (NN) is used to approximate the value function and find an approximate solution of the HJB equation by the least squares method. Using the time-varying HJB solution, a dynamic optimal observer for the nonlinear system is proposed. The algorithm is applied to nonlinear systems over both finite and infinite time horizons. The necessary theoretical and simulation results are presented to validate the proposed algorithm.

12.
We consider a problem of dynamic stochastic portfolio optimization modelled by a fully non-linear Hamilton–Jacobi–Bellman (HJB) equation. Using the Riccati transformation, the HJB equation is transformed to a simpler quasi-linear partial differential equation. An auxiliary quadratic programming problem is obtained, which involves a vector of expected asset returns and a covariance matrix of the returns as input parameters. Since this problem can be sensitive to the input data, we modify the problem from fixed input parameters to worst-case optimization over convex or discrete uncertainty sets both for asset mean returns and their covariance matrix. Qualitative as well as quantitative properties of the value function are analysed along with providing illustrative numerical examples. We show application to robust portfolio optimization for the German DAX30 Index.

13.
An optimal control problem is considered for a multi-degree-of-freedom (MDOF) system, excited by a white-noise random force. The problem is to minimize the expected response energy by a given time instant T by applying a vector control force with given bounds on the magnitudes of its components. This problem is governed by the Hamilton-Jacobi-Bellman, or HJB, partial differential equation. This equation has been studied previously [1] for the case of a single-degree-of-freedom system by developing a hybrid solution. Specifically, an exact analytical solution has been obtained within a certain outer domain of the phase plane, which provides the necessary boundary conditions for numerical solution within an inner domain bounded in velocity, thereby alleviating the problem of numerical analysis over an unbounded domain. This hybrid approach is extended here to MDOF systems using the common transformation to modal coordinates. The multidimensional HJB equation is solved explicitly for the corresponding outer domain, thereby reducing the problem to a set of numerical solutions within bounded inner domains. Thus, the problem of bounded optimal control is solved completely as long as the necessary modal control forces can be implemented in the actuators. If, however, the control forces can be applied to the original generalized coordinates only, the resulting optimal control law may become unfeasible. The reason is the nonlinearity in the maximization operation for modal control forces, which may lead to violation of some constraints after the inverse transformation to original coordinates. A semioptimal control law is illustrated for this case, based on projecting boundary points of the domain of the admissible transformed control forces onto boundaries of the domain of the original control forces. The case of a single control force is also considered, and a similar solution to the HJB equation is derived.

14.
In this paper we consider nonautonomous optimal control problems of infinite horizon type, whose control actions are given by L1-functions. We verify that the value function is locally Lipschitz. The equivalence between dynamic programming inequalities and Hamilton–Jacobi–Bellman (HJB) inequalities for proximal sub (super) gradients is proven. Using this result we show that the value function is a Dini solution of the HJB equation. We obtain a verification result for the class of Dini sub-solutions of the HJB equation and also prove a minimax property of the value function with respect to the sets of Dini semi-solutions of the HJB equation. We introduce the concept of viscosity solutions of the HJB equation in infinite horizon and prove the equivalence between this and the concept of Dini solutions. In the Appendix we provide an existence theorem.

15.
The asymmetric input-constrained optimal synchronization problem of heterogeneous unknown nonlinear multiagent systems (MASs) is considered in this paper. Intuitively, a state-space transformation is performed such that satisfaction of symmetric input constraints for the transformed system guarantees satisfaction of asymmetric input constraints for the original system. Then, considering that the leader's information is not available to every follower, a novel distributed observer is designed to est...

16.
The Hamilton–Jacobi–Bellman (HJB) equation can be solved to obtain optimal closed-loop control policies for general nonlinear systems. As it is seldom possible to solve the HJB equation exactly for nonlinear systems, either analytically or numerically, methods to build approximate solutions through simulation based learning have been studied under various names such as neurodynamic programming (NDP) and approximate dynamic programming (ADP). The aspect of learning connects these methods to reinforcement learning (RL), which also tries to learn optimal decision policies through trial-and-error based learning. This study develops a model-based RL method, which iteratively learns the solution to the HJB and its associated equations. We focus particularly on the control-affine system with a quadratic objective function and the finite horizon optimal control (FHOC) problem with time-varying reference trajectories. The HJB solutions for such systems involve time-varying value, costate, and policy functions subject to boundary conditions. To represent the time-varying HJB solution in high-dimensional state space in a general and efficient way, deep neural networks (DNNs) are employed. It is shown that the use of DNNs, compared to shallow neural networks (SNNs), can significantly improve the performance of a learned policy in the presence of uncertain initial state and state noise. Examples involving a batch chemical reactor and a one-dimensional diffusion-convection-reaction system are used to demonstrate this and other key aspects of the method.

17.
This paper studies mean maximization and variance minimization problems in finite horizon continuous-time Markov decision processes. The state and action spaces are assumed to be Borel spaces, while reward functions and transition rates are allowed to be unbounded. For the mean problem, we design a method called successive approximation, which enables us to prove the existence of a solution to the Hamilton-Jacobi-Bellman (HJB) equation, and then the existence of a mean-optimal policy under some growth and compact-continuity conditions. For the variance problem, using the first-jump analysis, we succeed in converting the second moment of the finite horizon reward to a mean of a finite horizon reward with new reward functions under suitable conditions, based on which the associated HJB equation for the variance problem and the existence of variance-optimal policies are established. Value iteration algorithms for computing mean- and variance-optimal policies are proposed.

18.
In this paper, we present an empirical study of iterative least squares minimization of the Hamilton-Jacobi-Bellman (HJB) residual with a neural network (NN) approximation of the value function. Although the nonlinearities in the optimal control problem and NN approximator preclude theoretical guarantees and raise concerns of numerical instabilities, we present two simple methods for promoting convergence, the effectiveness of which is presented in a series of experiments. The first method involves the gradual increase of the horizon time scale, with a corresponding gradual increase in value function complexity. The second method involves the assumption of stochastic dynamics which introduces a regularizing second derivative term to the HJB equation. A gradual reduction of this term provides further stabilization of the convergence. We demonstrate the solution of several problems, including the 4-D inverted-pendulum system with bounded control. Our approach requires no initial stabilizing policy or any restrictive assumptions on the plant or cost function, only knowledge of the plant dynamics. In the Appendix, we provide the equations for first- and second-order differential backpropagation.

19.
In this paper, fixed-final time optimal control laws using neural networks and HJB equations for general affine in the input nonlinear systems are proposed. The method utilizes Kronecker matrix methods along with neural network approximation over a compact set to solve a time-varying HJB equation. The result is a neural network feedback controller that has time-varying coefficients found by a priori offline tuning. Convergence results are shown. The results of this paper are demonstrated on an example.

20.
In this paper, an event-triggered safe control method based on adaptive critic learning (ACL) is proposed for a class of nonlinear safety-critical systems. First, a safe cost function is constructed by adding a control barrier function (CBF) to the traditional quadratic cost function; this resolves the optimization problem with safety constraints that is difficult to handle with classical ACL methods. Subsequently, an event-triggered scheme is introduced to reduce the amount of computation. Further, combining the properties of the CBF with the ACL-based event-triggering mechanism, the event-triggered safe Hamilton–Jacobi–Bellman (HJB) equation is derived, and a single critic neural network (NN) framework is constructed to approximate the solution of the event-triggered safe HJB equation. In addition, the concurrent learning method is applied to the NN learning process, so that the persistence of excitation (PE) condition is not required. The weight approximation error of the NN and the states of the system are proven to be uniformly ultimately bounded (UUB) in the safe set via Lyapunov theory. Finally, the effectiveness of the presented method is validated through simulation.
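The role a control barrier function plays in keeping a system safe can be sketched in one dimension. This example uses single-integrator dynamics x' = u and the barrier h(x) = x - x_min, which are illustrative assumptions rather than the systems treated in the paper; the paper embeds the CBF in the cost function instead of using it as a pointwise filter.

```python
# 1-D control-barrier-function safety filter: for x' = u and h(x) = x - x_min,
# the CBF condition h' + alpha * h >= 0 reduces to u >= -alpha * (x - x_min),
# so the safe input closest to the nominal one is a simple clamp.
def cbf_filter(u_nominal, x, x_min=0.0, alpha=1.0):
    u_lower = -alpha * (x - x_min)     # CBF-imposed lower bound on u
    return max(u_nominal, u_lower)

# A nominal controller pushing toward the unsafe set gets overridden near the
# boundary; far from the boundary the nominal input passes through unchanged.
print(cbf_filter(-5.0, 0.1))   # near boundary: clamped to -0.1
print(cbf_filter(-5.0, 10.0))  # far away: nominal -5.0 is already safe
```

In higher dimensions the same idea becomes a small quadratic program solved at each step, minimizing the deviation from the nominal input subject to the CBF inequality.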


Copyright © Beijing Qinyun Technology Development Co., Ltd. (北京勤云科技发展有限公司), 京ICP备09084417号