Similar Articles
 Found 20 similar articles (search time: 0 ms)
1.
2.
3.
Approximate dynamic programming (ADP) is a general and effective approach for solving optimal control and estimation problems by adapting to uncertain and nonconvex environments over time.

4.
5.
In control systems theory, the Markov decision process (MDP) is a widely used optimization model involving selection of the optimal action in each state visited by a discrete-event system driven by Markov chains. The classical MDP model is suitable for an agent/decision-maker interested in maximizing expected revenues, but does not account for minimizing variability in the revenues. An MDP model in which the agent can maximize the revenues while simultaneously controlling the variance in the revenues is proposed. This work is rooted in machine learning/neural network concepts, where updating is based on system feedback and step sizes. First, a Bellman equation for the problem is proposed. Thereafter, convergent dynamic programming and reinforcement learning techniques for solving the MDP are provided along with encouraging numerical results on a small MDP and a preventive maintenance problem.

6.
Call for Papers
Control Theory & Applications (《控制理论与应用》), 2010, 27(1): 132-132
Approximate dynamic programming (ADP) is a general and effective approach for solving optimal control and estimation problems by adapting to uncertain and non-convex environments over time.

7.
8.
9.
10.
11.
12.
This paper studies evolutionary programming and adopts reinforcement learning theory to learn individual mutation operators. A novel algorithm named RLEP (Evolutionary Programming based on Reinforcement Learning) is proposed. In this algorithm, each individual learns its optimal mutation operator based on the immediate and delayed performance of mutation operators. Mutation operator selection is mapped into a reinforcement learning problem. Reinforcement learning methods are used to learn optimal policies by maximizing the accumulated rewards. According to the calculated Q function value of each candidate mutation operator, an optimal mutation operator can be selected to maximize the learned Q function value. Four different mutation operators have been employed as the basic candidate operators in RLEP and one is selected for each individual in different generations. Our simulation shows the performance of RLEP is the same as or better than the best of the four basic mutation operators.
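
The idea of choosing a mutation operator by its learned Q value can be illustrated with a small sketch. This is not the RLEP algorithm from the paper: the three operators, the sphere objective, the epsilon-greedy rule, the clipped-improvement reward, the parent-versus-child replacement, and all parameter values are assumptions chosen only to keep the example short and runnable.

```python
import math
import random

# Hypothetical candidate mutation operators (stand-ins, not the paper's four).
def gaussian(x, rng):
    return [v + rng.gauss(0.0, 0.1) for v in x]

def wide_gaussian(x, rng):
    return [v + rng.gauss(0.0, 1.0) for v in x]

def cauchy(x, rng):
    # Cauchy-distributed step via the inverse-CDF transform.
    return [v + 0.1 * math.tan(math.pi * (rng.random() - 0.5)) for v in x]

OPERATORS = [gaussian, wide_gaussian, cauchy]

def sphere(x):
    """Assumed test objective to minimize."""
    return sum(v * v for v in x)

def select_operator(q_values, epsilon, rng):
    """Epsilon-greedy choice: usually the operator with the largest Q value."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda i: q_values[i])

def update_q(q_values, op, reward, alpha=0.1, gamma=0.9):
    """Single-state Q update: immediate reward plus discounted best future value."""
    target = reward + gamma * max(q_values)
    q_values[op] += alpha * (target - q_values[op])
    return q_values

def evolve(pop_size=10, dim=3, generations=200, epsilon=0.1, seed=0):
    """Each individual keeps its own Q table over the candidate operators."""
    rng = random.Random(seed)
    population = [[rng.uniform(-5.0, 5.0) for _ in range(dim)]
                  for _ in range(pop_size)]
    q_tables = [[0.0] * len(OPERATORS) for _ in range(pop_size)]
    initial_best = min(sphere(x) for x in population)
    for _ in range(generations):
        for i, parent in enumerate(population):
            op = select_operator(q_tables[i], epsilon, rng)
            child = OPERATORS[op](parent, rng)
            improvement = sphere(parent) - sphere(child)
            # Reward is the fitness improvement, clipped at zero.
            update_q(q_tables[i], op, max(0.0, improvement))
            if improvement > 0:
                population[i] = child  # simplified parent-vs-child replacement
    final_best = min(sphere(x) for x in population)
    return initial_best, final_best

if __name__ == "__main__":
    print(evolve())
```

Because a child replaces its parent only when it improves, the best fitness in this sketch can never get worse, which makes it easy to compare against runs that use a single fixed operator.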

13.
We introduce a general discrete-time dynamic framework to value pilot project investments that reduce idiosyncratic uncertainty with respect to the final cost of a project. The model generalizes different settings introduced previously in the literature by incorporating both market and technical uncertainty and differentiating between the commercial phase and the pilot phase of a project. In our model, the pilot phase requires N stages of investment for completion. With this distinction we are able to frame the problem as a compound perpetual Bermudan option. We work in an incomplete markets setting where market uncertainty is spanned by tradable assets and technical uncertainty is idiosyncratic to the firm. The value of the option to invest as well as the optimal exercise policy are solved by an approximate dynamic programming algorithm that relies on the independence of the state variables' increments. We prove the convergence of our algorithm and derive a theoretical bound on how the errors compound as the number of stages of the pilot phase is increased. We implement the algorithm for a simplified version of the model where revenues are fixed, providing an economic interpretation of the effects of the main parameters driving the model. In particular, we explore how the value of the investment opportunity and the optimal investment threshold are affected by changes in market volatility, technical volatility, the learning coefficient, the drift rate of costs and the time to completion of a pilot stage.

14.
This SCP special issue collects articles that make contributions to the foundations of aspect-oriented programming (AOP). Aspects have been developed over the last 10 years to facilitate the modularization of crosscutting concerns, i.e., concerns that crosscut the primary modularization of a program. This special issue continues the efforts of the annual FOAL workshop (Foundations of Aspect-Oriented Languages) insofar as it supports and integrates research on firm foundations of AOP. There are 5 contributions addressing the following issues: (i) a fundamental core language for aspects; (ii) subtleties of so-called around advice; (iii) aspects in higher-order languages; (iv) the interaction between aspects and generics; (v) a notion of aspects for reactive systems based on synchronous languages.

15.
16.
17.
18.
19.
20.