Approximate dynamic programming with a fuzzy parameterization
Authors:Lucian Buşoniu, Damien Ernst, Robert Babuška
Affiliation:a Delft Center for Systems & Control, Delft University of Technology, Mekelweg 2, 2628 CD Delft, The Netherlands
b Marine & Transport Technology, Delft University of Technology, The Netherlands
c FNRS; Institut Montefiore, Univ. Liège, Sart-Tilman, Bldg. B28, Parking P32, B-4000 Liège, Belgium
Abstract:Dynamic programming (DP) is a powerful paradigm for general, nonlinear optimal control. Computing exact DP solutions is in general only possible when the process states and the control actions take values in a small discrete set. In practice, it is necessary to approximate the solutions. Therefore, we propose an algorithm for approximate DP that relies on a fuzzy partition of the state space, and on a discretization of the action space. This fuzzy Q-iteration algorithm works for deterministic processes, under the discounted return criterion. We prove that fuzzy Q-iteration asymptotically converges to a solution that lies within a bound of the optimal solution. A bound on the suboptimality of the solution obtained in a finite number of iterations is also derived. Under continuity assumptions on the dynamics and on the reward function, we show that fuzzy Q-iteration is consistent, i.e., that it asymptotically obtains the optimal solution as the approximation accuracy increases. These properties hold both when the parameters of the approximator are updated in a synchronous fashion, and when they are updated asynchronously. The asynchronous algorithm is proven to converge at least as fast as the synchronous one. The performance of fuzzy Q-iteration is illustrated in a two-link manipulator control problem.
Keywords:Approximate dynamic programming; Fuzzy approximation; Value iteration; Convergence analysis
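
The abstract describes fuzzy Q-iteration only in words, so a minimal sketch of the synchronous variant may help fix ideas. It assumes a one-dimensional state space, a triangular fuzzy partition with one parameter per (fuzzy set, discrete action) pair, and a toy deterministic problem; the function names, the toy dynamics `f`, the reward `rho`, and the grid are illustrative assumptions, not taken from the paper.

```python
from bisect import bisect_right


def triangular_memberships(x, cores):
    """Normalized triangular membership degrees of x for sorted core points."""
    phi = [0.0] * len(cores)
    if x <= cores[0]:
        phi[0] = 1.0
    elif x >= cores[-1]:
        phi[-1] = 1.0
    else:
        k = bisect_right(cores, x) - 1  # cores[k] <= x < cores[k + 1]
        w = (x - cores[k]) / (cores[k + 1] - cores[k])
        phi[k], phi[k + 1] = 1.0 - w, w
    return phi


def fuzzy_q_iteration(f, rho, cores, actions, gamma, tol=1e-8, max_iter=10_000):
    """Synchronous sketch: theta[i][j] approximates Q(cores[i], actions[j])."""
    n, m = len(cores), len(actions)
    # The process is deterministic, so rewards and next-state memberships
    # at the cores can be computed once up front.
    rewards = [[rho(x, u) for u in actions] for x in cores]
    next_phi = [[triangular_memberships(f(x, u), cores) for u in actions]
                for x in cores]
    theta = [[0.0] * m for _ in range(n)]
    for _ in range(max_iter):
        new_theta = [[0.0] * m for _ in range(n)]
        delta = 0.0
        for i in range(n):
            for j in range(m):
                phi = next_phi[i][j]
                # Approximate Q at the next state is a membership-weighted
                # sum of parameters; maximize it over the discrete actions.
                best = max(sum(phi[k] * theta[k][jp] for k in range(n))
                           for jp in range(m))
                new_theta[i][j] = rewards[i][j] + gamma * best
                delta = max(delta, abs(new_theta[i][j] - theta[i][j]))
        theta = new_theta
        if delta < tol:
            break
    return theta


def greedy_action(x, theta, cores, actions):
    """Discrete action maximizing the approximate Q-function at state x."""
    phi = triangular_memberships(x, cores)
    q = [sum(phi[k] * theta[k][j] for k in range(len(cores)))
         for j in range(len(actions))]
    return actions[max(range(len(actions)), key=q.__getitem__)]


# Hypothetical toy problem: drive a scalar state in [-1, 1] to the origin.
cores = [-1.0, -0.5, 0.0, 0.5, 1.0]
actions = [-0.5, 0.0, 0.5]
f = lambda x, u: max(-1.0, min(1.0, x + u))   # deterministic dynamics
rho = lambda x, u: -x * x                     # reward penalizes distance
theta = fuzzy_q_iteration(f, rho, cores, actions, gamma=0.9)
```

In this toy setting, calling `greedy_action(1.0, theta, cores, actions)` selects the action that moves the state toward the origin. An asynchronous variant would update `theta` in place instead of building `new_theta`.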
This article is indexed in ScienceDirect and other databases.