Approximate policy iteration: a survey and some new methods |
| |
Authors: | Dimitri P BERTSEKAS |
| |
Affiliation: | Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA 02139, U.S.A. |
| |
Abstract: | We consider the classical policy iteration method of dynamic programming (DP), where approximations and simulation are used to deal with the curse of dimensionality. We survey a number of issues: convergence and rate of convergence of approximate policy evaluation methods, singularity and susceptibility to simulation noise of policy evaluation, exploration issues, constrained and enhanced policy iteration, policy oscillation and chattering, and optimistic and distributed policy iteration. Our discussion of policy eva... |
| |
Keywords: | Dynamic programming; Policy iteration; Projected equation; Aggregation; Chattering; Regularization |
This article is indexed in CNKI, VIP, Wanfang Data, SpringerLink, and other databases. |
| The original abstract is available from Control Theory and Applications (English Edition). |
|