Kernel-Based Reinforcement Learning
Authors: Dirk Ormoneit, Śaunak Sen
Affiliation: (1) Department of Computer Science, Stanford University, Stanford, CA 94305-9010, USA; (2) The Jackson Laboratory, Bar Harbor, ME 04609, USA
Abstract:
We present a kernel-based approach to reinforcement learning that overcomes the stability problems of temporal-difference learning in continuous state-spaces. First, our algorithm converges to a unique solution of an approximate Bellman's equation regardless of its initialization values. Second, the method is consistent in the sense that the resulting policy converges asymptotically to the optimal policy. Parametric value function estimates such as neural networks do not possess this property. Our kernel-based approach also allows us to show that the limiting distribution of the value function estimate is a Gaussian process. This information is useful in studying the bias-variance tradeoff in reinforcement learning. We find that all reinforcement learning approaches to estimating the value function, parametric or non-parametric, are subject to a bias. This bias is typically larger in reinforcement learning than in a comparable regression problem.
Keywords: reinforcement learning; Markov decision process; kernel-based learning; kernel smoothing; local averaging; lazy learning
This article is indexed in SpringerLink and other databases.
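
The abstract describes approximate value iteration in which the Bellman backup is built from kernel-smoothed local averages over sampled transitions. Below is a minimal sketch of that idea, assuming a one-dimensional state space, a Gaussian kernel, and toy transition data; the function names, bandwidth, discount factor, and example problem are illustrative choices, not the authors' implementation.

```python
# Minimal sketch of kernel-based approximate value iteration (local averaging
# over sampled transitions). All names and parameters here are assumptions for
# illustration, not the authors' code.
import numpy as np

def gaussian_kernel(dists, bandwidth):
    """Unnormalized Gaussian weights; normalization happens per query point."""
    return np.exp(-0.5 * (dists / bandwidth) ** 2)

def kernel_avi(transitions, bandwidth=0.2, gamma=0.95, n_iters=200, tol=1e-8):
    """Approximate value iteration with kernel smoothing.

    transitions: dict action -> (states, rewards, next_states), each a 1-D
                 array of sampled transitions for that action.
    Returns a function V(s) giving the estimated optimal value at state s.
    """
    # The value function only needs to be tracked at the sampled successor
    # states for the fixed-point iteration.
    support = np.unique(np.concatenate([ns for (_, _, ns) in transitions.values()]))
    V = np.zeros_like(support)

    def backup(values):
        # One application of the approximate Bellman operator at each support point.
        q_per_action = []
        for s, r, ns in transitions.values():
            # Normalized kernel weights from each support point to the sampled states.
            w = gaussian_kernel(np.abs(support[:, None] - s[None, :]), bandwidth)
            w /= w.sum(axis=1, keepdims=True)
            # Current value at the sampled successor states (interpolated on the grid).
            v_next = np.interp(ns, support, values)
            q_per_action.append(w @ (r + gamma * v_next))
        return np.max(np.stack(q_per_action), axis=0)

    for _ in range(n_iters):
        V_new = backup(V)
        if np.max(np.abs(V_new - V)) < tol:
            break
        V = V_new

    def value_fn(s):
        return np.interp(np.asarray(s, dtype=float), support, V)
    return value_fn

if __name__ == "__main__":
    rng = np.random.default_rng(0)

    # Toy 1-D problem: "right" drifts the state up, "left" drifts it down;
    # reward is higher near state 1.0.
    def sample(action, n=300):
        s = rng.uniform(0.0, 1.0, n)
        drift = 0.1 if action == "right" else -0.1
        ns = np.clip(s + drift + 0.05 * rng.standard_normal(n), 0.0, 1.0)
        r = -np.abs(ns - 1.0)
        return s, r, ns

    data = {a: sample(a) for a in ("left", "right")}
    V = kernel_avi(data)
    print(V([0.0, 0.5, 1.0]))  # estimated values should increase toward state 1.0
```

Because the normalized kernel weights are nonnegative and sum to one, the backup in this sketch is a contraction, which is the mechanism behind the abstract's claim that the iteration converges to a unique fixed point regardless of initialization.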