Structure in the Space of Value Functions
Authors: David Foster, Peter Dayan
Affiliation: (1) Centre for Neuroscience, University of Edinburgh, Edinburgh, UK; (2) Gatsby Computational Neuroscience Unit, London, UK
Abstract: Solving many different optimal control tasks efficiently within the same underlying environment requires decomposing the environment into its computationally elemental fragments. We suggest how to find fragmentations by applying unsupervised mixture-model learning methods to data derived from optimal value functions for multiple tasks, and show that these fragmentations accord with observable structure in the environments. Further, we present evidence that such fragments can be of use in a practical reinforcement learning context, by facilitating online actor-critic learning of multiple-goal MDPs.
Keywords: dynamic programming, value functions, reinforcement learning, unsupervised learning, density estimation, mixture models
This article is indexed in SpringerLink and other databases.