Structure in the Space of Value Functions

Authors:	David Foster; Peter Dayan

Affiliation:	(1) Centre for Neuroscience, University of Edinburgh, Edinburgh, UK; (2) Gatsby Computational Neuroscience Unit, London, UK

Abstract:	Efficiently solving many different optimal control tasks within the same underlying environment requires decomposing the environment into its computationally elemental fragments. We suggest how to find such fragmentations by applying unsupervised mixture-model learning methods to data derived from optimal value functions for multiple tasks, and show that these fragmentations accord with observable structure in the environments. Further, we present evidence that such fragments can be useful in a practical reinforcement learning context by facilitating online actor-critic learning of multiple-goal MDPs.
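
The idea in the abstract can be illustrated with a toy sketch. It is not the authors' procedure: the two-room grid, the deterministic moves, the cost of -1 per step, the shared spherical-variance Gaussian mixture, and the corner initialisation are all assumptions made here for illustration, and the function names (two_room_grid, optimal_values, fit_mixture) are hypothetical. Each state is described by its optimal value under every goal, and a two-component mixture is fitted to those vectors.

```python
import numpy as np

def two_room_grid(rows=5, cols=11, door_row=2):
    """Free cells of a grid split into two rooms by a wall with a single door."""
    wall_col = cols // 2
    return [(r, c) for r in range(rows) for c in range(cols)
            if c != wall_col or r == door_row]

def optimal_values(cells, goal):
    """Value iteration for one goal: reward -1 per step, goal absorbing with value 0."""
    index = {cell: i for i, cell in enumerate(cells)}
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    # nxt[i, a] is the state reached from i by action a (stay put if blocked).
    nxt = np.array([[index.get((r + dr, c + dc), i) for dr, dc in moves]
                    for i, (r, c) in enumerate(cells)])
    V = np.zeros(len(cells))
    g = index[goal]
    while True:
        V_new = (-1.0 + V[nxt]).max(axis=1)  # Bellman optimality backup
        V_new[g] = 0.0
        if np.array_equal(V_new, V):
            return V
        V = V_new

def fit_mixture(X, init_idx, n_iter=50):
    """EM for a Gaussian mixture with one shared spherical variance (an assumption)."""
    n, d = X.shape
    means = X[list(init_idx)].astype(float)
    weights = np.full(len(means), 1.0 / len(means))
    var = X.var()
    for _ in range(n_iter):
        # E-step: responsibilities, computed in log space for numerical stability.
        sq = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        log_r = np.log(weights) - 0.5 * sq / var
        log_r -= log_r.max(axis=1, keepdims=True)
        resp = np.exp(log_r)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update mixing weights, means, and the shared variance.
        nk = resp.sum(axis=0)
        weights = nk / n
        means = (resp.T @ X) / nk[:, None]
        sq = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        var = max((resp * sq).sum() / (n * d), 1e-6)
    return resp

cells = two_room_grid()
# X[s, g] is the optimal value of state s when the goal is cell g.
X = np.stack([optimal_values(cells, g) for g in cells], axis=1)
init = (cells.index((0, 0)), cells.index((0, 10)))  # one corner per room
resp = fit_mixture(X, init)
labels = resp.argmax(axis=1)
```

In this toy setting the two mixture components line up with the two rooms, which is the kind of agreement with observable structure that the abstract describes. The door cell lies between the components and may be assigned to either one.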

Keywords:	dynamic programming; value functions; reinforcement learning; unsupervised learning; density estimation; mixture models
This article is indexed in SpringerLink and other databases.