The Effect of Representation and Knowledge on Goal-Directed Exploration with Reinforcement-Learning Algorithms
Authors: Sven Koenig, Reid G. Simmons
Affiliation: School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213-3890, USA
Abstract: We analyze the complexity of on-line reinforcement-learning algorithms applied to goal-directed exploration tasks. Previous work had concluded that, even in deterministic state spaces, initially uninformed reinforcement learning was at least exponential for such problems, or that it was of polynomial worst-case time-complexity only if the learning methods were augmented. We prove that, to the contrary, the algorithms are tractable with only a simple change in the reward structure ("penalizing the agent for action executions") or in the initialization of the values that they maintain. In particular, we provide tight complexity bounds for both Watkins' Q-learning and Heger's Q-hat-learning and show how their complexity depends on properties of the state spaces. We also demonstrate how one can decrease the complexity even further by either learning action models or utilizing prior knowledge of the topology of the state spaces. Our results provide guidance for empirical reinforcement-learning researchers on how to distinguish hard reinforcement-learning problems from easy ones and how to represent them in a way that allows them to be solved efficiently.

Acknowledgment: This research was supported in part by NASA under contract NAGW-1175. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of NASA or the U.S. government.
Keywords: action models; admissible and consistent heuristics; action-penalty representation; complexity; goal-directed exploration; goal-reward representation; on-line reinforcement learning; prior knowledge; reward structure; Q-hat-learning; Q-learning
This article is indexed in SpringerLink and other databases.
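
The contrast between the two reward structures named in the abstract can be illustrated with a small sketch. It is not taken from the paper: the "reset" corridor domain, the function names `step` and `explore`, the learning rate of 1, the discount factor of 0.9 for the goal-reward case, and the random tie-breaking are all assumptions chosen for illustration. In this domain action 1 moves one state closer to the goal and action 0 returns the agent to the start, so an agent whose zero-initialized values carry no information must guess a long run of correct actions, while penalizing every action execution lets the values steer the agent away from choices it has already tried.

```python
import random


def step(state, action):
    """Hypothetical deterministic 'reset' domain: action 1 advances, action 0 resets."""
    return state + 1 if action == 1 else 0


def explore(n_states, representation, seed=0, max_steps=100_000):
    """Run greedy, zero-initialized Q-learning until the goal is reached.

    Returns the number of action executions (capped at max_steps).
    """
    rng = random.Random(seed)
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]  # zero-initialized Q-values
    state, steps = 0, 0
    while state != goal and steps < max_steps:
        # Greedy action selection with random tie-breaking (an assumption).
        best = max(q[state])
        action = rng.choice([a for a in (0, 1) if q[state][a] == best])
        nxt = step(state, action)
        if representation == "action-penalty":
            # Every action execution costs 1; no discounting.
            reward, gamma = -1.0, 1.0
        else:
            # Goal-reward: reward only on entering the goal, with discounting.
            reward, gamma = (1.0 if nxt == goal else 0.0), 0.9
        future = 0.0 if nxt == goal else max(q[nxt])
        # Learning rate 1 is sufficient because the domain is deterministic.
        q[state][action] = reward + gamma * future
        state = nxt
        steps += 1
    return steps
```

Calling `explore(10, "action-penalty")` and `explore(10, "goal-reward")` with several seeds gives a rough feel for the gap. Under the goal-reward representation the values stay at zero until the goal is first entered, so in this sketch the agent behaves like a random walk until then.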