Continuous-Action Q-Learning
Authors: José del R. Millán, Daniele Posenato, Eric Dedieu
Affiliation:(1) Joint Research Centre, European Commission, 21020 Ispra (VA), Italy
Abstract: This paper presents a Q-learning method that works in continuous domains. Other characteristics of our approach are the use of an incremental topology preserving map (ITPM) to partition the input space, and the incorporation of bias to initialize the learning process. A unit of the ITPM represents a limited region of the input space and maps it onto the Q-values of M possible discrete actions. The resulting continuous action is an average of the discrete actions of the "winning unit" weighted by their Q-values. Then, TD(λ) updates the Q-values of the discrete actions according to their contribution. Units are created incrementally and their associated Q-values are initialized by means of domain knowledge. Experimental results in robotics domains show the superiority of the proposed continuous-action Q-learning over the standard discrete-action version in terms of both asymptotic performance and speed of learning. The paper also reports a comparison of discounted-reward against average-reward Q-learning in an infinite horizon robotics task.
Keywords: reinforcement learning; incremental topology preserving maps; continuous domains; real-time operation
This article is indexed in SpringerLink and other databases.
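
The action-selection step described in the abstract (find the winning unit, then average its discrete actions weighted by their Q-values) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the nearest-prototype rule for choosing the winning unit, and the shift that makes the weights non-negative are all assumptions made here for illustration, since the abstract does not specify them.

```python
import numpy as np


def winning_unit(prototypes, x):
    """Return the index of the unit whose prototype is closest to input x.

    Assumption: the winner is chosen by Euclidean distance.
    """
    distances = np.linalg.norm(prototypes - x, axis=1)
    return int(np.argmin(distances))


def continuous_action(q_row, actions):
    """Average the discrete actions, weighted by their Q-values.

    Assumption: Q-values are shifted so the smallest one gets weight 0,
    which keeps the weights non-negative. If all Q-values are equal,
    the plain mean of the actions is returned.
    """
    weights = q_row - q_row.min()
    total = weights.sum()
    if total == 0.0:
        return float(actions.mean())
    return float((weights / total) @ actions)


# Hypothetical example: two units in a 2-D input space, M = 3 discrete actions.
prototypes = np.array([[0.0, 0.0], [1.0, 1.0]])
q_values = np.array([[0.0, 0.0, 0.0], [0.0, 1.0, 3.0]])
actions = np.array([-1.0, 0.0, 1.0])

unit = winning_unit(prototypes, np.array([0.9, 0.8]))
action = continuous_action(q_values[unit], actions)
```

In this hypothetical setup the second unit wins, and the resulting action lies between the discrete actions 0 and 1, closer to the one with the higher Q-value. The TD(λ) update and the incremental creation of units are left out of the sketch.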