Continuous-Action Q-Learning |
| |
Authors: | Millán, José del R.; Posenato, Daniele; Dedieu, Eric |
| |
Affiliation: | (1) Joint Research Centre, European Commission, 21020 Ispra (VA), Italy |
| |
Abstract: | This paper presents a Q-learning method that works in continuous domains. Other characteristics of our approach are the use of an incremental topology preserving map (ITPM) to partition the input space, and the incorporation of bias to initialize the learning process. A unit of the ITPM represents a limited region of the input space and maps it onto the Q-values of M possible discrete actions. The resulting continuous action is an average of the discrete actions of the winning unit weighted by their Q-values. Then, TD(λ) updates the Q-values of the discrete actions according to their contribution. Units are created incrementally and their associated Q-values are initialized by means of domain knowledge. Experimental results in robotics domains show the superiority of the proposed continuous-action Q-learning over the standard discrete-action version in terms of both asymptotic performance and speed of learning. The paper also reports a comparison of discounted-reward against average-reward Q-learning in an infinite horizon robotics task. |
| |
Keywords: | reinforcement learning; incremental topology preserving maps; continuous domains; real-time operation |
This article is indexed in SpringerLink and other databases. |
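
The action-selection and update steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `continuous_action` and `td_update`, the shifting of Q-values to obtain non-negative weights, and the use of a one-step TD update in place of full TD(λ) with eligibility traces are all assumptions made for illustration. The abstract only states that the continuous action is a Q-weighted average of the winning unit's discrete actions and that each Q-value is updated according to its contribution.

```python
import numpy as np

def continuous_action(actions, q_values, eps=1e-8):
    """Blend the discrete actions of one unit into a single continuous action.

    Assumption: Q-values are shifted so that all weights are non-negative;
    the abstract does not say how negative Q-values are handled.
    """
    actions = np.asarray(actions, dtype=float)
    q = np.asarray(q_values, dtype=float)
    weights = q - q.min() + eps      # non-negative, never all zero
    weights = weights / weights.sum()  # normalise to sum to 1
    return float(weights @ actions), weights

def td_update(q_values, weights, reward, next_value, alpha=0.1, gamma=0.9):
    """One-step TD update (a simplification of TD(lambda), no traces).

    Each discrete action's Q-value moves in proportion to the weight it
    contributed to the executed continuous action.
    """
    q = np.asarray(q_values, dtype=float)
    current = float(weights @ q)     # value estimate of the blended action
    td_error = reward + gamma * next_value - current
    return q + alpha * td_error * weights

# Hypothetical unit with M = 3 discrete steering actions.
action, w = continuous_action([-1.0, 0.0, 1.0], [0.0, 0.0, 1.0])
new_q = td_update([0.0, 0.0, 1.0], w, reward=2.0, next_value=0.0, alpha=0.5)
```

In this sketch the action with the highest Q-value dominates the blend, and it also receives nearly all of the update, which mirrors the "according to their contribution" idea in the abstract.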