A reward allocation method for reinforcement learning in stabilizing control tasks
Authors: Shu Hosokawa, Joji Kato, Kazushi Nakano
Affiliation: Department of Electronic Engineering, The University of Electro-Communications, 1-5-1 Chofu-ga-oka, Chofu, Tokyo 182-8585, Japan
Abstract: Reinforcement learning is an area of machine learning that does not require detailed teaching signals from a human, and it is expected to be applied to real robots. In such applications, the learning process must finish within a short period of time. Model-free reinforcement learning methods converge quickly on tasks such as Sutton's maze problem, where the goal is to reach a target state in minimum time. However, these methods have difficulty learning tasks whose goal is to maintain a stable state for as long as possible. In this study, we improve the reward allocation method for stabilizing control tasks, modeling the environment as a semi-Markov decision process. The validity of our method is demonstrated through simulations of the stabilizing control of an inverted pendulum.
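To make the setting concrete, the following is a minimal sketch of model-free learning on a stabilizing control task: Q-learning on a discretized inverted pendulum, where the agent earns reward for every step it keeps the pendulum upright. This is an illustrative baseline only; the plant model, discretization, and reward values are assumptions, and the reward scheme shown is a simple per-step allocation, not the SMDP-based allocation method proposed in the paper.

```python
import numpy as np

def simulate_step(theta, omega, torque, dt=0.02, g=9.8, l=1.0):
    """One Euler step of a simple pendulum (hypothetical plant model)."""
    alpha = (g / l) * np.sin(theta) + torque
    omega = omega + alpha * dt
    theta = theta + omega * dt
    return theta, omega

def discretize(theta, omega, n_bins=9):
    """Map the continuous state onto a pair of discrete bin indices."""
    t = int(np.clip((theta + 0.5) * n_bins, 0, n_bins - 1))
    w = int(np.clip((omega + 2.0) / 4.0 * n_bins, 0, n_bins - 1))
    return t, w

def train(episodes=200, alpha=0.3, gamma=0.99, eps=0.1, seed=0):
    """Tabular Q-learning for pendulum stabilization (illustrative only)."""
    rng = np.random.default_rng(seed)
    torques = np.array([-2.0, 0.0, 2.0])   # discrete action set (assumed)
    Q = np.zeros((9, 9, len(torques)))
    for _ in range(episodes):
        theta, omega = rng.uniform(-0.05, 0.05, size=2)
        s = discretize(theta, omega)
        for _ in range(500):                # cap on episode length
            a = (int(rng.integers(len(torques))) if rng.random() < eps
                 else int(np.argmax(Q[s])))
            theta, omega = simulate_step(theta, omega, torques[a])
            done = abs(theta) > 0.5         # pendulum has fallen over
            # Per-step reward allocation for stabilization: +1 for
            # surviving, a large penalty at failure (an assumed scheme,
            # not the paper's method).
            r = -100.0 if done else 1.0
            s2 = discretize(theta, omega)
            target = r if done else r + gamma * np.max(Q[s2])
            Q[s + (a,)] += alpha * (target - Q[s + (a,)])
            s = s2
            if done:
                break
    return Q
```

A per-step reward like this is exactly the kind of dense signal that maze-style tasks lack, which is why stabilization requires a different allocation strategy from minimum-time goal-reaching tasks.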
This article is indexed in SpringerLink and other databases.