A reward allocation method for reinforcement learning in stabilizing control tasks
Authors: Shu Hosokawa, Joji Kato, Kazushi Nakano
Affiliation: Department of Electronic Engineering, The University of Electro-Communications, 1-5-1 Chofu-ga-oka, Chofu, Tokyo 182-8585, Japan
Abstract: Reinforcement learning is an area of machine learning that does not require detailed teaching signals from a human, and it is expected to be applied to real robots. In such applications, the learning process must finish within a short period of time. Model-free reinforcement learning methods converge quickly on tasks such as Sutton's maze problem, where the goal is to reach a target state in minimum time. However, these methods have difficulty learning tasks whose goal is to maintain a stable state for as long as possible. In this study, we improve the reward allocation method for stabilizing control tasks, modeling the environment as a semi-Markov decision process. The validity of our method is demonstrated through simulations of the stabilizing control of an inverted pendulum.
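To make the setting concrete, the following is a minimal sketch of model-free learning on a stabilizing control task: Q-learning on a discretized inverted pendulum, where the agent earns reward for every step it keeps the pendulum upright. This is an illustrative baseline only; the plant model, discretization, and reward values are assumptions, and the reward scheme shown is a simple per-step allocation, not the SMDP-based allocation method proposed in the paper.

```python
import numpy as np

def simulate_step(theta, omega, torque, dt=0.02, g=9.8, l=1.0):
    """One Euler step of a simple pendulum (hypothetical plant model)."""
    alpha = (g / l) * np.sin(theta) + torque
    omega = omega + alpha * dt
    theta = theta + omega * dt
    return theta, omega

def discretize(theta, omega, n_bins=9):
    """Map the continuous state onto a pair of discrete bin indices."""
    t = int(np.clip((theta + 0.5) * n_bins, 0, n_bins - 1))
    w = int(np.clip((omega + 2.0) / 4.0 * n_bins, 0, n_bins - 1))
    return t, w

def train(episodes=200, alpha=0.3, gamma=0.99, eps=0.1, seed=0):
    """Tabular Q-learning for pendulum stabilization (illustrative only)."""
    rng = np.random.default_rng(seed)
    torques = np.array([-2.0, 0.0, 2.0])   # discrete action set (assumed)
    Q = np.zeros((9, 9, len(torques)))
    for _ in range(episodes):
        theta, omega = rng.uniform(-0.05, 0.05, size=2)
        s = discretize(theta, omega)
        for _ in range(500):                # cap on episode length
            a = (int(rng.integers(len(torques))) if rng.random() < eps
                 else int(np.argmax(Q[s])))
            theta, omega = simulate_step(theta, omega, torques[a])
            done = abs(theta) > 0.5         # pendulum has fallen over
            # Per-step reward allocation for stabilization: +1 for
            # surviving, a large penalty at failure (an assumed scheme,
            # not the paper's method).
            r = -100.0 if done else 1.0
            s2 = discretize(theta, omega)
            target = r if done else r + gamma * np.max(Q[s2])
            Q[s + (a,)] += alpha * (target - Q[s + (a,)])
            s = s2
            if done:
                break
    return Q
```

A per-step reward like this is exactly the kind of dense signal that maze-style tasks lack, which is why stabilization requires a different allocation strategy from minimum-time goal-reaching tasks.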
This article is indexed in SpringerLink and other databases.