SSPQL: Stochastic shortest path-based Q-learning

Authors: Woo Young Kwon, Il Hong Suh, Sanghoon Lee

Affiliation: Department of Electronics & Computer Engineering, Hanyang University, Seoul, Korea

Abstract: Reinforcement learning (RL) has been widely used as a mechanism for autonomous robots to learn state-action pairs by interacting with their environment. However, most RL methods suffer from slow convergence when deriving an optimal policy in practical applications. To address this problem, stochastic shortest path-based Q-learning (SSPQL) is proposed, combining a stochastic shortest path-finding method with Q-learning, a well-known model-free RL method. The rationale is that if a robot has an internal state-transition model that is learned incrementally, it can infer a locally optimal policy using a stochastic shortest path-finding method. By increasing the values of the state-action pairs that make up these locally optimal policies, the robot can reach a goal quickly, and this in turn speeds up convergence. Several experimental results are presented to demonstrate the validity of the proposed learning approach.
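
Since the abstract only outlines the mechanism, the following Python sketch illustrates the idea under stated assumptions: the class name SSPQL, the boost parameter, and the breadth-first search over the most frequently observed successor states are all illustrative choices of this sketch, not the paper's actual stochastic shortest path-finding method or notation.

```python
from collections import defaultdict, deque

class SSPQL:
    """Sketch of the SSPQL idea from the abstract. All names here are
    illustrative; BFS over most-likely successors stands in for the
    paper's stochastic shortest path-finding method."""

    def __init__(self, n_actions, alpha=0.1, gamma=0.95, boost=1.0):
        self.n_actions = n_actions
        self.alpha, self.gamma, self.boost = alpha, gamma, boost
        self.Q = defaultdict(float)                         # Q[(s, a)]
        self.model = defaultdict(lambda: defaultdict(int))  # model[(s, a)][s'] = count

    def q_update(self, s, a, r, s_next):
        # Standard model-free Q-learning step ...
        best_next = max(self.Q[(s_next, b)] for b in range(self.n_actions))
        self.Q[(s, a)] += self.alpha * (r + self.gamma * best_next - self.Q[(s, a)])
        # ... plus incremental learning of the internal state-transition model.
        self.model[(s, a)][s_next] += 1

    def boost_shortest_path(self, start, goal):
        # Search the learned model for a start-to-goal path, following the
        # most frequently observed successor of each state-action pair.
        parent = {start: None}
        frontier = deque([start])
        while frontier and goal not in parent:
            s = frontier.popleft()
            for a in range(self.n_actions):
                succ = self.model[(s, a)]
                s_next = max(succ, key=succ.get) if succ else None
                if s_next is not None and s_next not in parent:
                    parent[s_next] = (s, a)
                    frontier.append(s_next)
        # Raise the value of every state-action pair on the found path so the
        # greedy policy is pulled toward this locally optimal route.
        node = goal
        while node in parent and parent[node] is not None:
            s, a = parent[node]
            self.Q[(s, a)] += self.boost
            node = s
```

In use, one would call q_update on every environment step and boost_shortest_path periodically, for example after each episode that reaches the goal; the boost magnitude and the path-search strategy are tuning choices that the paper resolves differently.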