Asynchronous Stochastic Approximation and Q-Learning |
| |
Authors: | Tsitsiklis, John N. |
| |
Affiliation: | Laboratory for Information and Decision Systems, Massachusetts Institute of Technology, Cambridge, MA 02139 |
| |
Abstract: | We provide some general results on the convergence of a class of stochastic approximation algorithms and their parallel and asynchronous variants. We then use these results to study the Q-learning algorithm, a reinforcement learning method for solving Markov decision problems, and establish its convergence under conditions more general than previously available. |
| |
Keywords: | Reinforcement learning; Q-learning; dynamic programming; stochastic approximation |
This article is indexed in SpringerLink and other databases. |