The Relations Among Potentials, Perturbation Analysis, and Markov Decision Processes |
| |
Authors: | Xi-Ren Cao |
| |
Affiliation: | (1) The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong |
| |
Abstract: | This paper provides an introductory discussion for an important concept, the performance potentials of Markov processes, and its relations with perturbation analysis (PA), average-cost Markov decision processes (MDP), Poisson equations, α-potentials, the fundamental matrix, and the group inverse of the transition matrix (or the infinitesimal generators). Applications to single sample path-based performance sensitivity estimation and performance optimization are also discussed. On-line algorithms for performance sensitivity estimates and on-line schemes for policy iteration methods are presented. The approach is closely related to reinforcement learning algorithms. |
| |
Keywords: | Policy iterations; Poisson equations; α-potentials; group inverse; fundamental matrices; on-line optimization; reinforcement learning |
This article is indexed in SpringerLink and other databases. |
|