Markov decision processes with fractional costs |
| |
Authors: | Zhiyuan Ren; B. H. Krogh |
| |
Affiliation: | Signal Electron. & Embedded Syst. Lab., Gen. Electr. Global Res. Center, Niskayuna, NY, USA |
| |
Abstract: | Certain methods for constructing embedded Markov decision processes (MDPs) lead to performance measures that are the ratio of two long-run averages. For such MDPs with finite state and action spaces, and under an ergodicity assumption, this note presents algorithms for computing optimal policies based on policy iteration, linear programming, value iteration, and Q-learning. |