A successive approximation algorithm for an undiscounted Markov decision process
Authors: Ir J van der Wal
Affiliation: Department of Mathematics, Technological University Eindhoven, Insulindelaan 2, Eindhoven, The Netherlands
Abstract: This paper considers a completely ergodic Markov decision process with finite state and decision spaces under the criterion of average return per unit time. An algorithm is derived that approximates the optimal solution. It is shown that the algorithm terminates after finitely many steps and supplies upper and lower bounds for the maximal average return, together with a nearly optimal policy whose average return lies between these bounds.
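
The abstract does not spell out the algorithm itself, so the following is only a minimal sketch of a standard technique in the same spirit: relative value iteration with the well-known minimum/maximum-difference bounds on the average return. It is not claimed to be the author's algorithm. The function name `relative_value_iteration`, the array layout (`P` of shape actions × states × states, `r` of shape states × actions), the tolerance `eps`, and the demo data are all assumptions made for illustration, and the sketch additionally assumes the chains are aperiodic so that the bounds close.

```python
import numpy as np

def relative_value_iteration(P, r, eps=1e-6, max_iter=10_000):
    """Successive approximation for an average-return MDP (illustrative sketch).

    P[a, s, j] is the probability of moving from state s to state j under
    action a; r[s, a] is the one-step return. Returns (lower, upper, policy),
    where lower <= gain(policy) <= maximal gain <= upper.
    """
    n_states = r.shape[0]
    v = np.zeros(n_states)
    lower, upper = -np.inf, np.inf
    policy = np.zeros(n_states, dtype=int)
    for _ in range(max_iter):
        # q[s, a] = r[s, a] + sum_j P[a, s, j] * v[j]
        q = r + np.einsum("asj,j->sa", P, v)
        v_new = q.max(axis=1)
        policy = q.argmax(axis=1)      # greedy policy for the current v
        diff = v_new - v
        # The smallest and largest one-step gains bracket the maximal
        # average return and the average return of the greedy policy.
        lower, upper = diff.min(), diff.max()
        if upper - lower < eps:
            break
        # Subtract a reference value so the iterates stay bounded.
        v = v_new - v_new[0]
    return lower, upper, policy

# Hypothetical two-state, two-action example (not taken from the paper).
P_demo = np.full((2, 2, 2), 0.5)            # every action moves uniformly
r_demo = np.array([[1.0, 2.0], [3.0, 0.0]])
lower, upper, policy = relative_value_iteration(P_demo, r_demo)
print(lower, upper, policy)
```

The stopping rule mirrors the abstract's claim: once the gap between the two bounds falls below `eps`, the greedy policy is nearly optimal, since its average return lies between the same bounds.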
This article is indexed in SpringerLink and other databases.