首页 | 本学科首页   官方微博 | 高级检索  
     


Basic Ideas for Event-Based Optimization of Markov Systems
Authors:Xi-Ren?Cao  author-information"  >  author-information__contact u-icon-before"  >  mailto:eecao@ee.ust.hk"   title="  eecao@ee.ust.hk"   itemprop="  email"   data-track="  click"   data-track-action="  Email author"   data-track-label="  "  >Email author
Affiliation:(1) Department of Electrical and Electronic Engineering, Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
Abstract:The goal of this paper is two-fold: First, we present a sensitivity point of view on the optimization of Markov systems. We show that Markov decision processes (MDPs) and the policy-gradient approach, or perturbation analysis (PA), can be derived easily from two fundamental sensitivity formulas, and such formulas can be flexibly constructed, by first principles, with performance potentials as building blocks. Second, with this sensitivity view we propose an event-based optimization approach, including the event-based sensitivity analysis and event-based policy iteration. This approach utilizes the special feature of a system characterized by events and illustrates how the potentials can be aggregated using the special feature and how the aggregated potential can be used in policy iteration. Compared with the traditional MDP approach, the event-based approach has its advantages: the number of aggregated potentials may scale to the system size despite that the number of states grows exponentially in the system size, this reduces the policy space and saves computation; the approach does not require actions at different states to be independent; and it utilizes the special feature of a system and does not need to know the exact transition probability matrix. The main ideas of the approach are illustrated by an admission control problem.Supported in part by a grant from Hong Kong UGC.
Keywords:perturbation analysis  Markov decision processes (MDPs)  POMDPs  performance potentials  policy iteration  policy gradients  aggregation
本文献已被 SpringerLink 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号