Performance sensitivities for parameterized Markov systems
Authors: Xiren Cao, Junyu Zhang
Affiliation: Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
Abstract: It is known that performance potentials (or, equivalently, perturbation realization factors) can be used as building blocks for performance sensitivities of Markov systems. In parameterized systems, changes in parameters may affect only some states, and the explicit transition probability matrix may not be known. In this paper, we use an example to show that potentials can be used to construct performance sensitivities in a more flexible way: only the potentials at the affected states need to be estimated, and the transition probability matrix need not be known. Policy iteration algorithms that are simpler than the standard one can be established.
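As a rough illustration of the building-block idea, the sketch below applies the standard derivative formula dη/dδ = π(P' − P)g, where η = πf is the average reward, π the stationary distribution, and g the potential vector solving (I − P)g + ηe = f. The three-state chain, the reward vector, and all function names are assumptions made for illustration and are not taken from the paper. The explicit matrix is used here only so the result can be checked; the abstract's point is that, in practice, the potentials at the affected states could be estimated without knowing P.

```python
# Illustrative sketch (hypothetical 3-state chain, not from the paper).

def stationary(P, iters=500):
    """Stationary distribution of an ergodic, aperiodic chain by power iteration."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

def potentials(P, f, iters=500):
    """Return (pi, eta, g) with g = sum_k P^k (f - eta*e), a solution of the Poisson equation."""
    n = len(P)
    pi = stationary(P)
    eta = sum(p * x for p, x in zip(pi, f))
    term = [x - eta for x in f]          # current term P^k (f - eta*e)
    g = [0.0] * n
    for _ in range(iters):
        g = [a + b for a, b in zip(g, term)]
        term = [sum(P[i][j] * term[j] for j in range(n)) for i in range(n)]
    return pi, eta, g

def sensitivity(P, P_new, f, affected):
    """d(eta)/d(delta) at delta = 0 for P + delta*(P_new - P), summing over affected rows only."""
    pi, _, g = potentials(P, f)
    n = len(f)
    return sum(
        pi[i] * sum((P_new[i][j] - P[i][j]) * g[j] for j in range(n))
        for i in affected
    )

# Assumed example data: the parameter change touches only state 0.
P = [[0.5, 0.3, 0.2],
     [0.2, 0.6, 0.2],
     [0.3, 0.3, 0.4]]
P_new = [[0.2, 0.3, 0.5],
         [0.2, 0.6, 0.2],
         [0.3, 0.3, 0.4]]
f = [1.0, 2.0, 4.0]

d_eta = sensitivity(P, P_new, f, affected=[0])
```

Because only row 0 differs between `P` and `P_new`, the sum involves a single state, and within that row only the potentials `g[0]` and `g[2]` carry nonzero weight.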

Received: 13 January 2004

Xiren Cao, Junyu Zhang. Performance sensitivities for parameterized Markov systems[J]. Journal of Control Theory and Applications, 2004, 2(1): 65-68.
Keywords: Perturbation analysis; Markov decision processes; Policy iteration; Reinforcement learning; Perturbation realization