Relative Value Iteration Algorithm with Soft State Aggregation
Cite this article: HU Guanghua, WU Cangpu, George Cybenko. Relative Value Iteration Algorithm with Soft State Aggregation[J]. Control Theory & Applications, 2000, 17(3): 415-418.
Authors: HU Guanghua  WU Cangpu  George Cybenko
Affiliations: 1. Department of Automatic Control, Beijing Institute of Technology, Beijing 100081, China
2. School of Engineering, Dartmouth College, Hanover, NH 03755, USA
Foundation item: Supported by the National Natural Science Foundation of China (69674005).

Abstract: A straightforward way to dispel the curse of dimensionality in large stochastic control problems is to replace the lookup table with a generalized function approximator such as state aggregation. The relative value iteration algorithm for average-reward Markov decision processes (MDPs) with soft state aggregation is investigated. Under a contraction condition in the span seminorm, the convergence of the proposed algorithm is proved and an error bound for the approximation is also given.
Keywords: dynamic programming  Markov decision processes  compact representation  state aggregation  average reward  stochastic control  relative value iteration
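
To make the abstract's ingredients concrete: relative value iteration (RVI) repeatedly applies the Bellman operator and subtracts the value at a fixed reference state so the iterates stay bounded, while soft state aggregation represents the value function compactly as v = Q·theta, where Q[s, x] = Pr(aggregate x | state s) and there is one parameter theta[x] per aggregate state. The sketch below is a minimal NumPy illustration of that combination, not the paper's implementation; the function name rvi_soft_aggregation, the disaggregation weights D, the choice of reference state, and the stopping rule are all illustrative assumptions, and the paper's convergence result applies only when the composed operator is a contraction in the span seminorm span(v) = max(v) - min(v).

```python
import numpy as np

def span(v):
    """Span seminorm appearing in the paper's contraction condition."""
    return np.max(v) - np.min(v)

def rvi_soft_aggregation(P, r, Q, ref_state=0, tol=1e-8, max_iter=10_000):
    """Relative value iteration with soft state aggregation (illustrative sketch).

    P : (A, S, S) transition probabilities, P[a, s, s'].
    r : (A, S) expected one-step rewards, r[a, s].
    Q : (S, K) soft aggregation matrix; Q[s, x] = Pr(aggregate x | state s),
        each row summing to 1. K << S gives the compact representation.
    """
    K = Q.shape[1]
    # Assumed disaggregation weights: normalize Q's columns so each aggregate
    # value is a weighted average of the Bellman backups of its member states.
    D = (Q / Q.sum(axis=0, keepdims=True)).T        # (K, S)
    theta = np.zeros(K)                             # one parameter per aggregate
    gain = 0.0
    for _ in range(max_iter):
        v = Q @ theta                               # lift aggregate values to states
        Tv = (r + P @ v).max(axis=0)                # Bellman backup at every state
        gain = Tv[ref_state] - v[ref_state]         # average-reward (gain) estimate
        h = Tv - Tv[ref_state]                      # "relative": pin the reference state
        theta = D @ h                               # project backups onto aggregates
        if span(Q @ theta - v) < tol:               # stop once the span stops moving
            break
    return theta, gain

# Tiny random MDP, purely for illustration; a dense random chain mixes quickly,
# which makes a span-contraction condition easy to satisfy in practice.
rng = np.random.default_rng(0)
A, S, K = 3, 20, 4
P = rng.random((A, S, S)); P /= P.sum(axis=2, keepdims=True)
r = rng.random((A, S))
Q = rng.random((S, K)); Q /= Q.sum(axis=1, keepdims=True)
theta, gain = rvi_soft_aggregation(P, r, Q)
print("aggregate values:", np.round(theta, 3), " gain estimate:", round(gain, 4))
```

In the lookup-table case Q is the identity matrix and the sketch reduces to ordinary relative value iteration; the error bound stated in the abstract quantifies how far the aggregated fixed point can be from the exact solution.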