Relative Value Iteration Algorithm with Soft State Aggregation
Cite this article: HU Guanghua, WU Cangpu, George Cybenko. Relative Value Iteration Algorithm with Soft State Aggregation[J]. Control Theory & Applications, 2000, 17(3): 415-418.
Authors: HU Guanghua  WU Cangpu  George Cybenko
Affiliations: 1. Department of Automatic Control, Beijing Institute of Technology, Beijing 100081, China
2. School of Engineering, Dartmouth College, Hanover, NH 03755, USA
Foundation item: Supported by the National Natural Science Foundation of China (69674005).

Abstract: A straightforward way to dispel the curse of dimensionality in large stochastic control problems is to replace the lookup table with a generalized function approximator such as state aggregation. The relative value iteration algorithm for average-reward Markov decision processes (MDPs) with soft state aggregation is investigated. Under a contraction condition in the span seminorm, the convergence of the proposed algorithm is proved and an error bound for the approximation is also given.
Keywords: dynamic programming  Markov decision processes  compact representation  state aggregation  average reward  stochastic control  relative value iteration
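
To make the abstract's ingredients concrete: relative value iteration (RVI) repeatedly applies the Bellman operator and subtracts the value at a fixed reference state so the iterates stay bounded, while soft state aggregation represents the value function compactly as v = Q·theta, where Q[s, x] = Pr(aggregate x | state s) and there is one parameter theta[x] per aggregate state. The sketch below is a minimal NumPy illustration of that combination, not the paper's implementation; the function name rvi_soft_aggregation, the disaggregation weights D, the choice of reference state, and the stopping rule are all illustrative assumptions, and the paper's convergence result applies only when the composed operator is a contraction in the span seminorm span(v) = max(v) - min(v).

```python
import numpy as np

def span(v):
    """Span seminorm appearing in the paper's contraction condition."""
    return np.max(v) - np.min(v)

def rvi_soft_aggregation(P, r, Q, ref_state=0, tol=1e-8, max_iter=10_000):
    """Relative value iteration with soft state aggregation (illustrative sketch).

    P : (A, S, S) transition probabilities, P[a, s, s'].
    r : (A, S) expected one-step rewards, r[a, s].
    Q : (S, K) soft aggregation matrix; Q[s, x] = Pr(aggregate x | state s),
        each row summing to 1. K << S gives the compact representation.
    """
    K = Q.shape[1]
    # Assumed disaggregation weights: normalize Q's columns so each aggregate
    # value is a weighted average of the Bellman backups of its member states.
    D = (Q / Q.sum(axis=0, keepdims=True)).T        # (K, S)
    theta = np.zeros(K)                             # one parameter per aggregate
    gain = 0.0
    for _ in range(max_iter):
        v = Q @ theta                               # lift aggregate values to states
        Tv = (r + P @ v).max(axis=0)                # Bellman backup at every state
        gain = Tv[ref_state] - v[ref_state]         # average-reward (gain) estimate
        h = Tv - Tv[ref_state]                      # "relative": pin the reference state
        theta = D @ h                               # project backups onto aggregates
        if span(Q @ theta - v) < tol:               # stop once the span stops moving
            break
    return theta, gain

# Tiny random MDP, purely for illustration; a dense random chain mixes quickly,
# which makes a span-contraction condition easy to satisfy in practice.
rng = np.random.default_rng(0)
A, S, K = 3, 20, 4
P = rng.random((A, S, S)); P /= P.sum(axis=2, keepdims=True)
r = rng.random((A, S))
Q = rng.random((S, K)); Q /= Q.sum(axis=1, keepdims=True)
theta, gain = rvi_soft_aggregation(P, r, Q)
print("aggregate values:", np.round(theta, 3), " gain estimate:", round(gain, 4))
```

In the lookup-table case Q is the identity matrix and the sketch reduces to ordinary relative value iteration; the error bound stated in the abstract quantifies how far the aggregated fixed point can be from the exact solution.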