
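Dueling Deep Q Network Algorithm with State Value Reuse
Cite this article: ZHANG Junjie, ZHANG Cong, ZHAO Hanjie. Dueling Deep Q Network Algorithm with State Value Reuse[J]. Computer Engineering and Applications, 2021, 57(4): 134-140. DOI: 10.3778/j.issn.1002-8331.2007-0125
Authors: ZHANG Junjie  ZHANG Cong  ZHAO Hanjie
Affiliation: School of Mathematics and Computer Science, Wuhan Polytechnic University, Wuhan 430023, China
Funding: General Program of the National Natural Science Foundation of China; Major Science and Technology Special Project of Hubei Province; Research Start-up Project for Introduced (Trained) Talents of Wuhan Polytechnic University; Youth Program of the Natural Science Foundation of Hubei Province
Abstract: When the Inverse Distance Weighted method (IDW) is used to predict heavy-metal content in soil, the algorithm's hyperparameters are usually set from prior knowledge, which introduces some uncertainty. To address this problem, a dueling deep Q-learning network algorithm with state value reuse is proposed to estimate the IDW hyperparameters accurately. During training, the algorithm standardizes the reward values in each round of training samples, and then...

Keywords: state value reuse  dueling deep Q-learning network  Inverse Distance Weighted method  hyperparameter search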
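
For orientation, the textbook form of IDW that the abstract refers to can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `idw_predict`, the 2-D coordinate layout, and the choice of the distance exponent `power` as the hyperparameter being tuned are assumptions made for this example.

```python
import math


def idw_predict(samples, query, power=2.0):
    """Predict a value at `query` by inverse distance weighting.

    samples: list of ((x, y), value) pairs (hypothetical layout).
    query:   (x, y) location to predict.
    power:   distance exponent; assumed here to be the kind of
             hyperparameter that would be searched for.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for (x, y), value in samples:
        distance = math.hypot(x - query[0], y - query[1])
        if distance == 0.0:
            # The query coincides with a sample point: return its value.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total
```

A larger `power` makes nearby samples dominate the prediction, which is why the choice of exponent affects accuracy and is worth searching over rather than fixing by hand.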

Dueling Deep Q Network Algorithm with State Value Reuse
ZHANG Junjie,ZHANG Cong,ZHAO Hanjie. Dueling Deep Q Network Algorithm with State Value Reuse[J]. Computer Engineering and Applications, 2021, 57(4): 134-140. DOI: 10.3778/j.issn.1002-8331.2007-0125
Authors:ZHANG Junjie  ZHANG Cong  ZHAO Hanjie
Affiliation:School of Mathematics and Computer Science, Wuhan Polytechnic University, Wuhan 430023, China
Abstract: When the Inverse Distance Weighted method (IDW) is used to predict the heavy-metal content of soil, the hyperparameters of the algorithm are generally set from prior knowledge, which introduces a degree of uncertainty. To address this problem, a dueling deep Q-learning network algorithm that reuses state values is proposed to estimate the IDW hyperparameters accurately. During training, the reward value of each training sample is standardized and combined with the state value produced by the Q network in Dueling-DQN to form a new total reward value. This total reward value is then fed into the Q network for learning, which strengthens the internal relationship between state and action and makes the algorithm more stable. Finally, the method is used to perform a hyperparameter search for IDW and is compared experimentally with several common deep learning algorithms. The experimental results show that the proposed RSV-DuDQN algorithm makes the model converge faster, improves its stability, and yields more accurate IDW parameter estimates.
Keywords:reuse of state values  dueling deep Q-learning network  Inverse Distance Weighted method(IDW)  hyper-parameter search  
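
The reward construction described in the abstract can be sketched roughly as below. The abstract does not give the exact formula, so this is only an illustration under stated assumptions: the additive combination, the weight `beta`, and all function names are hypothetical. Only `dueling_q` follows the standard Dueling DQN aggregation, Q(s, a) = V(s) + A(s, a) - mean(A(s, ·)).

```python
import statistics


def standardize(rewards, eps=1e-8):
    """Standardize a batch of rewards to zero mean and unit variance."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    # eps avoids division by zero when all rewards are equal.
    return [(r - mean) / (std + eps) for r in rewards]


def dueling_q(state_value, advantages):
    """Standard dueling aggregation: Q = V + A - mean(A)."""
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + a - mean_adv for a in advantages]


def total_rewards(rewards, state_values, beta=1.0):
    """Combine standardized rewards with state values.

    ASSUMPTION: the abstract only says the two are "combined"; the
    additive form and the weight `beta` are hypothetical choices.
    """
    if len(rewards) != len(state_values):
        raise ValueError("rewards and state_values must have equal length")
    standardized = standardize(rewards)
    return [z + beta * v for z, v in zip(standardized, state_values)]
```

In a training loop, `state_values` would come from the value stream of the dueling network for the sampled states, and the output of `total_rewards` would replace the raw rewards when the learning targets are formed.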
This article is indexed in Wanfang Data and other databases.