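Commodity recommendation model based on improved deep Q network structure
Citation: FU Kui, LIANG Shaoqing, LI Bing. Commodity recommendation model based on improved deep Q network structure[J]. Journal of Computer Applications, 2020, 40(9): 2613-2621.
Authors: FU Kui  LIANG Shaoqing  LI Bing
Affiliation: School of Economics, Wuhan University of Technology, Wuhan 430070
Funding: Humanities and Social Sciences Research Planning Fund of the Ministry of Education (17YJA870006).
Abstract: Traditional recommendation methods suffer from data sparsity and poor feature recognition. To address these problems, time-ordered positive and negative feedback datasets were constructed from implicit feedback. Because both the feedback datasets and commodity purchases have strong temporal characteristics, a Long Short-Term Memory (LSTM) network was introduced as a model component. Considering that the user's own characteristics and the return of the user's action selection are determined by different input data, the dueling-architecture deep Q network was improved: by fusing user positive and negative feedback with the temporal order of commodity purchases, a commodity recommendation model based on the improved deep Q network structure was designed. The model trains on positive and negative feedback data separately and extracts the temporal features of commodity purchases. On the Retailrocket dataset, compared with the best-performing of the Factorization Machine (FM), W&D and Collaborative Filtering (CF) models, the proposed model improved precision, recall, Mean Average Precision (MAP) and Normalized Discounted Cumulative Gain (NDCG) by 158.42%, 89.81%, 95.00% and 67.57%, respectively. In addition, DBGD was used as the exploration method, which mitigated the low diversity of recommended commodities.

Keywords: deep reinforcement learning    positive and negative feedback dataset    dueling network architecture    Long Short-Term Memory network    commodity recommendation
Received: 2019-11-25
Revised: 2020-01-12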
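
The abstract reports gains in NDCG without defining it. As a reminder of what the metric measures, here is a minimal sketch of NDCG@k under binary relevance. The function names `dcg_at_k` and `ndcg_at_k` are hypothetical, and the ideal ranking is taken from the same recommended list; neither choice comes from the paper, whose exact evaluation protocol is not given on this page.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain of the first k items in ranked order.

    relevances[i] is the relevance of the item at rank i (0-based),
    e.g. 1 if the user purchased it, 0 otherwise (an assumed labeling).
    """
    return sum(rel / math.log2(rank + 2)
               for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k):
    """DCG normalized by the DCG of the best possible ordering.

    Assumption: the ideal ordering is built from the same list, so the
    score only reflects how well the listed items are ordered.
    """
    ideal_dcg = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal_dcg if ideal_dcg > 0 else 0.0
```

A list that already places all relevant items first scores 1.0, and a list with no relevant items scores 0.0 by convention here.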

Commodity recommendation model based on improved deep Q network structure
FU Kui, LIANG Shaoqing, LI Bing. Commodity recommendation model based on improved deep Q network structure[J]. Journal of Computer Applications, 2020, 40(9): 2613-2621.
Authors:FU Kui  LIANG Shaoqing  LI Bing
Affiliation:School of Economics, Wuhan University of Technology, Wuhan Hubei 430070, China
Abstract: Traditional recommendation methods suffer from data sparsity and poor feature recognition. To solve these problems, positive and negative feedback datasets with time-series properties were constructed from implicit feedback. Since the positive and negative feedback datasets and commodity purchases have strong time-series features, a Long Short-Term Memory (LSTM) network was introduced as a component of the model. Considering that the user's own characteristics and the returns of action selection are determined by different input data, the deep Q network based on the dueling architecture was improved: by integrating user positive and negative feedback with the time-series features of commodity purchases, a commodity recommendation model based on the improved deep Q network structure was designed. In the model, the positive and negative feedback data were trained separately, and the time-series features of commodity purchases were extracted. On the Retailrocket dataset, compared with the best-performing of the Factorization Machine (FM), W&D (Wide & Deep learning) and Collaborative Filtering (CF) models, the proposed model increased precision, recall, Mean Average Precision (MAP) and Normalized Discounted Cumulative Gain (NDCG) by 158.42%, 89.81%, 95.00% and 65.67%, respectively. In addition, DBGD (Dueling Bandit Gradient Descent) was used as the exploration method to mitigate the low diversity of recommended commodities.
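Keywords: deep reinforcement learning; positive and negative feedback dataset; dueling network architecture; Long Short-Term Memory (LSTM) network; commodity recommendation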
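
The abstract builds on the dueling deep Q network, in which a state-value stream and an action-advantage stream are combined into Q-values. The sketch below shows only the standard dueling aggregation, Q(s, a) = V(s) + A(s, a) - mean(A); it does not reproduce the authors' improved structure, their LSTM components, or their inputs. The function names `dueling_q_values` and `greedy_action` are hypothetical, and the numbers in the usage note are made up for illustration.

```python
def dueling_q_values(state_value, advantages):
    """Combine a scalar state value V(s) with per-action advantages A(s, a).

    Subtracting the mean advantage is the usual way to keep the V/A
    split identifiable in a dueling network.
    """
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + adv - mean_advantage for adv in advantages]


def greedy_action(q_values):
    """Index of the action (here: candidate item) with the highest Q-value."""
    return max(range(len(q_values)), key=q_values.__getitem__)
```

For example, `dueling_q_values(2.0, [1.0, 0.0, -1.0])` gives `[3.0, 2.0, 1.0]`, so `greedy_action` picks index 0. In a recommender, each index would stand for a candidate commodity, which is an assumption of this sketch rather than a detail stated on this page.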