Commodity recommendation model based on improved deep Q network structure

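Cite this article:	FU Kui, LIANG Shaoqing, LI Bing. Commodity recommendation model based on improved deep Q network structure[J]. Journal of Computer Applications, 2020, 40(9): 2613-2621.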

Authors:	FU Kui, LIANG Shaoqing, LI Bing

Affiliation:	School of Economics, Wuhan University of Technology, Wuhan 430070

Funding:	Supported by the Humanities and Social Sciences Research Planning Fund of the Ministry of Education (17YJA870006).

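Abstract:	Traditional recommendation methods suffer from data sparsity and poor feature recognition. To address these problems, time-ordered positive and negative feedback datasets were constructed from implicit feedback. Because both the feedback datasets and commodity purchases are strongly sequential, a Long Short-Term Memory (LSTM) network was introduced as a component of the model. Considering that a user's own characteristics and the return of the user's action selection are determined by different input data, the deep Q network with a dueling (competitive) architecture was improved: by fusing the user's positive and negative feedback with the temporal order of commodity purchases, a commodity recommendation model based on the improved deep Q network structure was designed. The model treats positive and negative feedback data differently during training and extracts the temporal features of commodity purchases. On the Retailrocket dataset, compared with the best-performing of the Factorization Machine (FM), W&D and Collaborative Filtering (CF) models, the proposed model improved precision, recall, Mean Average Precision (MAP) and Normalized Discounted Cumulative Gain (NDCG) by 158.42%, 89.81%, 95.00% and 67.57%, respectively. In addition, DBGD was used as the exploration method, which mitigated the low diversity of recommended commodities.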
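Keywords:	deep reinforcement learning; positive and negative feedback dataset; dueling network architecture; Long Short-Term Memory (LSTM) network; commodity recommendation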
Received:	2019-11-25
Revised:	2020-01-12
Commodity recommendation model based on improved deep Q network structure

FU Kui, LIANG Shaoqing, LI Bing. Commodity recommendation model based on improved deep Q network structure[J]. Journal of Computer Applications, 2020, 40(9): 2613-2621.

Authors:	FU Kui, LIANG Shaoqing, LI Bing

Affiliation:	School of Economics, Wuhan University of Technology, Wuhan, Hubei 430070, China

Abstract:	Traditional recommendation methods suffer from problems such as data sparsity and poor feature recognition. To solve these problems, positive and negative feedback datasets with time-series properties were constructed from implicit feedback. Since both the feedback datasets and commodity purchases have strong time-series features, a Long Short-Term Memory (LSTM) network was introduced as a component of the model. Considering that the user's own characteristics and the returns of action selection are determined by different input data, the deep Q network based on the dueling (competitive) architecture was improved: by integrating the user's positive and negative feedback with the time-series features of commodity purchases, a commodity recommendation model based on the improved deep Q network structure was designed. In the model, the positive and negative feedback data were treated differently during training, and the time-series features of commodity purchases were extracted. On the Retailrocket dataset, compared with the best-performing of the Factorization Machine (FM), W&D (Wide & Deep learning) and Collaborative Filtering (CF) models, the proposed model increased precision, recall, Mean Average Precision (MAP) and Normalized Discounted Cumulative Gain (NDCG) by 158.42%, 89.81%, 95.00% and 67.57%, respectively. In addition, DBGD (Dueling Bandit Gradient Descent) was used as the exploration method to mitigate the low diversity of the recommended commodities.
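
The abstract does not spell out how the two streams of the dueling (competitive) architecture are combined, so the following is only a minimal sketch of the aggregation used in the standard dueling deep Q network, not the authors' improved structure. It assumes a scalar state value V(s) and one advantage A(s, a) per candidate commodity, combined as Q(s, a) = V(s) + A(s, a) - mean(A). The function names and the numbers are hypothetical and chosen for illustration only.

```python
from typing import List


def dueling_q_values(state_value: float, advantages: List[float]) -> List[float]:
    """Combine a scalar state value V(s) with per-action advantages A(s, a).

    Standard dueling aggregation (an assumption here, not taken from the paper):
        Q(s, a) = V(s) + A(s, a) - mean over a' of A(s, a')
    """
    if not advantages:
        raise ValueError("at least one candidate action is required")
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + a - mean_advantage for a in advantages]


def recommend(state_value: float, advantages: List[float]) -> int:
    """Return the index of the candidate commodity with the highest Q-value."""
    q_values = dueling_q_values(state_value, advantages)
    return max(range(len(q_values)), key=q_values.__getitem__)


# Hypothetical inputs: in a model like the one described, V(s) would come from
# the stream fed with user features and A(s, a) from the stream that also sees
# the candidate commodity; here they are plain numbers.
q = dueling_q_values(state_value=2.0, advantages=[0.5, 1.5, -0.5])
best = recommend(2.0, [0.5, 1.5, -0.5])
```

Subtracting the mean advantage keeps the average of the Q-values equal to V(s), which is what makes the two streams separately identifiable in the standard formulation.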
