Reinforcement Learning with User Long-term and Short-term Preference for Personalized Recommendation

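Cite this article:	YAN Shihong, MA Weizhi, ZHANG Min, LIU Yiqun, MA Shaoping. Reinforcement Learning with User Long-term and Short-term Preference for Personalized Recommendation[J]. Journal of Chinese Information Processing, 2021, 35(8): 107-116.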

Authors:	YAN Shihong, MA Weizhi, ZHANG Min, LIU Yiqun, MA Shaoping

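Affiliation:	Department of Computer Science and Technology, Beijing National Research Center for Information Science and Technology, Tsinghua University, Beijing 100084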

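Funding:	National Key Research and Development Program of China (2018YFC0831900); National Natural Science Foundation of China (61672311, 61532011); Tsinghua University Guoqiang Research Institute project (2019GQG0004); China Postdoctoral Science Foundation (2020M670339)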

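Abstract:	Recommendation algorithms that incorporate reinforcement learning, in particular deep reinforcement learning, have in recent years achieved considerable improvements over earlier methods. However, the vast majority of existing deep-reinforcement-learning-based recommendation methods learn only users' short-term interests, using methods such as recurrent neural networks (RNN), and ignore users' long-term interests, which leaves the modeling of user interests incomplete. This paper therefore proposes a deep reinforcement learning recommendation method that combines users' long-term and short-term interests (LSRL). First, ...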
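Keywords:	recommender system; deep reinforcement learning; long-term and short-term preference; collaborative filtering; gated recurrent unit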
Received:	2020-10-30
Reinforcement Learning with User Long-term and Short-term Preference for Personalized Recommendation

YAN Shihong, MA Weizhi, ZHANG Min, LIU Yiqun, MA Shaoping. Reinforcement Learning with User Long-term and Short-term Preference for Personalized Recommendation[J]. Journal of Chinese Information Processing, 2021, 35(8): 107-116.

Authors:	YAN Shihong, MA Weizhi, ZHANG Min, LIU Yiqun, MA Shaoping

Affiliation:	Department of Computer Science and Technology, Institute for Artificial Intelligence, Beijing National Research Center for Information Science and Technology, Tsinghua University, Beijing 100084, China

Abstract:	Most existing recommendation methods based on deep reinforcement learning use a recurrent neural network (RNN) to learn users' short-term preferences while ignoring their long-term preferences. This paper proposes a deep reinforcement learning recommendation method, LSRL, that combines long-term and short-term user preferences. LSRL uses collaborative filtering to learn a representation of users' long-term preferences and applies a gated recurrent unit (GRU) to learn a representation of their short-term preferences. A redesigned Q-network framework combines the two representations, and a Deep Q-Network is used to predict users' feedback on items. Experimental results on MovieLens datasets show that the proposed method achieves significant improvements in NDCG and Hit Ratio over the baseline methods.
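
The abstract names the building blocks of LSRL but not their details, so the following is only a rough sketch of how such a combination might look, not the authors' implementation. It assumes a user vector that would come from a collaborative filtering model (here just random numbers), a bias-free GRU cell run over the recent item sequence, and a single linear layer over the concatenated vectors as the Q-function. All names (`GRUCell`, `q_values`, `recommend`, `w_out`) and the linear output layer are hypothetical, weights are untrained, and the DQN training loop is omitted.

```python
import math
import random


def matvec(matrix, vec):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def rand_matrix(rows, cols, rng):
    """Small random weights; a trained model would learn these instead."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class GRUCell:
    """Standard GRU cell without bias terms (kept minimal for illustration)."""

    def __init__(self, dim, rng):
        self.Wz, self.Uz = rand_matrix(dim, dim, rng), rand_matrix(dim, dim, rng)
        self.Wr, self.Ur = rand_matrix(dim, dim, rng), rand_matrix(dim, dim, rng)
        self.Wh, self.Uh = rand_matrix(dim, dim, rng), rand_matrix(dim, dim, rng)

    def step(self, x, h):
        # Update gate z and reset gate r, both in (0, 1).
        z = [sigmoid(a + b) for a, b in zip(matvec(self.Wz, x), matvec(self.Uz, h))]
        r = [sigmoid(a + b) for a, b in zip(matvec(self.Wr, x), matvec(self.Ur, h))]
        gated_h = [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(a + b)
                for a, b in zip(matvec(self.Wh, x), matvec(self.Uh, gated_h))]
        # New state interpolates between the old state and the candidate.
        return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def q_values(user_vec, history, item_embs, gru, w_out):
    """Score every item from a long-term vector and a short-term GRU state."""
    h = [0.0] * len(user_vec)
    for item_id in history:            # short-term interest: recent items, in order
        h = gru.step(item_embs[item_id], h)
    scores = []
    for emb in item_embs:
        features = user_vec + h + emb  # concatenate long-term, short-term, item
        scores.append(sum(w * f for w, f in zip(w_out, features)))
    return scores


def recommend(scores, history, k):
    """Top-k items by Q-value, skipping items the user already interacted with."""
    seen = set(history)
    candidates = [i for i in range(len(scores)) if i not in seen]
    return sorted(candidates, key=lambda i: scores[i], reverse=True)[:k]


rng = random.Random(0)
DIM, N_ITEMS = 4, 6
item_embs = rand_matrix(N_ITEMS, DIM, rng)
user_vec = [rng.uniform(-0.1, 0.1) for _ in range(DIM)]  # stand-in for a CF user factor
gru = GRUCell(DIM, rng)
w_out = [rng.uniform(-0.1, 0.1) for _ in range(3 * DIM)]  # hypothetical linear Q-head

history = [2, 5, 1]
scores = q_values(user_vec, history, item_embs, gru, w_out)
top = recommend(scores, history, k=2)
```

In this sketch the long-term vector stays fixed across a session while the GRU state changes with each new interaction, which is one plausible reading of how the two representations complement each other.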

Keywords:	recommender system; deep reinforcement learning; long-term and short-term preference; collaborative filtering; gated recurrent unit
