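Research on a Hybrid Recommendation Algorithm Based on Spark
Cite this article: Hu Demin, Gong Yan. Research on a hybrid recommendation algorithm based on Spark[J]. Application Research of Computers, 2017, 34(12)
Authors: Hu Demin, Gong Yan
Affiliations: School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology; Institute of Computer Software Technology, University of Shanghai for Science and Technology, School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology;
Funding: National Natural Science Foundation of China (61170277); National Natural Science Foundation of China (61472256); Key Scientific Research Innovation Project of the Shanghai Municipal Education Commission (12zz137); Shanghai First-Class Discipline Construction Project (S1201YLXK)
Abstract: With the growth of e-commerce, recommendation algorithms based on collaborative filtering have become increasingly popular. At the same time, their shortcomings, such as data sparsity and limited system scalability, have become increasingly apparent. In addition, the traditional single-machine computing model struggles to meet the demand for real-time recommendation over massive data. To address this, a method for distributed recommendation using the Spark computing model is proposed. The method adopts a hybrid recommendation algorithm based on spectral clustering and naive Bayes, and uses incremental updates to modify the model locally without retraining it in full. Experimental results show that, compared with traditional single-machine recommendation algorithms, the distributed recommendation algorithm based on the Spark computing model overcomes data sparsity to a certain extent, improves system scalability, and reduces system response time.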

Keywords: recommendation algorithm; distributed computing; Spark; incremental update
Received: 2016-08-11
Revised: 2016-09-27

Research on hybrid recommendation algorithm based on Spark Technology
Hu Demin and Gong Yan. Research on hybrid recommendation algorithm based on Spark Technology[J]. Application Research of Computers, 2017, 34(12)
Authors: Hu Demin and Gong Yan
Affiliation:School of Optical-Electrical,
Abstract: Collaborative filtering-based recommender systems have become extremely popular in recent years with the development of e-commerce. However, they have limitations such as data sparsity and poor scalability. Moreover, the traditional stand-alone computing model struggles to meet the needs of real-time recommendation over massive data. Therefore, a distributed recommendation method based on the Spark computing model is proposed, built on a hybrid of spectral clustering and Naive Bayes. In addition, the hybrid method uses an incremental update scheme to refresh the ratings and improve the precision of the system without retraining the entire model. Experimental results show that, compared with traditional stand-alone recommendation algorithms, the proposed distributed recommendation algorithm alleviates the sparsity problem to a certain extent, improves scalability, and reduces the response time of the system.
Keywords:Recommender System   Distributed Computation   Spark   Incremental Update
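
Illustrative note: the abstract does not give the details of the incremental update scheme, so the following is only a minimal single-machine sketch of the general idea, not the authors' Spark implementation. It assumes a categorical naive Bayes classifier whose model state is nothing more than counts, so a new rating can be folded in by touching only the counters it affects. The class name `IncrementalNaiveBayes`, its methods, and the toy features are all hypothetical.

```python
from collections import defaultdict
import math


class IncrementalNaiveBayes:
    """Categorical naive Bayes whose model state is a set of plain counts.

    Hypothetical illustration: because the state is only counts, adding an
    example is a local update and never requires retraining on old data.
    """

    def __init__(self, alpha=1.0):
        self.alpha = alpha                     # Laplace smoothing constant
        self.class_counts = defaultdict(int)   # label -> number of examples
        # (feature index, label) -> {feature value -> count}
        self.value_counts = defaultdict(lambda: defaultdict(int))
        self.values_seen = defaultdict(set)    # feature index -> distinct values

    def update(self, features, label):
        """Fold one new example into the model by incrementing its counters."""
        self.class_counts[label] += 1
        for i, value in enumerate(features):
            self.value_counts[(i, label)][value] += 1
            self.values_seen[i].add(value)

    def predict(self, features):
        """Return the label with the highest smoothed log-posterior."""
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            score = math.log(n / total)        # log prior
            for i, value in enumerate(features):
                count = self.value_counts[(i, label)].get(value, 0)
                k = len(self.values_seen[i]) or 1
                score += math.log((count + self.alpha) / (n + self.alpha * k))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy usage: train on an initial batch, then apply one incremental update.
history = [
    (("action", "young"), "like"),
    (("action", "old"), "dislike"),
    (("drama", "old"), "like"),
]
model = IncrementalNaiveBayes()
for feats, label in history:
    model.update(feats, label)
model.update(("action", "young"), "like")      # new rating arrives later
```

In this sketch, updating after the initial batch yields the same counts as training once on all four examples, which is the property that makes a local update sufficient. In a distributed setting the per-partition counts could in principle be summed, but how the paper combines this with spectral clustering on Spark is not stated in the abstract.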