
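User ranking algorithm for microblog search based on MapReduce
Cite this article:LIANG Qiu-shi, WU Yi-lei, FENG Lei. User ranking algorithm for microblog search based on MapReduce[J]. Journal of Computer Applications, 2012, 32(11): 2989-2993. DOI: 10.3724/SP.J.1087.2012.02989
Authors:LIANG Qiu-shi  WU Yi-lei  FENG Lei
Affiliation:Beijing Computing Center, Beijing Academy of Science and Technology, Beijing 100012, China
Funding:Mengya (Sprout) Program Fund of the Beijing Academy of Science and Technology
Abstract:In microblog search, ranking results purely by follower count leaves an opening for follower buying. By treating users as Web pages and the "follow" relationships between users as links between pages, this paper brings PageRank's basic idea of page rank into microblog user search. It introduces a state transition matrix and an automatically iterating MapReduce workflow to parallelize the computation, and on this basis proposes a MapReduce-based ranking algorithm for microblog user search. The algorithm was analyzed experimentally on the Hadoop platform. The results show that it keeps a user's rank from depending solely on follower count, raises the search ranking of users who are more "important", and improves the relevance and quality of the search results.

Keywords:microblog search  cloud computing  MapReduce programming model  Hadoop platform/system  PageRank algorithm
Received:2012-05-13
Revised:2012-06-28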
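
The abstract's mapping (users as pages, "follow" relationships as links, a state transition matrix) matches the standard PageRank iteration. The block below is only a sketch of that standard form. The symbols N, M, r and the damping factor d are assumed notation that does not appear in this abstract, and the paper's own matrix may be defined differently.

```latex
% Assumed notation (not taken from the abstract):
%   N            number of users
%   \mathrm{out}(j)  number of accounts that user j follows
%   M            N x N state transition matrix
%   r^{(k)}      rank vector after k iterations
%   d            damping factor, conventionally 0.85
\[
M_{ij} =
\begin{cases}
\dfrac{1}{\mathrm{out}(j)} & \text{if user } j \text{ follows user } i,\\[6pt]
0 & \text{otherwise,}
\end{cases}
\qquad
r^{(k+1)} = d\, M\, r^{(k)} + \frac{1-d}{N}\,\mathbf{1},
\qquad
r^{(0)} = \frac{1}{N}\,\mathbf{1}.
\]
```

Under this form a user's rank depends on the ranks of its followers and not only on how many followers there are, which is the property the abstract relies on. Users who follow no one give a zero column in M and need separate handling in practice.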

User ranking algorithm for microblog search based on MapReduce
LIANG Qiu-shi,WU Yi-lei,FENG Lei. User ranking algorithm for microblog search based on MapReduce[J]. Journal of Computer Applications, 2012, 32(11): 2989-2993. DOI: 10.3724/SP.J.1087.2012.02989
Authors:LIANG Qiu-shi  WU Yi-lei  FENG Lei
Affiliation:Beijing Computing Center, Beijing Academy of Science and Technology, Beijing 100012, China
Abstract:When microblog users search by keyword for accounts to follow, most service providers order the result list simply by follower count. Unfortunately, this approach leaves room for fraudulent accounts to cheat the search engine. By regarding microblog users as Web pages, and the "follow" relationships between users as the links between Web pages, this paper applies the basic idea of PageRank to the ranking of microblog users. It introduces a state transition matrix and an automatically iterating MapReduce workflow to parallelize the computation steps, and on that basis proposes a user ranking algorithm for microblog search. Experiments on the Hadoop platform show that the algorithm makes it harder to cheat the search engine, gives more important users better rankings, and improves the relevance and quality of search results.
Keywords:microblog search   cloud computing   MapReduce programming model   Hadoop platform/system   PageRank algorithm
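
The iterative workflow described in the abstract can be illustrated with a small in-process simulation. This is a minimal sketch, not the authors' Hadoop implementation: the function names, the damping factor of 0.85, the convergence tolerance, and the way rank from users who follow no one is redistributed are all assumptions made for illustration.

```python
from collections import defaultdict

DAMPING = 0.85  # assumed conventional PageRank value, not taken from the paper


def map_user(user, rank, followees):
    """Map step: a user splits its current rank evenly among the accounts it follows."""
    yield user, 0.0  # keeps users with no followers in the reduce output
    for target in followees:
        yield target, rank / len(followees)


def reduce_user(contributions, n_users, dangling_mass):
    """Reduce step: combine the contributions received by one user into its new rank."""
    received = sum(contributions) + dangling_mass / n_users
    return (1.0 - DAMPING) / n_users + DAMPING * received


def run_iteration(follows, ranks):
    """One simulated MapReduce job: map, shuffle by key, reduce."""
    shuffled = defaultdict(list)
    for user, rank in ranks.items():
        for key, value in map_user(user, rank, follows.get(user, [])):
            shuffled[key].append(value)
    # Assumption: rank held by users who follow no one is spread over all users.
    dangling = sum(rank for user, rank in ranks.items() if not follows.get(user))
    n_users = len(ranks)
    return {user: reduce_user(values, n_users, dangling)
            for user, values in shuffled.items()}


def rank_users(follows, tol=1e-10, max_iters=200):
    """Repeat the job until the ranks stop changing (the 'auto iterative' part)."""
    users = set(follows) | {t for targets in follows.values() for t in targets}
    ranks = {user: 1.0 / len(users) for user in users}
    for _ in range(max_iters):
        new_ranks = run_iteration(follows, ranks)
        delta = sum(abs(new_ranks[user] - ranks[user]) for user in users)
        ranks = new_ranks
        if delta < tol:
            break
    return ranks


if __name__ == "__main__":
    # Hypothetical graph: three accounts follow "hub", and "hub" follows "star".
    follows = {"a": ["hub"], "b": ["hub"], "c": ["hub"], "hub": ["star"]}
    for user, rank in sorted(rank_users(follows).items(), key=lambda kv: -kv[1]):
        print(user, round(rank, 4))
```

In this hypothetical graph "hub" has three followers and "star" has only one, yet "star" ends up ranked higher because its single follower is itself highly ranked. That is the effect the abstract describes: rank is no longer tied to raw follower count.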
This article is indexed in CNKI and other databases.