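Domain-Oriented Discovery of High-Quality Microblog Users
Cite this article: YE Yongjun, LI Peng, ZHOU Meilin, WAN Yifang, WANG Bin. Domain-Oriented Discovery of High-Quality Microblog Users[J]. Journal of Chinese Information Processing (中文信息学报), 2018, 32(7): 109-115
Authors: YE Yongjun  LI Peng  ZHOU Meilin  WAN Yifang  WANG Bin
Affiliation: 1. Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093;
2. University of Chinese Academy of Sciences, Beijing 100049
Funding: National Natural Science Foundation of China (61402466); National High Technology Research and Development Program of China (863 Program) (2015AA016005)
Abstract: In microblogging systems, finding high-quality users to follow is a prerequisite for obtaining high-quality information. This paper studies the problem of discovering high-quality microblog users: given a domain-term query, the system returns a list of relevant users ranked by user quality. The problem is decomposed into two subproblems: retrieving domain-relevant users, and ranking microblog users. For user retrieval, the paper proposes a user representation based on user tags and a query-user similarity matching method based on Wikipedia. The method is an extended application of ESA (explicit semantic analysis), and its results are readily interpretable; experiments show that retrieval based on Wikipedia outperforms retrieval based on other resources. For user ranking, the paper proposes UBRank, a graph-based iterative ranking method that considers both the number of messages a user publishes and the authority of those messages when computing user quality, and that builds the graph only from messages containing URLs. Experiments confirm the efficiency and superiority of the method.

Keywords: user quality measurement  user behavior model  graph ranking algorithm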

Domain Specific High-quality Microblogging User Detection
YE Yongjun, LI Peng, ZHOU Meilin, WAN Yifang, WANG Bin. Domain Specific High-quality Microblogging User Detection[J]. Journal of Chinese Information Processing, 2018, 32(7): 109-115
Authors:YE Yongjun  LI Peng  ZHOU Meilin  WAN Yifang  WANG Bin
Affiliation:1.Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093, China;
2.University of Chinese Academy of Sciences, Beijing 100049, China
Abstract: In today's microblogging systems such as Twitter or Weibo, finding high-quality users to follow is essential for acquiring information. This paper focuses on the task of high-quality user identification, i.e., given a domain query, return a list of users ranked by user quality. We divide the task into two sub-problems: user search and user ranking. For user search, we represent users by their tags and propose a similarity-based retrieval approach using the Chinese Wikipedia, which is essentially an extension of the existing ESA (explicit semantic analysis) method. For user ranking, we propose a graph-based ranking method called UBRank, which considers both the quantity and the quality of a user's published posts to measure user importance. Experiments indicate that using the Chinese Wikipedia gives better retrieval results than other resources such as HowNet, and validate the efficiency and superiority of the ranking method.
Keywords:user quality measure    user behavior model    graph based ranking  
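
The user search step described in the abstract can be illustrated with a minimal ESA-style sketch. This is not the authors' implementation: the three-entry `CONCEPTS` dictionary is a hypothetical toy stand-in for Wikipedia articles, the user names and tags are invented, and the TF-IDF weighting and cosine similarity are assumptions, since the abstract does not specify them. The idea shown is only that a query and a user's tags are each mapped to a vector over Wikipedia concepts and then compared in that concept space.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy stand-in for Wikipedia: concept title -> article text.
CONCEPTS = {
    "Machine learning": "machine learning algorithm model data training neural network",
    "Basketball": "basketball team player game court score league",
    "Stock market": "stock market share investor trading price exchange",
}

def build_inverted_index(concepts):
    """Map each term to {concept: TF-IDF weight} (assumed weighting scheme)."""
    n = len(concepts)
    tf = {c: Counter(text.split()) for c, text in concepts.items()}
    df = Counter(t for counts in tf.values() for t in counts)
    index = defaultdict(dict)
    for c, counts in tf.items():
        for t, f in counts.items():
            index[t][c] = f * math.log(1 + n / df[t])
    return index

def concept_vector(terms, index):
    """Sum the concept weights of all terms; unknown terms contribute nothing."""
    vec = defaultdict(float)
    for t in terms:
        for c, w in index.get(t, {}).items():
            vec[c] += w
    return dict(vec)

def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(c, 0.0) for c, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def search_users(query_terms, user_tags, index):
    """Return users with non-zero similarity to the query, best match first."""
    q = concept_vector(query_terms, index)
    scored = [(cosine(q, concept_vector(tags, index)), user)
              for user, tags in user_tags.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [user for score, user in scored if score > 0]

INDEX = build_inverted_index(CONCEPTS)
# Hypothetical users, each represented by self-assigned tags.
USERS = {
    "alice": ["neural", "network", "data"],
    "bob": ["basketball", "league"],
    "carol": ["investor", "trading"],
}

print(search_users(["machine", "learning"], USERS, INDEX))
```

In this toy data the query terms and the matching user's tags share no words; they match only because both map to the same concept. The non-zero entries of each concept vector also name the concepts responsible for a match, which is one way to read the abstract's remark about interpretability.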