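Research Progress in Microblog Retrieval (微博检索的研究进展)
Cite this article: WEI Bingjie, WANG Bin, ZHANG Shuai, LI Peng. Research Progress in Microblog Retrieval (微博检索的研究进展)[J]. Journal of Chinese Information Processing (中文信息学报), 2015, 29(2): 10-23.
Authors: WEI Bingjie (卫冰洁)  WANG Bin (王斌)  ZHANG Shuai (张帅)  LI Peng (李鹏)
Affiliations: 1. Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190;
2. Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093;
3. National Computer Network Emergency Response Technical Team/Coordination Center of China, Beijing 100029
Funding: Science and Technology Support Program (2012BAH46B02)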
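Abstract: With the rapid growth of microblogging, microblog retrieval has become a research hotspot in recent years. Using the TREC Microblog data, this paper first analyzes microblog documents and microblog queries and identifies two differences between microblog retrieval and traditional text retrieval. First, microblog documents have many features that web pages lack. Second, microblog queries are time-sensitive: ranking must consider the time factor in addition to the semantic similarity of the text, and the paper refers to such methods collectively as time-aware retrieval techniques. Because of these two differences, existing information retrieval techniques cannot meet the needs of microblog search. The paper then surveys recent work on both aspects. It first describes the various features of microblogs and the retrieval methods built on them. Then, following the traditional information retrieval process, it introduces ranking models that apply temporal information to text representation, document priors, and query expansion. Finally, it summarizes existing work and discusses directions for future research.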

Keywords: microblog retrieval  temporal information  microblog characteristics  text representation  document prior  query expansion

A Survey of Microblog Search
WEI Bingjie; WANG Bin; ZHANG Shuai; LI Peng. A Survey of Microblog Search[J]. Journal of Chinese Information Processing, 2015, 29(2): 10-23.
Authors: WEI Bingjie; WANG Bin; ZHANG Shuai; LI Peng
Affiliation:1. Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China;
2. Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093, China;
3. National Computer Network Emergency Response Technical Team/Coordination Center of China, Beijing 100029, China
Abstract: With the rapid development of microblogging, microblog retrieval has become an active research area in recent years. In this paper, we first analyze microblog documents and queries based on the TREC Microblog dataset. We find that microblog search differs from traditional text retrieval in two ways. First, microblog documents have characteristics that web pages do not. Second, microblog queries are time-sensitive, which means that temporal information should be used in ranking in addition to traditional text similarity. Because of these two differences, traditional text retrieval methods cannot be directly applied to microblog search. We then summarize related work on these two aspects. We describe microblog features and the retrieval methods based on them. Following the information retrieval process, we also introduce ranking models that use temporal information for text representation, as a document prior, or for query expansion. Finally, we conclude and discuss future work.
Keywords: microblog search  temporal information  microblog feature  text representation  document prior  query expansion
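
As an illustration of what using temporal information as a document prior can look like, the following is a minimal sketch and not a method taken from the surveyed paper. It assumes a Dirichlet-smoothed query-likelihood score combined with an exponential recency prior. The function names, the `rate` and `mu` parameters, and the `(post_id, terms, posted_at)` tuple layout are all hypothetical choices made for this example.

```python
import math
from collections import Counter


def log_query_likelihood(query, doc, coll_tf, coll_len, mu=100.0):
    """Dirichlet-smoothed log P(query | doc). `mu` is an assumed smoothing weight."""
    tf = Counter(doc)
    score = 0.0
    for term in query:
        p_coll = coll_tf[term] / coll_len if coll_len else 0.0
        if p_coll == 0.0:
            continue  # term never occurs in the collection; skip it
        score += math.log((tf[term] + mu * p_coll) / (len(doc) + mu))
    return score


def log_recency_prior(age, rate=0.1):
    """Log of an exponential prior P(doc) = rate * exp(-rate * age), rate > 0."""
    return math.log(rate) - rate * age


def rank(query, posts, query_time, rate=0.1, mu=100.0):
    """Rank posts by log P(query | doc) + log P(doc).

    `posts` is a list of (post_id, terms, posted_at) tuples (hypothetical layout);
    `posted_at` and `query_time` share one time unit, for example days.
    """
    coll_tf = Counter(t for _, terms, _ in posts for t in terms)
    coll_len = sum(coll_tf.values())
    scored = []
    for post_id, terms, posted_at in posts:
        age = max(query_time - posted_at, 0.0)  # posts after query time get age 0
        score = log_query_likelihood(query, terms, coll_tf, coll_len, mu)
        score += log_recency_prior(age, rate)
        scored.append((score, post_id))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [post_id for _, post_id in scored]


# Two posts with identical text: only the recency prior separates them.
posts = [
    ("old", ["earthquake", "japan"], 0.0),
    ("new", ["earthquake", "japan"], 9.0),
]
ranking = rank(["earthquake"], posts, query_time=10.0)
```

In this sketch the prior only breaks ties or near-ties in textual similarity. A larger `rate` would make recency dominate, which is a tuning decision that depends on how time-sensitive the queries are.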
This article is indexed in CNKI and other databases.