A time-aware query-focused summarization of an evolving microblogging stream via sentence extraction
Affiliation: 1. Huazhong University of Science and Technology, Wuhan, China; 2. Chongqing University of Posts and Telecommunications, Chongqing, China; 3. Huawei Technologies Co., Ltd, China
Abstract: As the number of social media users grows, microblogs are generated and shared at record levels. The high velocity and large volume of short texts introduce redundancy and noise, making it difficult for users and analysts to extract the information they are interested in. In this paper, we study query-focused summarization as a solution to this problem and propose a novel summarization framework that generates personalized online summaries and historical summaries of arbitrary time durations. Our framework can handle dynamic, perpetual, and large-scale microblogging streams. Specifically, we propose an online microblogging stream clustering algorithm that clusters microblogs and maintains distilled statistics called Microblog Cluster Vectors (MCVs). We then develop a ranking method that extracts the sentences most representative of the query from the MCVs and generates a query-focused summary for an arbitrary time duration. Experiments on large-scale real microblog data demonstrate the efficiency and effectiveness of our approach.
Keywords: Microblog  Query-focused summarization  Computational linguistics  Sentence extraction  Personalized PageRank
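The pipeline outlined in the abstract (online clustering into compact per-cluster statistics, then query-biased ranking of sentences within a time window) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' method: the abstract does not specify the fields of an MCV, the similarity measure, the clustering threshold, or the exact ranking formula. Here a hypothetical `ClusterVector` holds additive term counts and timestamps, clustering is a single-pass nearest-cluster assignment with cosine similarity, and ranking is a plain personalized PageRank whose restart distribution is biased toward the query.

```python
import math
from collections import Counter

def tokenize(text):
    return text.lower().split()

def cosine(a, b):
    dot = sum(v * b.get(k, 0) for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class ClusterVector:
    """Hypothetical stand-in for an MCV: additive statistics only."""
    def __init__(self):
        self.term_sum = Counter()   # summed term frequencies
        self.count = 0              # number of microblogs absorbed
        self.first_ts = None
        self.last_ts = None
        self.sentences = []         # (timestamp, text) kept for extraction

    def add(self, ts, text):
        self.term_sum.update(tokenize(text))
        self.count += 1
        self.first_ts = ts if self.first_ts is None else min(self.first_ts, ts)
        self.last_ts = ts if self.last_ts is None else max(self.last_ts, ts)
        self.sentences.append((ts, text))

def cluster_stream(stream, threshold=0.3):
    """Single pass: join the most similar cluster or open a new one."""
    clusters = []
    for ts, text in stream:
        vec = Counter(tokenize(text))
        best, best_sim = None, 0.0
        for c in clusters:
            sim = cosine(vec, c.term_sum)
            if sim > best_sim:
                best, best_sim = c, sim
        if best is None or best_sim < threshold:  # threshold is an assumed value
            best = ClusterVector()
            clusters.append(best)
        best.add(ts, text)
    return clusters

def summarize(clusters, query, start, end, k=2, damping=0.85, iters=50):
    """Rank sentences in [start, end] with query-biased personalized PageRank."""
    sents = [t for c in clusters for ts, t in c.sentences if start <= ts <= end]
    if not sents:
        return []
    n = len(sents)
    vecs = [Counter(tokenize(s)) for s in sents]
    qvec = Counter(tokenize(query))
    rel = [cosine(v, qvec) for v in vecs]
    total = sum(rel)
    # Restart distribution: proportional to query relevance, uniform as fallback.
    restart = [r / total for r in rel] if total else [1.0 / n] * n
    sim = [[cosine(vecs[i], vecs[j]) if i != j else 0.0 for j in range(n)]
           for i in range(n)]
    out = [sum(row) for row in sim]
    rank = [1.0 / n] * n
    for _ in range(iters):
        # Mass on sentences with no neighbours is sent back via the restart vector.
        dangling = sum(rank[i] for i in range(n) if not out[i])
        new = []
        for j in range(n):
            flow = sum(rank[i] * sim[i][j] / out[i] for i in range(n) if out[i])
            new.append((1 - damping) * restart[j]
                       + damping * (flow + dangling * restart[j]))
        rank = new
    order = sorted(range(n), key=lambda i: rank[i], reverse=True)
    return [sents[i] for i in order[:k]]
```

Calling `cluster_stream` on a list of `(timestamp, text)` pairs and then `summarize(clusters, query, start, end)` returns up to `k` sentences from the requested window. Because the cluster statistics are additive, a real system could merge or age clusters without revisiting raw posts; this sketch keeps the raw sentences only so that extraction has something to return.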
This article is indexed in ScienceDirect and other databases.