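Microblog Timeline Summarization Algorithm Based on Sliding Window
Cite as:Xu Wei, Zhao Bin, Ji Genlin. Microblog Timeline Summarization Algorithm Based on Sliding Window[J]. Journal of Data Acquisition & Processing (数据采集与处理), 2017, 32(3): 523-532
Authors:Xu Wei (徐伟)  Zhao Bin (赵斌)  Ji Genlin (吉根林)
Affiliation:School of Computer Science and Technology, Nanjing Normal University, Nanjing, 210023
Abstract:Timeline summarization is a technique for condensing text content and generating summaries along the time dimension. Traditional timeline summarization mainly studies long texts such as news, whereas this paper studies timeline summarization for short microblog texts. Because short microblog texts carry limited content features, a summary cannot be generated from text content alone. This paper therefore evaluates timeline summaries with three indicators (content coverage, temporal distribution, and propagation influence) and proposes a sliding-window-based microblog timeline summarization algorithm (Microblog timeline summarization based on sliding window, MTSW). The algorithm first uses term strength and entropy to determine representative terms; it then builds a composite indicator for evaluating timeline summaries from the three indicators above; finally, it uses a sliding window to traverse the sequence of microblog messages along the time axis and generate the microblog timeline summary. Experimental results on real microblog datasets show that the timeline summaries generated by MTSW effectively reflect how hot events develop and evolve.

Keywords:microblog summary  timeline summary  short text summary  event evolution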

Microblog Timeline Summarization Algorithm Based on Sliding Window
Xu Wei, Zhao Bin, Ji Genlin. Microblog Timeline Summarization Algorithm Based on Sliding Window[J]. Journal of Data Acquisition & Processing, 2017, 32(3): 523-532
Authors:Xu Wei  Zhao Bin  Ji Genlin
Affiliation:School of Computer Science and Technology, Nanjing Normal University, Nanjing, 210023, China
Abstract:Timeline summarization is the task of summarizing how the content of a topic develops over time. Existing algorithms mostly target long texts such as news and seldom address timeline summaries of short texts such as microblog posts. Here, we propose a microblog timeline summarization algorithm based on sliding window (MTSW), which simultaneously incorporates content coverage, temporal distribution and influence to evaluate candidate timeline summaries. In the algorithm, representative terms are selected according to term strength and entropy to represent microblog features. We then build a comprehensive indicator for evaluating a timeline summary from the above three indicators, and use a sliding window to generate the microblog timeline summary. Experiments on real-world event datasets verify the effectiveness of the proposed method.
Keywords:microblog summary   timeline summary   short text summary   event evolution
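
The abstract gives no formulas for MTSW, so the following Python sketch only illustrates the general idea of sliding a window over time-ordered posts and keeping the best-scoring post in each window. Everything in it is an assumption made for illustration: the `Post` fields, the `summarize` function, the use of repost counts as a stand-in for influence, the weight `alpha`, and the scoring formula are not taken from the paper. The window here counts messages rather than time spans, and the temporal-distribution indicator is approximated only by the fact that each window contributes at most one post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    timestamp: float   # position on the time axis (hypothetical unit)
    terms: frozenset   # representative terms found in the post
    reposts: int       # assumed stand-in for propagation influence


def summarize(posts, window=3, step=3, alpha=0.7):
    """Pick one post per window position by a weighted score.

    Illustrative only: score = alpha * coverage + (1 - alpha) * influence,
    where both parts are normalized within the current window.
    """
    posts = sorted(posts, key=lambda p: p.timestamp)
    summary = []
    for start in range(0, len(posts), step):
        chunk = posts[start:start + window]
        # All representative terms seen in this window.
        vocab = frozenset().union(*(p.terms for p in chunk))
        max_reposts = max(p.reposts for p in chunk)

        def score(p):
            coverage = len(p.terms) / len(vocab) if vocab else 0.0
            influence = p.reposts / max_reposts if max_reposts else 0.0
            return alpha * coverage + (1 - alpha) * influence

        best = max(chunk, key=score)
        # Overlapping windows (step < window) may select the same post twice.
        if best not in summary:
            summary.append(best)
    return summary
```

With `step` equal to `window` the windows do not overlap; a smaller `step` makes them overlap, which is why the sketch skips a post that has already been selected.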