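Temporal Twitter Summarization Method Incorporating Context Analysis (融合语境分析的时序推特摘要方法)
Cite this article: YU Guang-Chuan, HE Rui-Fang, LIU Yang, DANG Jian-Wu. Temporal Twitter Summarization Method Incorporating Context Analysis [J]. Journal of Software (软件学报), 2017, 28(10): 2654-2673.
Authors: YU Guang-Chuan (于广川), HE Rui-Fang (贺瑞芳), LIU Yang (刘洋), DANG Jian-Wu (党建武)
Affiliations: (1) School of Computer Science and Technology, Tianjin University, Tianjin 300350; Tianjin Key Laboratory of Cognitive Computing and Application, Tianjin 300350. (2) School of Computer Science and Technology, Tianjin University, Tianjin 300350; Tianjin Key Laboratory of Cognitive Computing and Application, Tianjin 300350. (3) School of Information Science and Technology, Peking University, Beijing 100871. (4) School of Computer Science and Technology, Tianjin University, Tianjin 300350; Tianjin Key Laboratory of Cognitive Computing and Application, Tianjin 300350.
Funding: National Basic Research Program of China (973 Program) (2013CB329301); National Natural Science Foundation of China (61472277)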
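Abstract: Temporal Twitter summarization is an important branch of text summarization. It aims to distill, from the massive tweet stream related to a hot event, a concise set of tweets that evolves over time, so that users can acquire information quickly. Twitter is one of today's most popular social media platforms, and the explosive growth of its information volume, together with the unstructured nature of its text fragments, means that traditional summarization methods relying solely on textual content are no longer adequate. At the same time, the new characteristics of social media bring new opportunities for Twitter summarization. This paper treats the tweet stream as a signal, analyzes the complex noise within it, and proposes a new temporal Twitter summarization method that fuses the macro- and micro-level temporal signals of the tweet stream with the social context of users. First, wavelet analysis is used to model the global temporal information of the tweet stream and detect the time points of hot sub-events related to a given keyword. Next, the local temporal information of the tweet stream and user social information are incorporated into a random-walk graph model summarization framework, which generates a tweet summary for each hot sub-event. For evaluation, a real Twitter dataset was manually annotated with expert time points and expert summaries. The experimental results show the effectiveness of wavelet analysis and of the graph model fused with temporal-social context in temporal Twitter summarization.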

Keywords: temporal Twitter summarization; temporal characteristics; user social authority; wavelet denoising; context graph model
Received: 2016/4/23
Revised: 2016/8/29

Context Based Model for Temporal Twitter Summarization
YU Guang-Chuan, HE Rui-Fang, LIU Yang and DANG Jian-Wu. Context Based Model for Temporal Twitter Summarization[J]. Journal of Software, 2017, 28(10): 2654-2673.
Authors: YU Guang-Chuan, HE Rui-Fang, LIU Yang and DANG Jian-Wu
Affiliation: (1) School of Computer Science and Technology, Tianjin University, Tianjin 300350, China; Tianjin Key Laboratory of Cognitive Computing and Application, Tianjin 300350, China. (2) School of Computer Science and Technology, Tianjin University, Tianjin 300350, China; Tianjin Key Laboratory of Cognitive Computing and Application, Tianjin 300350, China. (3) School of Information Science and Technology, Peking University, Beijing 100871, China. (4) School of Computer Science and Technology, Tianjin University, Tianjin 300350, China; Tianjin Key Laboratory of Cognitive Computing and Application, Tianjin 300350, China.
Abstract: Temporal Twitter summarization is an important sub-task of text summarization. It aims to extract, from a huge Twitter stream, a concise set of tweets that evolves over time, helping users quickly understand a specific event. Twitter is one of the most popular social media platforms, and the explosive growth of its content makes it difficult for users to find reliable and useful information. Because tweets are short and highly unstructured, traditional document summarization methods handle Twitter data poorly. Meanwhile, Twitter provides rich temporal-social context beyond the text itself, which brings new opportunities. This paper treats the Twitter stream as a signal, analyzes the complex noise hidden in it, and proposes a novel temporal Twitter summarization method that models macro- and micro-level temporal context together with social context. First, the time points of hot sub-events are detected by modeling the global temporal context with wavelet analysis. Second, a novel random walk model is built on a graph-based unsupervised Twitter summarization framework, integrating local temporal context and social user authority to generate a summary for each sub-event time point. To evaluate the proposed framework, a real-world Twitter dataset was manually labeled with expert time points and expert summaries. Experimental results show that both wavelet analysis in hot sub-event time point detection and temporal-social context in Twitter summarization are effective.
Keywords:temporal Twitter summarization  temporal context  social user authority  wavelet denoising  context based graph model
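
The two stages named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the wavelet family, the thresholding rule, or the form of the random walk, so the one-level Haar transform, the soft threshold, the local-maximum peak rule, and the restart-biased random walk below are all assumptions chosen for brevity. Every name (`denoise`, `peak_indices`, `rank_tweets`, `prior`) is hypothetical, and `prior` merely stands in for whatever temporal and user-authority score the paper actually uses.

```python
import math

def haar_forward(signal):
    """One level of the orthonormal Haar transform (even-length input)."""
    s = math.sqrt(2.0)
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    approx = [(a + b) / s for a, b in pairs]
    detail = [(a - b) / s for a, b in pairs]
    return approx, detail

def haar_inverse(approx, detail):
    """Invert haar_forward."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out

def denoise(counts, threshold):
    """Soft-threshold the Haar detail coefficients of a tweet-count series."""
    approx, detail = haar_forward(counts)
    shrunk = [math.copysign(max(abs(d) - threshold, 0.0), d) for d in detail]
    return haar_inverse(approx, shrunk)

def peak_indices(smooth, min_height):
    """Candidate sub-event time points: local maxima at or above min_height."""
    return [i for i in range(1, len(smooth) - 1)
            if smooth[i] >= min_height
            and smooth[i - 1] < smooth[i] >= smooth[i + 1]]

def rank_tweets(similarity, prior, damping=0.85, iterations=50):
    """Random walk with restart over a tweet similarity graph.

    similarity[i][j] >= 0 is an edge weight between tweets i and j.
    prior[i] >= 0 is an assumed context score (for example recency
    combined with author authority) that biases the restart step.
    """
    n = len(similarity)
    total = sum(prior)
    restart = [p / total for p in prior]
    out_weight = [sum(row) for row in similarity]
    scores = restart[:]
    for _ in range(iterations):
        nxt = [(1.0 - damping) * r for r in restart]
        for i in range(n):
            for j in range(n):
                if out_weight[i] == 0:
                    # Isolated tweet: send its mass to the restart distribution.
                    step = restart[j]
                else:
                    step = similarity[i][j] / out_weight[i]
                nxt[j] += damping * scores[i] * step
        scores = nxt
    return scores
```

In this sketch, `peak_indices(denoise(counts, threshold), min_height)` would give candidate sub-event time points for a series of per-interval tweet counts, and the highest-scoring tweets from `rank_tweets` within each detected interval would form that sub-event's summary. A production version would more plausibly use a multi-level wavelet decomposition and a data-driven threshold.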