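Temporal Summarization Based on Biterm Dirichlet Process
Cite this article: XI Yao-Yi, LI Bi-Cheng, LI Tian-Cai, HUANG Shan-Qi. Temporal Summarization Based on Biterm Dirichlet Process[J]. Acta Automatica Sinica, 2015, 41(8): 1452-1460.
Authors: XI Yao-Yi  LI Bi-Cheng  LI Tian-Cai  HUANG Shan-Qi
Affiliation: 1. Institute of Information System Engineering, PLA Information Engineering University, Zhengzhou 450001;
Funding: Supported by the National Social Science Fund of China (14BXW028)
Abstract: Temporal summarization generates a summary in chronological order to outline how a topic evolves. Existing work either ignores the subtopic information latent in sentences or fails to discover it accurately. To address this problem, this paper builds a new topic model, the biterm Dirichlet process, and proposes a temporal summarization method based on it. First, model inference yields the subtopic distribution of each sentence; next, this distribution is used to compute each sentence's relevance and novelty; finally, sentences that are relevant to the topic and highly novel are extracted in chronological order to form the temporal summary. Experimental results show that the proposed method generates higher-quality temporal summaries than current representative methods.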

Keywords: temporal summarization    Dirichlet process    biterm    topic model
Received: 2015-01-04

Temporal Summarization Based on Biterm Dirichlet Process
XI Yao-Yi, LI Bi-Cheng, LI Tian-Cai, HUANG Shan-Qi. Temporal Summarization Based on Biterm Dirichlet Process[J]. Acta Automatica Sinica, 2015, 41(8): 1452-1460.
Authors: XI Yao-Yi  LI Bi-Cheng  LI Tian-Cai  HUANG Shan-Qi
Affiliation: 1. Institute of Information System Engineering, PLA Information Engineering University, Zhengzhou 450001; 2. Unit 65022, Shenyang 110162
Abstract: Temporal summarization extracts sentences in chronological order to give an overview of how a topic evolves. Existing studies either neglect latent subtopic information or fail to discover it accurately. In this paper, we develop a novel topic model, the biterm Dirichlet process, and generate temporal summaries based on it. First, we obtain the subtopic distribution of each sentence through posterior inference. Second, we calculate each sentence's relevance and novelty from its subtopic distribution. Finally, we extract the relevant and novel sentences in chronological order to form the temporal summary. Experiments show that our approach produces higher-quality temporal summaries than current representative methods.
Keywords: temporal summarization  Dirichlet process  biterm  topic model
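
The selection step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the relevance or novelty formulas, so everything below is an assumption made for illustration: relevance is taken as the cosine similarity between a sentence's subtopic distribution and a topic-level distribution, novelty as one minus the highest similarity to any sentence already selected, and the thresholds `min_relevance` and `min_novelty` are hypothetical parameters. The subtopic distributions are assumed to be already available; inference for the biterm Dirichlet process is not shown.

```python
import math


def cosine(p, q):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def temporal_summary(sentences, topic_dist, min_relevance=0.5, min_novelty=0.3):
    """Pick relevant, novel sentences in chronological order.

    sentences: list of (timestamp, text, subtopic_distribution) tuples.
    topic_dist: assumed topic-level subtopic distribution (hypothetical input).
    The scoring functions and thresholds are illustrative assumptions,
    not the formulas used in the paper.
    """
    selected = []
    for ts, text, dist in sorted(sentences, key=lambda s: s[0]):
        # Assumed relevance: similarity to the overall topic distribution.
        relevance = cosine(dist, topic_dist)
        # Assumed novelty: 1 minus the closest match among selected sentences.
        closest = max((cosine(dist, d) for _, _, d in selected), default=0.0)
        novelty = 1.0 - closest
        if relevance >= min_relevance and novelty >= min_novelty:
            selected.append((ts, text, dist))
    return [(ts, text) for ts, text, _ in selected]
```

Called with time-stamped sentences and their subtopic distributions, `temporal_summary` walks through them from earliest to latest and skips any sentence whose distribution nearly repeats one already in the summary.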