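Multi-Document Automatic Summarization Based on a Term-Sentence-Document Three-Layer Graph Model
Cite this article:XIONG Jiao, WANG Mingwen, LI Maoxi, WAN Jianyi. Multi-Document Automatic Summarization Based on a Term-Sentence-Document Three-Layer Graph Model[J]. Journal of Chinese Information Processing, 2014, 28(6): 201-207.
Authors:XIONG Jiao  WANG Mingwen  LI Maoxi  WAN Jianyi
Affiliation:School of Computer Information Engineering, Jiangxi Normal University, Nanchang, Jiangxi 330022, China
Funding:National Natural Science Foundation of China (61272212, 61163006, 61203313)
Abstract:Applying graph models to multi-document automatic summarization is an active research topic. The usual approach builds an undirected graph whose vertices are sentences and whose edge weights are the similarities between sentences. This model does not fully account for the weights of the terms within a sentence or for the document to which each sentence belongs. To address this problem, the paper proposes a three-layer graph model based on terms, sentences, and documents, which uses both the term weight information within sentences and the information about each sentence's source document when computing sentence similarity. Experimental results on the DUC2003 and DUC2004 data sets show that the method based on the term-sentence-document three-layer graph model outperforms the LexRank model and the document-sensitive graph model.

Keywords:graph model  multi-document automatic summarization  sentence similarity  term-sentence-document graph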

Multi-Document Summarization Based on the Term-Sentence-Document Tri-layer Graph Model
XIONG Jiao, WANG Mingwen, LI Maoxi, WAN Jianyi. Multi-Document Summarization Based on the Term-Sentence-Document Tri-layer Graph Model[J]. Journal of Chinese Information Processing, 2014, 28(6): 201-207.
Authors:XIONG Jiao  WANG Mingwen  LI Maoxi  WAN Jianyi
Affiliation:School of Computer Information Engineering, Jiangxi Normal University, Nanchang, Jiangxi 330022, China
Abstract:Graph models have been widely applied to document summarization, using sentences as the graph nodes and the similarity between sentences as the edge weights. However, this model neglects knowledge about terms and documents. In this paper, we propose a tri-layer graph model based on terms, sentences, and documents that makes full use of this knowledge when computing the similarity of sentences. Experimental results on the DUC2003 and DUC2004 data sets show that the proposed model outperforms the state-of-the-art LexRank model and the Document Sensitive Ranking model.
Keywords:graph model  multi-document summarization  the similarity of sentences  term-sentence-document graph  
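
The baseline graph model that the abstract describes (sentences as nodes, sentence similarity as edge weights) can be illustrated with a minimal sketch. This is not the authors' tri-layer model, whose term and document layers are not specified in the abstract. It is only a LexRank-style ranking under stated assumptions: whitespace tokenization, raw term-frequency cosine similarity, and a PageRank-style power iteration. The names `cosine` and `rank_sentences` and the parameters `damping` and `iterations` are hypothetical.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_sentences(sentences, damping=0.85, iterations=50):
    """Score sentences with a PageRank-style walk over the similarity graph.

    Assumption: whitespace tokenization and raw term frequencies; a real
    system would likely use TF-IDF weights and a similarity threshold.
    """
    n = len(sentences)
    if n == 0:
        return []
    vectors = [Counter(s.lower().split()) for s in sentences]
    # Undirected weighted graph: weights[i][j] == weights[j][i], no self-loops.
    weights = [
        [cosine(vectors[i], vectors[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    degree = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            # Each neighbor j passes on its score in proportion to edge weight.
            incoming = sum(
                weights[j][i] / degree[j] * scores[j]
                for j in range(n)
                if degree[j] > 0
            )
            updated.append((1 - damping) / n + damping * incoming)
        scores = updated
    return scores
```

A summary would then be assembled from the highest-scoring sentences. Sentences that share no terms with any other sentence keep only the baseline score `(1 - damping) / n`, so the scores need not sum to 1 in this simplified version.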
This article is indexed in CNKI and other databases.