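Multi-document summarization extraction based on multi-information sentence graph model
Cite this article: JIANG Ya-fang, YAN Xin, XU Guang-yi, ZHOU Feng, DENG Zhong-ying. Multi-document summarization extraction based on multi-information sentence graph model[J]. Computer Engineering & Science, 2020, 42(3): 535-542.
Authors: JIANG Ya-fang  YAN Xin  XU Guang-yi  ZHOU Feng  DENG Zhong-ying
Affiliation: (1. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, Yunnan 650500; 2. Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology, Kunming, Yunnan 650500; 3. Yunnan Nantian Electronic Information Industry Co., Ltd., Kunming, Yunnan 650041)
Abstract: Existing multi-document extraction methods make poor use of sentence topic information and semantic information. To address this problem, this paper proposes a multi-document summary extraction method based on a sentence graph model that fuses multiple kinds of information. First, a sentence graph model is built with sentences as nodes. Then, the sentence topic probability distributions obtained from a sentence-based Bayesian topic model are fused with the sentence semantic similarities obtained from a word vector model to give the final relevance between sentences; this combination of topic and semantic information serves as the edge weights of the sentence graph model. Finally, the multi-document summary is produced by a summarization method based on the minimum dominating set of the sentence graph. By fusing multiple kinds of information in the sentence graph model, the method combines the topic information, semantic information, and relational information between sentences. Experimental results show that the method effectively improves the overall performance of the extracted summaries.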

Keywords: multi-document summarization  sentence Bayesian topic model  word vector  sentence graph model  minimum dominating set
Received: 2019-07-01
Revised: 2019-09-11
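
The edge-weight fusion described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the two signals are computed or combined, so everything specific here is an assumption: topic similarity is taken as one minus the Jensen-Shannon divergence between two sentences' topic distributions, semantic similarity as the cosine of averaged word vectors, and the fusion as a linear mix with a hypothetical parameter `lam`. The function names are illustrative and are not taken from the paper.

```python
import math


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) of two discrete distributions; lies in [0, 1]."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with x_i == 0 contribute nothing; y_i > 0 whenever x_i > 0.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return (kl(p, m) + kl(q, m)) / 2


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def mean_vector(word_vectors):
    """Sentence vector as the element-wise mean of its word vectors (an assumption)."""
    n = len(word_vectors)
    return [sum(col) / n for col in zip(*word_vectors)]


def edge_weight(topic_p, topic_q, words_p, words_q, lam=0.5):
    """Fused relevance of two sentences: lam * topic + (1 - lam) * semantic (assumed form)."""
    topic_sim = 1.0 - js_divergence(topic_p, topic_q)
    semantic_sim = cosine(mean_vector(words_p), mean_vector(words_q))
    return lam * topic_sim + (1.0 - lam) * semantic_sim
```

With these assumed definitions, two sentences that share both a topic distribution and word vectors receive a weight of 1, while sentences with disjoint topics and orthogonal vectors receive 0.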

Multi-document summarization extraction based on multi-information sentence graph model
JIANG Ya-fang, YAN Xin, XU Guang-yi, ZHOU Feng, DENG Zhong-ying. Multi-document summarization extraction based on multi-information sentence graph model[J]. Computer Engineering & Science, 2020, 42(3): 535-542.
Authors: JIANG Ya-fang  YAN Xin  XU Guang-yi  ZHOU Feng  DENG Zhong-ying
Affiliation: (1. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500; 2. Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology, Kunming 650500; 3. Yunnan Nantian Electronic Information Industry Co., Ltd., Kunming 650041, China)
Abstract: Existing multi-document extraction methods cannot make good use of sentence topic information and semantic information. To address this problem, this paper proposes a multi-document summarization extraction method that integrates a multi-information sentence graph model. Firstly, a sentence graph model with sentences as nodes is constructed. Secondly, the sentence topic probability distribution obtained from a sentence-based Bayesian topic model and the sentence semantic similarity obtained from a word vector model are combined to give the final relevance of sentences, and this combined topic and semantic information is used as the edge weights of the sentence graph model. Finally, the multi-document summary is generated by a summarization method based on the minimum dominating set of the sentence graph. By integrating the multi-information sentence graph model, the method combines the topic information, semantic information, and relationship information between sentences. The experimental results show that the method can effectively improve the comprehensive performance of summarization extraction.
Keywords: multi-document summarization  sentence Bayesian topic model  word vector  sentence graph model  minimum dominating set
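
The final step named in the abstract, selecting summary sentences through a minimum dominating set of the sentence graph, can be sketched as follows. Finding an exact minimum dominating set is NP-hard, so this sketch uses the standard greedy approximation; whether the paper uses this heuristic is not stated in the abstract, and the `threshold` used to decide which sentence pairs are connected, as well as both function names, are hypothetical.

```python
def build_graph(weights, threshold):
    """Build adjacency sets from a dict mapping (i, j) sentence pairs to edge weights.

    Pairs whose weight is below the (hypothetical) threshold are left unconnected,
    but both sentences are still registered as nodes.
    """
    adj = {}
    for (i, j), w in weights.items():
        adj.setdefault(i, set())
        adj.setdefault(j, set())
        if w >= threshold:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def greedy_dominating_set(adj):
    """Greedy approximation of a minimum dominating set.

    Repeatedly picks the node that dominates the most not-yet-dominated nodes
    (itself plus its neighbors); ties go to the smallest node id.
    """
    undominated = set(adj)
    chosen = []
    while undominated:
        best = max(sorted(adj), key=lambda n: len(({n} | adj[n]) & undominated))
        chosen.append(best)
        undominated -= {best} | adj[best]
    return chosen


# Five sentences: 0, 1, 2 are closely related, as are 3 and 4.
weights = {(0, 1): 0.8, (0, 2): 0.7, (1, 2): 0.2, (3, 4): 0.9, (2, 3): 0.1}
summary_ids = greedy_dominating_set(build_graph(weights, threshold=0.5))
print(summary_ids)  # [0, 3]
```

Every sentence in the collection is either selected or adjacent to a selected sentence, which is the property that makes a dominating set a natural candidate for a summary that covers the documents with few sentences.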
This article is indexed in Wanfang Data and other databases.