
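Chinese automatic text summarization based on weighted TextRank
Cite this article: Huang Bo, Liu Chuancai. Chinese automatic text summarization based on weighted TextRank[J]. Application Research of Computers, 2020, 37(2): 407-410.
Authors: Huang Bo, Liu Chuancai
Affiliations: School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094; School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094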
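Abstract: Existing Chinese automatic text summarization methods rely mainly on information contained in the text itself, and their shortcoming is that they cannot fully exploit information such as the semantic relatedness between words. To address this, an improved Chinese text summarization method is proposed. The method incorporates information from an external corpus into the TextRank algorithm in the form of word vectors: by combining TextRank with word2vec, each word in a sentence is mapped into a high-dimensional lexicon to form a sentence vector. Factors such as the similarity between sentences, keyword coverage, and the similarity between each sentence and the title are taken into account to compute the influence weights between sentences, and the top-ranked sentences are selected and reordered to form the summary of the text. Experimental results show that the method performs well on the dataset used in this paper and extracts Chinese summaries more effectively than the original method.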

Keywords: text summarization  TextRank  word vector  sentence similarity
Received: 2018/7/22
Revised: 2018/9/14

Chinese automatic text summarization based on weighted TextRank
Huang Bo and Liu Chuancai. Chinese automatic text summarization based on weighted TextRank[J]. Application Research of Computers, 2020, 37(2): 407-410.
Authors: Huang Bo and Liu Chuancai
Affiliation: School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094
Abstract: Existing Chinese automatic text summarization methods mainly use the text's own information, and their defect is that they cannot make full use of the semantic relationships between words. Therefore, this paper proposes an improved Chinese text summarization method. The method integrates information from external corpora into the TextRank algorithm in the form of word vectors. By combining TextRank with word2vec, it maps each word in a sentence to a high-dimensional lexicon to form a sentence vector. It considers the similarity between sentences, the coverage of keywords, and the similarity between each sentence and the title to calculate the influence weights among sentences, then selects the top-ranked sentences and reorders them to form the summary of the text. Experimental results show that the method achieves good results on the dataset used in this paper and extracts Chinese summaries more effectively than the original method.
Keywords:text summarization  TextRank  word vector  sentence similarity
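
The pipeline described in the abstract can be sketched roughly as follows. This is an illustrative reading of the abstract, not the authors' implementation: averaging word vectors into a sentence vector, the linear mix controlled by `alpha`, `beta` and `gamma`, and all function names are assumptions made here. Sentences are assumed to be already segmented into tokens, and `word_vectors` stands in for a lookup produced by a trained word2vec model.

```python
import numpy as np

def sentence_vector(tokens, word_vectors, dim):
    """Average the vectors of known tokens (pooling choice assumed here)."""
    vecs = [word_vectors[t] for t in tokens if t in word_vectors]
    return np.mean(vecs, axis=0) if vecs else np.zeros(dim)

def cosine(a, b):
    """Cosine similarity, defined as 0 when either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def weighted_textrank(sentences, title, keywords, word_vectors, dim,
                      alpha=0.6, beta=0.2, gamma=0.2,
                      damping=0.85, top_k=2, iters=100, tol=1e-6):
    n = len(sentences)
    vecs = [sentence_vector(s, word_vectors, dim) for s in sentences]
    title_vec = sentence_vector(title, word_vectors, dim)

    # Pairwise sentence similarity (negative cosines clipped to 0).
    sim = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j:
                sim[i, j] = max(cosine(vecs[i], vecs[j]), 0.0)

    # Per-sentence factors: keyword coverage and similarity to the title.
    kw = set(keywords)
    coverage = np.array([len(kw & set(s)) / len(kw) if kw else 0.0
                         for s in sentences])
    title_sim = np.array([max(cosine(v, title_vec), 0.0) for v in vecs])

    # Hypothetical edge weight i -> j: similarity plus a bonus for target j.
    bonus = beta * coverage + gamma * title_sim
    weights = alpha * sim + bonus[np.newaxis, :]
    np.fill_diagonal(weights, 0.0)

    # Row-normalise outgoing weights; rows with no weight vote uniformly.
    row_sums = weights.sum(axis=1, keepdims=True)
    safe = np.where(row_sums > 0, row_sums, 1.0)
    trans = np.where(row_sums > 0, weights / safe, 1.0 / n)

    # Weighted TextRank (PageRank-style) power iteration.
    scores = np.full(n, 1.0 / n)
    for _ in range(iters):
        new = (1 - damping) / n + damping * (trans.T @ scores)
        converged = np.abs(new - scores).sum() < tol
        scores = new
        if converged:
            break

    # Take the top_k sentences, then restore their original order.
    picked = sorted(int(i) for i in np.argsort(-scores)[:top_k])
    return picked, scores

# Toy data: 2-dimensional made-up vectors, purely for illustration.
toy_vectors = {
    "cat": np.array([1.0, 0.0]), "dog": np.array([0.9, 0.1]),
    "stock": np.array([0.0, 1.0]), "market": np.array([0.1, 0.9]),
}
toy_sentences = [["cat", "dog"], ["stock", "market"], ["cat", "stock"]]
picked, scores = weighted_textrank(
    toy_sentences, title=["cat"], keywords=["cat", "dog"],
    word_vectors=toy_vectors, dim=2)
print(picked, scores)
```

The returned indices are sorted so the selected sentences appear in document order, which matches the reordering step mentioned in the abstract. The mixing coefficients are placeholders and would need tuning on real data.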
This article is indexed in Wanfang Data and other databases.