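TextRank News Automatic Summarization Model Based on Multi-feature Fusion
Cite this article: XU Fei, PENG Jia-Jia, LIU Jun, YANG Bo. TextRank News Automatic Summarization Model Based on Multi-feature Fusion[J]. Computer Systems & Applications, 2023, 32(2): 242-249.
Authors: XU Fei  PENG Jia-Jia  LIU Jun  YANG Bo
Affiliations: School of Computer Science and Engineering, Xi'an Technological University, Xi'an 710021; Unit 63768, Xi'an 710021
Funding: Fund of the National and Local Joint Engineering Laboratory of New Network and Detection Control (GSYSJ2018006); Special Scientific Research Program of the Education Department of Shaanxi Province (18JK0399)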
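Abstract: With the growth of the Internet, quickly extracting core information from massive volumes of news and reducing the browsing burden has become an urgent problem for information departments. Existing TextRank algorithms and their improved variants do not consider text features comprehensively in news summary extraction. When selecting summary sentences, they consider only the redundancy of the summary and ignore its diversity and readability. To address these problems, this paper proposes MF-TextRank (multi-feature TextRank), an automatic text summarization method that fuses multiple features. More comprehensive text feature information is derived from the structure, sentences, and words of a news article and used to improve the weight transfer matrix of the TextRank algorithm, making the sentence weight calculation more accurate. The MMR algorithm is used to update sentence weights, beam search produces a set of candidate summaries, and, on the basis of the MMR scores, the candidate summary with the highest cohesion is selected as the final output. Experimental results show that MF-TextRank achieves higher ROUGE scores on the summary extraction task than existing improved TextRank algorithms, effectively improving the accuracy of summary extraction.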
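The idea of folding text features into the TextRank weight transfer matrix can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not give the feature definitions or the fusion formula, so the code below assumes a single hypothetical per-sentence `feature_scores` list, uses bag-of-words cosine similarity where the paper's keywords suggest Word2Vec, and simply scales each edge by the feature score of its target sentence before row normalization.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def feature_weighted_textrank(sentences, feature_scores, d=0.85,
                              iters=100, tol=1e-8):
    """Rank sentences with a feature-biased transition matrix.

    feature_scores is a hypothetical stand-in for the fused features
    (structure, sentence, word) mentioned in the abstract.
    """
    n = len(sentences)
    bags = [Counter(s.lower().split()) for s in sentences]

    # Edge i -> j: similarity scaled by the feature score of target j
    # (an assumed fusion rule, chosen only for illustration).
    w = [[0.0 if i == j else cosine(bags[i], bags[j]) * feature_scores[j]
          for j in range(n)] for i in range(n)]

    # Row-normalize so each row is a probability distribution;
    # a sentence with no outgoing weight jumps uniformly.
    for i in range(n):
        total = sum(w[i])
        w[i] = [x / total for x in w[i]] if total > 0 else [1.0 / n] * n

    # Power iteration with damping factor d, as in PageRank.
    scores = [1.0 / n] * n
    for _ in range(iters):
        new = [(1 - d) / n + d * sum(scores[i] * w[i][j] for i in range(n))
               for j in range(n)]
        converged = max(abs(a - b) for a, b in zip(new, scores)) < tol
        scores = new
        if converged:
            break
    return scores


if __name__ == "__main__":
    demo = ["a b", "b c", "c a"]
    print(feature_weighted_textrank(demo, [2.0, 1.0, 1.0]))
```

With uniform feature scores this reduces to ordinary similarity-based TextRank; raising one sentence's feature score shifts transition probability toward it, which is the effect the abstract attributes to the improved matrix.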

Keywords: TextRank  MMR  Word2Vec  news summarization  multi-feature fusion  automatic summarization
Received: 2022/6/14
Revised: 2022/7/12

Automatic News Summarization Model Based on Multi-feature TextRank
XU Fei, PENG Jia-Jia, LIU Jun, YANG Bo. Automatic News Summarization Model Based on Multi-feature TextRank[J]. Computer Systems & Applications, 2023, 32(2): 242-249.
Authors: XU Fei  PENG Jia-Jia  LIU Jun  YANG Bo
Abstract: With the development of the Internet, quickly obtaining core information from massive volumes of news and reducing the browsing burden has become an urgent problem for information departments. The existing TextRank algorithm and its improved variants fail to consider text features comprehensively when extracting news summaries. When selecting summary sentences, they consider only redundancy and ignore the diversity and readability of the summary. To solve these problems, this study proposes a multi-feature automatic text summarization method, MF-TextRank (multi-feature TextRank). More comprehensive text feature information is derived from the structure, sentences, and words of a news article and used to improve the weight transfer matrix of the TextRank algorithm, making the sentence weight calculation more accurate. The MMR algorithm is then used to update the sentence weights, and a set of candidate summaries is obtained by beam search. On the basis of the MMR scores, the candidate summary with the highest cohesion is selected as the final output. Experimental results show that MF-TextRank achieves higher ROUGE scores than existing improved TextRank algorithms on the summary extraction task and effectively improves the accuracy of summary extraction.
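The MMR (maximal marginal relevance) step can be sketched as a greedy selection that trades relevance against redundancy with the sentences already chosen. This is only an illustration under stated assumptions: the function name `mmr_select`, the trade-off parameter `lam`, and the precomputed `relevance` and `similarity` inputs are hypothetical, and the beam search and cohesion-based choice among candidate summaries described in the abstract are not reproduced here.

```python
def mmr_select(relevance, similarity, k, lam=0.7):
    """Greedily pick k sentence indices by maximal marginal relevance.

    relevance[i]     : importance score of sentence i (e.g. a TextRank score)
    similarity[i][j] : pairwise sentence similarity in [0, 1]
    lam              : 1.0 means pure relevance; lower values penalize
                       redundancy with already selected sentences more
    """
    selected = []
    candidates = set(range(len(relevance)))

    while candidates and len(selected) < k:
        def mmr(i):
            # Redundancy is the highest similarity to any chosen sentence.
            redundancy = max((similarity[i][j] for j in selected),
                             default=0.0)
            return lam * relevance[i] - (1 - lam) * redundancy

        # sorted() makes tie-breaking deterministic (lowest index wins).
        best = max(sorted(candidates), key=mmr)
        selected.append(best)
        candidates.remove(best)

    return selected


if __name__ == "__main__":
    rel = [0.9, 0.85, 0.3]
    sim = [[1.0, 0.95, 0.1],
           [0.95, 1.0, 0.1],
           [0.1, 0.1, 1.0]]
    print(mmr_select(rel, sim, k=2, lam=0.5))
```

In the demo, sentences 0 and 1 are near duplicates, so a balanced `lam` skips the second of the pair in favor of the less relevant but non-redundant sentence 2. A beam search variant would keep several partial selections at each step instead of only the single best one.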
Keywords:TextRank  MMR algorithm  Word2Vec  news summary  multi-feature fusion  automatic summary