Generation Method of Text Summarization Based on Advanced Sequence-to-Sequence Model
Citation:ZHOU Jian, TIAN Xuan, CUI Xiaohui. Generation Method of Text Summarization Based on Advanced Sequence-to-Sequence Model[J]. Computer Engineering and Applications, 2019, 55(1): 128-134.
Authors:ZHOU Jian  TIAN Xuan  CUI Xiaohui
Affiliation:School of Information Science and Technology, Beijing Forestry University, Beijing 100083, China
Funding:Fundamental Research Funds for the Central Universities
Abstract:Sequence-to-Sequence models built on recurrent neural networks (RNNs) and attention mechanisms play an important role in information extraction and automatic summary generation. However, they do not make full use of the linguistic features of the text, and the generated summaries suffer from out-of-vocabulary (OOV) words; both problems reduce the accuracy and readability of the summaries. To address this, the paper uses linguistic features of the text to enrich the input representation and introduces a copy mechanism to alleviate the OOV problem during summary generation. On this basis, it proposes the Copy-Generator model, a new method based on the Sequence-to-Sequence model, to improve the quality of generated summaries. Experiments on the Chinese summarization dataset LCSTS show that the proposed method effectively improves the accuracy of the generated summaries and can be applied to large-scale automatic text summarization tasks.

Keywords:text summarization  Sequence-to-Sequence model  linguistic feature  copy mechanism  Copy-Generator model
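
The copy mechanism mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact formulation, so the example below assumes a pointer-generator-style mixture: at each decoding step, a generation distribution over a fixed vocabulary is blended with a copy distribution given by the attention weights over the source tokens. All names here (`copy_generate_step`, `p_gen`, the toy vocabulary, tokens, and scores) are hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def copy_generate_step(vocab, gen_scores, source_tokens, attn_scores, p_gen):
    """One decoding step of an assumed copy/generate mixture.

    vocab          fixed output vocabulary (list of words)
    gen_scores     decoder scores over vocab (same length as vocab)
    source_tokens  tokens of the input text, which may be out of vocabulary
    attn_scores    attention scores over source_tokens
    p_gen          probability of generating from vocab rather than copying

    Returns a dict mapping each word in the extended vocabulary
    (vocab plus source tokens) to its final probability.
    """
    gen_probs = softmax(gen_scores)
    attn_probs = softmax(attn_scores)

    # Generation part: probability mass spread over the fixed vocabulary.
    final = {word: p_gen * p for word, p in zip(vocab, gen_probs)}

    # Copy part: each source token receives its attention weight,
    # so words missing from vocab can still be produced.
    for token, a in zip(source_tokens, attn_probs):
        final[token] = final.get(token, 0.0) + (1.0 - p_gen) * a
    return final


# Toy example: "Acme" appears in the source but not in the vocabulary.
vocab = ["<unk>", "company", "announced", "profits"]
gen_scores = [2.0, 1.0, 0.5, 0.5]      # a plain decoder would emit <unk>
source_tokens = ["Acme", "announced", "profits"]
attn_scores = [3.0, 0.5, 0.5]          # attention focuses on "Acme"

dist = copy_generate_step(vocab, gen_scores, source_tokens, attn_scores, p_gen=0.3)
best = max(dist, key=dist.get)
print(best)
```

In this toy setting the decoder alone would output `<unk>`, while the mixture assigns most of the probability to the source token that attention selects. In a trained model `p_gen` would typically be predicted from the decoder state rather than fixed by hand.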
This article is indexed in databases including VIP (维普) and Wanfang Data (万方数据).