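Automatic text summarization scheme based on deep learning
Cite this article: ZHANG Kejun, LI Weinan, QIAN Rong, SHI Taimeng, JIAO Meng. Automatic text summarization scheme based on deep learning[J]. Journal of Computer Applications, 2019, 39(2): 311-315. DOI: 10.11772/j.issn.1001-9081.2018081958
Authors: ZHANG Kejun, LI Weinan, QIAN Rong, SHI Taimeng, JIAO Meng
Affiliations: 1. Department of Computer Science and Technology, Beijing Electronic Science and Technology Institute, Beijing 100070, China; 2. School of Computer Science and Technology, Xidian University, Xi'an 710071, China
Funding: National Key Research and Development Program of China (2018YFB1004101).
Abstract: To address insufficient semantic understanding, disfluent summary sentences, and limited summary accuracy in abstractive automatic summarization within natural language processing (NLP), a new abstractive summarization scheme is proposed, comprising an improved word-vector generation technique and an abstractive summarization model. The improved technique starts from word vectors produced by the Skip-Gram method and, reflecting the characteristics of summaries, adds three word features: part of speech, term frequency, and inverse document frequency, which improves how well words are represented. The proposed Bi-MulRnn+ model builds on the sequence-to-sequence (seq2seq) framework and an autoencoder structure, and introduces an attention mechanism, gated recurrent units (GRU), a bidirectional recurrent neural network (BiRnn), a multi-layer recurrent neural network (MultiRnn), and beam search to improve summary accuracy and sentence fluency. Experiments on the Large-Scale Chinese Short Text Summarization (LCSTS) dataset show that the scheme handles abstractive summarization of short texts effectively and performs well under the Rouge evaluation metrics.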

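Keywords: natural language processing; abstractive automatic text summarization; sequence-to-sequence; autoencoder; word vector; recurrent neural network
Received: 2018-09-20
Revised: 2018-11-14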
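The word-feature idea described in the abstract (Skip-Gram vectors extended with part of speech, term frequency, and inverse document frequency) can be sketched roughly as follows. The abstract does not say how the three features are encoded or combined, so this sketch assumes simple concatenation: a one-hot part-of-speech vector plus two scalars appended to the base embedding. The tag set `POS_TAGS`, the helpers `idf_table` and `augment`, the smoothed IDF formula, and the toy data are all hypothetical and not taken from the paper.

```python
import math
from collections import Counter

# Hypothetical part-of-speech inventory; the paper's tag set is not given in the abstract.
POS_TAGS = ["n", "v", "adj", "other"]


def idf_table(documents):
    """Smoothed inverse document frequency for every word in a tokenized corpus."""
    n_docs = len(documents)
    # Count each word once per document it appears in.
    doc_freq = Counter(word for doc in documents for word in set(doc))
    return {w: math.log((1 + n_docs) / (1 + df)) + 1.0 for w, df in doc_freq.items()}


def augment(base_vector, word, pos, document, idf):
    """Concatenate a base embedding with one-hot POS, term frequency, and IDF.

    Concatenation is an assumption of this sketch, not the paper's stated method.
    """
    pos_one_hot = [1.0 if pos == tag else 0.0 for tag in POS_TAGS]
    tf = document.count(word) / len(document)
    return list(base_vector) + pos_one_hot + [tf, idf.get(word, 0.0)]


# Toy corpus of two tokenized documents.
docs = [["stock", "market", "falls"], ["market", "rises", "again", "market"]]
idf = idf_table(docs)

# Stand-in for a Skip-Gram vector; a real one would come from a trained model.
base = [0.1, -0.2, 0.3]
vec = augment(base, "market", "n", docs[1], idf)
print(len(vec), vec)
```

The augmented vector has the base dimensions plus `len(POS_TAGS) + 2` extra entries, so any downstream encoder would need its input size adjusted accordingly.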

Automatic text summarization scheme based on deep learning
ZHANG Kejun, LI Weinan, QIAN Rong, SHI Taimeng, JIAO Meng. Automatic text summarization scheme based on deep learning[J]. Journal of Computer Applications, 2019, 39(2): 311-315. DOI: 10.11772/j.issn.1001-9081.2018081958
Authors: ZHANG Kejun, LI Weinan, QIAN Rong, SHI Taimeng, JIAO Meng
Affiliation: 1. Department of Computer Science and Technology, Beijing Electronic Science and Technology Institute, Beijing 100070, China; 2. School of Computer Science and Technology, Xidian University, Xi'an, Shaanxi 710071, China
Abstract: Aiming at the problems of inadequate semantic understanding, disfluent summary sentences, and insufficient summary accuracy in abstractive automatic summarization within Natural Language Processing (NLP), a new abstractive summarization scheme was proposed, including an improved word vector generation technique and an abstractive automatic summarization model. The improved word vector generation technique was based on word vectors generated by the Skip-Gram method. In view of the characteristics of summaries, three word features (part of speech, term frequency, and inverse document frequency) were introduced, which improved the representation of words. The proposed Bi-MulRnn+ abstractive summarization model was based on the sequence-to-sequence (seq2seq) framework and an autoencoder structure. By introducing an attention mechanism, Gated Recurrent Units (GRU), a Bi-directional Recurrent Neural Network (BiRnn), a Multi-layer Recurrent Neural Network (MultiRnn), and beam search, the model improved summary accuracy and sentence fluency. Experimental results on the Large-Scale Chinese Short Text Summarization (LCSTS) dataset show that the proposed scheme can effectively handle abstractive summarization of short text and performs well under the Rouge evaluation metrics.
Keywords: Natural Language Processing (NLP); abstractive automatic text summarization; sequence-to-sequence (seq2seq); autoencoder; word vector; Recurrent Neural Network (RNN)
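The abstract reports results under the Rouge metrics without giving details. As a rough illustration of what such a score measures, the sketch below computes n-gram recall, precision, and F1 between a candidate summary and a single reference. It is a simplified stand-in, not the official Rouge toolkit or the paper's evaluation setup; the function names `ngrams` and `rouge_n` and the English example sentences are assumptions made here for illustration. For Chinese data such as LCSTS, the token lists would typically hold characters or segmented words.

```python
from collections import Counter


def ngrams(tokens, n):
    """Multiset of n-grams in a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Return (recall, precision, f1) of n-gram overlap against one reference."""
    cand, ref = ngrams(candidate, n), ngrams(reference, n)
    # Counter intersection keeps the minimum count of each shared n-gram,
    # so repeated n-grams are only credited as often as the reference has them.
    overlap = sum((cand & ref).values())
    ref_total = sum(ref.values())
    cand_total = sum(cand.values())
    recall = overlap / ref_total if ref_total else 0.0
    precision = overlap / cand_total if cand_total else 0.0
    denom = recall + precision
    f1 = 2 * recall * precision / denom if denom else 0.0
    return recall, precision, f1


reference = "the market rose sharply today".split()
candidate = "the market rose today".split()

r1 = rouge_n(candidate, reference, n=1)
r2 = rouge_n(candidate, reference, n=2)
print("ROUGE-1 (recall, precision, f1):", r1)
print("ROUGE-2 (recall, precision, f1):", r2)
```

In this toy pair the candidate omits one word, so unigram recall falls below 1 while unigram precision stays at 1; the bigram score drops further because the omission breaks two adjacent pairs.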
This article is indexed in databases including VIP and Wanfang Data.