Reinforced automatic summarization model based on advantage actor-critic algorithm
Citation: DU Xixi, CHENG Hua, FANG Yiquan. Reinforced automatic summarization model based on advantage actor-critic algorithm[J]. Journal of Computer Applications, 2021, 41(3): 699-705.
Authors: DU Xixi  CHENG Hua  FANG Yiquan
Affiliation: School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
Funding: CERNET Next Generation Internet Technology Innovation Project
Abstract: In long-text automatic summarization, extractive models tend to produce redundant summaries, while abstractive models often lose key information, generate inaccurate summaries, and repeat content. To address these problems, a Reinforced Automatic Summarization model based on the Advantage Actor-Critic algorithm (A2C-RLAS) was proposed for long text. First, the key sentences of the original text were extracted by an extractor built on a hybrid neural network of Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN). Then, the key sentences were refined by a rewriter based on the copy mechanism and the attention mechanism. Finally, the Advantage Actor-Critic (A2C) algorithm of reinforcement learning was used to train the entire network, with the semantic similarity between the rewritten summary and the reference summary (the BERTScore value) used as the reward to guide the extraction process, thereby improving the quality of the sentences selected by the extractor. Experimental results on the CNN/Daily Mail dataset show that, compared with models such as the reinforcement-learning-based extractive summarization model Refresh, the recurrent-neural-network-based extractive sequence model SummaRuNNer, and the Distributional Semantics Reward (DSR) model, A2C-RLAS produces summaries that are more accurate and more fluent, with noticeably less redundancy, and improves both the ROUGE (Recall-Oriented Understudy for Gisting Evaluation) and BERTScore metrics. Compared with the Refresh and SummaRuNNer models, the ROUGE-L value of A2C-RLAS is increased by 6.3% and 10.2% respectively; compared with the DSR model, the F1 value of A2C-RLAS is increased by 30.5%.
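As a rough illustration of the training signal described in the abstract, the following is a minimal sketch of one advantage actor-critic update in which the BERTScore F1 between a rewritten summary and the reference summary serves as the reward. It is not the authors' implementation: the `extractor` (actor plus value head) and `rewriter` modules, their call signatures, and the single-step formulation are hypothetical placeholders, and PyTorch plus the bert-score package are assumed.

```python
# Minimal sketch (not the paper's code): one A2C update step where the reward is
# the BERTScore F1 between the rewritten summary and the reference summary.
# `extractor` (actor + critic) and `rewriter` are assumed, hypothetical modules.
import torch
from bert_score import score as bert_score  # pip install bert-score

def a2c_step(extractor, rewriter, optimizer, article_sents, reference_summary):
    # Actor: score each source sentence and sample which ones to extract;
    # the critic head predicts the expected reward for the current state.
    sent_logits, state_value = extractor(article_sents)
    dist = torch.distributions.Bernoulli(logits=sent_logits)
    picks = dist.sample()  # 1 = extract this sentence
    extracted = [s for s, p in zip(article_sents, picks.tolist()) if p > 0.5]

    # Rewriter condenses the extracted sentences (treated here as a fixed, pretrained module).
    with torch.no_grad():
        rewritten_summary = rewriter(extracted)

    # Reward: semantic similarity (BERTScore F1) against the reference summary.
    _, _, f1 = bert_score([rewritten_summary], [reference_summary], lang="en", verbose=False)
    reward = f1.item()

    # Advantage actor-critic losses: policy gradient weighted by the advantage,
    # plus a value-regression term for the critic.
    advantage = reward - state_value
    actor_loss = -(dist.log_prob(picks).sum() * advantage.detach())
    critic_loss = advantage.pow(2).mean()
    loss = actor_loss + critic_loss

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return reward
```

In the model described by the abstract, extraction is a sequential decision process, so the reward would be credited across a trajectory of sentence choices rather than a single sampled set; the sketch only shows how a BERTScore-based reward and a critic baseline combine in a policy-gradient update.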

Keywords: automatic summarization model  extractive summarization model  abstractive summarization model  encoder-decoder  reinforcement learning  Advantage Actor-Critic (A2C) algorithm
Received: 2020-06-17
Revised: 2020-10-08
