Cross-Lingual Sentence Summarization System Based on Contrastive Attention Mechanism
Cite this article: YIN Mingming, SHI Xiaojing, YU Hongfei, DUAN Xiangyu. Cross-Lingual Sentence Summarization System Based on Contrastive Attention Mechanism[J]. Computer Engineering, 2020, 46(5): 86-93
Authors: YIN Mingming  SHI Xiaojing  YU Hongfei  DUAN Xiangyu
Affiliation: Natural Language Processing Laboratory, Soochow University, Suzhou, Jiangsu 215006, China (all four authors)
Funding: National Natural Science Foundation of China; National Key Research and Development Program of China
Abstract: Current research on sentence summarization focuses mainly on the monolingual setting, in which the source sentence and the target summary phrase are in the same language. This restriction severely limits rapid access to information in texts written in other languages. To address the problem, this paper proposes a cross-lingual sentence summarization system. Borrowing the idea of back-translation, the system uses a neural machine translation system to translate the source side of a monolingual sentence summarization parallel corpus into another language, then pairs the translations with the summary phrases on the target side of that corpus to form a cross-lingual pseudo-parallel corpus. On this basis, a contrastive attention mechanism is used to capture the irrelevant information between the source and target sequences, which addresses the length mismatch between source and target sentences in the traditional attention mechanism. Experimental results show that, compared with pipeline-based monolingual sentence summarization systems, the proposed cross-lingual system generates summary phrases that are more fluent and closer to natural human expression, and approaches the quality of monolingual sentence summarization.

Keywords:cross-lingual sentence summarization  parallel corpus  pseudo corpus  contrastive attention mechanism  pipeline method
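
The pseudo-parallel corpus construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function `build_pseudo_parallel_corpus`, the `translate` callable, the toy lexicon, the English-to-Chinese direction, and the empty-translation filter are all assumptions made for illustration. In practice `translate` would wrap a trained neural machine translation system.

```python
from typing import Callable, Iterable, List, Tuple

# (source sentence, summary phrase), both in the same language
MonoPair = Tuple[str, str]
# (machine-translated source sentence, original summary phrase)
CrossPair = Tuple[str, str]


def build_pseudo_parallel_corpus(
    mono_pairs: Iterable[MonoPair],
    translate: Callable[[str], str],
) -> List[CrossPair]:
    """Translate each source sentence and keep its summary unchanged."""
    corpus: List[CrossPair] = []
    for source, summary in mono_pairs:
        translated = translate(source).strip()
        if not translated:
            # Assumed filter (not from the abstract): drop empty translations.
            continue
        corpus.append((translated, summary))
    return corpus


def toy_translate(sentence: str) -> str:
    """Hypothetical stand-in for an NMT system: word-by-word lookup."""
    lexicon = {
        "police": "警方",
        "arrested": "逮捕",
        "five": "五名",
        "protesters": "抗议者",
    }
    # Unknown words are dropped, so an all-unknown sentence yields "".
    return "".join(lexicon.get(tok, "") for tok in sentence.lower().split())


mono = [
    ("Police arrested five protesters", "police arrest protesters"),
    ("zzz", "unknown"),  # untranslatable by the toy lexicon, so it is skipped
]
pseudo = build_pseudo_parallel_corpus(mono, toy_translate)
for src, tgt in pseudo:
    print(src, "=>", tgt)
```

Each resulting pair has a source sentence in one language and a summary phrase in another, so a summarization model trained on these pairs learns the cross-lingual mapping directly rather than chaining a translator and a monolingual summarizer as in the pipeline method.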
This article is indexed in databases including VIP (维普) and Wanfang Data (万方数据).