Neural Machine Translation Based on Multi-task Learning of Discourse Structure

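Cite this article:	KANG Xiao-Mian, ZONG Cheng-Qing. Neural Machine Translation Based on Multi-task Learning of Discourse Structure[J]. Journal of Software (软件学报), 2022, 33(10): 3806-3818

Authors:	亢晓勉 (KANG Xiao-Mian), 宗成庆 (ZONG Cheng-Qing)

Affiliation:	National Laboratory of Pattern Recognition (Institute of Automation, Chinese Academy of Sciences), Beijing 100190, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, China

Funding:	National Key Research and Development Program of China (2017YFB1002100); National Natural Science Foundation of China (U1836221)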

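Abstract:	Document-level translation methods use cross-sentence context to improve the translation quality of documents. A document carries structured semantic information, which can be formally represented as dependency relations between elementary discourse units (EDUs). However, current neural machine translation (NMT) methods rarely exploit this structural information. To address this, a document-level translation model is proposed that explicitly models EDU segmentation, discourse dependency structure prediction, and discourse relation classification within the encoder-decoder framework of NMT, yielding EDU representations enhanced with structural information. These representations are fused with the encoder and decoder state vectors through gated weighting and hierarchical attention, respectively. In addition, to reduce the model's dependence on a discourse parser at test time, a multi-task learning strategy is used during training to guide joint optimization of the translation and discourse parsing tasks. Experimental results on public datasets show that the proposed method can effectively model and exploit the dependency structure between discourse units, thereby improving translation quality.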
Keywords:	neural machine translation; discourse structure; multi-task learning; discourse analysis
Received:	2020-10-10
Revised:	2020-12-02
Neural Machine Translation Based on Multi-task Learning of Discourse Structure

KANG Xiao-Mian,ZONG Cheng-Qing. Neural Machine Translation Based on Multi-task Learning of Discourse Structure[J]. Journal of Software, 2022, 33(10): 3806-3818

Authors:	KANG Xiao-Mian, ZONG Cheng-Qing

Affiliation:	National Laboratory of Pattern Recognition (Institute of Automation, Chinese Academy of Sciences), Beijing 100190, China;School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, China

Abstract:	Document-level translation methods improve translation quality by using cross-sentence contextual information. A document contains structured semantic information, which can be formally represented as dependency relations between elementary discourse units (EDUs). However, existing neural machine translation (NMT) methods seldom use this discourse structure. This study therefore proposes a document-level translation method that explicitly models EDU segmentation, discourse dependency structure prediction, and discourse relation classification within the encoder-decoder framework of NMT, so as to obtain EDU representations enhanced by structural information. These representations are integrated with the encoding and decoding state vectors by gated weighted fusion and hierarchical attention, respectively. In addition, to reduce the dependence on discourse parsers at inference time, a multi-task learning strategy is applied during training to guide the joint optimization of the translation and discourse analysis tasks. Experimental results on public datasets show that the proposed method can effectively model and use the dependency structure between discourse units to improve translation quality significantly.
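
The abstract names two mechanisms without giving formulas: gated weighted fusion of an EDU representation with a state vector, and a joint training objective over the translation and discourse analysis tasks. The following is a minimal sketch of both ideas in plain Python, not the authors' implementation. The scalar gate, the parameter names (`w_h`, `w_e`, `bias`), and the single shared loss `weight` are all assumptions introduced for illustration; the paper's actual formulation may differ.

```python
import math
from typing import List, Sequence


def sigmoid(x: float) -> float:
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(h: Sequence[float], e: Sequence[float],
                 w_h: Sequence[float], w_e: Sequence[float],
                 bias: float = 0.0) -> List[float]:
    """Fuse a state vector h with an EDU representation e.

    Assumed form (hypothetical, for illustration only):
        g   = sigmoid(w_h . h + w_e . e + bias)   # scalar gate
        out = g * h + (1 - g) * e
    """
    score = (sum(a * b for a, b in zip(w_h, h))
             + sum(a * b for a, b in zip(w_e, e))
             + bias)
    g = sigmoid(score)
    return [g * hi + (1.0 - g) * ei for hi, ei in zip(h, e)]


def joint_loss(translation_loss: float,
               parsing_losses: Sequence[float],
               weight: float = 0.5) -> float:
    """Multi-task objective: translation loss plus weighted auxiliary losses.

    parsing_losses holds one value per auxiliary task named in the abstract
    (EDU segmentation, dependency structure prediction, relation
    classification). Using one shared weight is an assumption.
    """
    return translation_loss + weight * sum(parsing_losses)


# Toy usage with made-up numbers.
h = [1.0, 0.0]   # stand-in for an encoder or decoder state
e = [0.0, 1.0]   # stand-in for a structure-enhanced EDU representation
fused = gated_fusion(h, e, w_h=[0.0, 0.0], w_e=[0.0, 0.0])  # zero weights give a gate of 0.5
total = joint_loss(2.0, [0.4, 0.3, 0.3], weight=0.5)
```

Because the auxiliary losses are used only during training in this sketch, the fused representation at inference time is computed from the model's own predictions rather than from an external discourse parser, which is the motivation the abstract gives for multi-task learning.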

Keywords:	neural machine translation; discourse structure; multi-task learning; discourse analysis
