Separate structure machine translation model based on fusion of CNN and Transformer
Citation: Ge Junwei, Tu Zhaohao, Fang Yiqiu. Separate structure machine translation model based on fusion of CNN and Transformer[J]. Application Research of Computers, 2022, 39(2): 432-435.
Authors: Ge Junwei  Tu Zhaohao  Fang Yiqiu
Affiliation: College of Software Engineering, Chongqing University of Posts & Telecommunications, Chongqing 400065, China
Funding: General Program of the National Natural Science Foundation of China (62072066).
Abstract: To address the low running efficiency, excessive number of parameters, and high computational complexity of Transformer-based machine translation models, this paper proposed a separate-structure machine translation model that fuses a CNN with the Transformer. First, to address the low running efficiency and excessive parameter count, the model separated the attention computation module from the normalization module, which ensured the reusability of the stacked multi-layer structure, improved running efficiency, and reduced the number of parameters. Second, the model fused a convolutional computation module with the original self-attention module: the self-attention module computed global contextual semantic relations, while the convolutional module computed local contextual semantic relations, reducing the model's complexity. Experimental comparisons with other machine translation models on the same dataset show that the proposed model has the fewest parameters and outperforms the other models.
Keywords: convolutional attention  module separation  machine translation
Received: 2021-07-12
Revised: 2022-01-13
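The following is a minimal PyTorch sketch of the fused encoder block described in the abstract, for illustration only. The names (FusedBlock, kernel_size) and the pre-norm placement are assumptions, not the authors' published implementation: normalization is kept as a separate, reusable module, self-attention captures global context, and a depthwise convolution captures local context.

import torch
import torch.nn as nn

class FusedBlock(nn.Module):
    # Hypothetical sketch: one encoder layer that fuses global
    # self-attention with a local depthwise convolution, keeping the
    # LayerNorm as a separate module so stacked layers can reuse the
    # same normalization structure.
    def __init__(self, d_model=512, n_heads=8, kernel_size=3):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)  # separated normalization module
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.conv = nn.Conv1d(d_model, d_model, kernel_size,
                              padding=kernel_size // 2, groups=d_model)

    def forward(self, x):
        # x: (batch, seq_len, d_model)
        h = self.norm(x)                    # normalize once, feed both branches
        g, _ = self.attn(h, h, h)           # global contextual semantics
        l = self.conv(h.transpose(1, 2)).transpose(1, 2)  # local contextual semantics
        return x + g + l                    # fuse the two branches with a residual

# Example: a batch of 2 sequences, 10 tokens each
x = torch.randn(2, 10, 512)
print(FusedBlock()(x).shape)  # torch.Size([2, 10, 512])

Sharing one normalized input between the two branches is one plausible reading of the "separated" design; the authors may instead normalize each branch independently.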
