XLNet-Transformer optimization method for Chinese-Malay low-resource neural machine translation based on deep encoded attention
Cite this article: Zhan Siqi, Xu Zhizhan, Yang Wei, Xie Qianglai. XLNet-Transformer optimization method for Chinese-Malay low-resource neural machine translation based on deep encoded attention[J]. Application Research of Computers, 2024, 41(3): 799-804+810.
Authors: Zhan Siqi  Xu Zhizhan  Yang Wei  Xie Qianglai
Affiliation: 1. School of Information Engineering, Jiangxi University of Technology; 2. Big Data Laboratory, Collaborative Innovation Center, Jiangxi University of Technology
Foundation item: Science and Technology Research Project of the Jiangxi Provincial Department of Education (GJJ2202613, GJJ212015)
Abstract: Neural machine translation (NMT) has achieved remarkable results in many application domains, and its superiority has been fully demonstrated on large-scale corpora. However, there remains considerable room for improvement when corpus resources are insufficient. The scarcity of Chinese-Malay parallel corpora directly degrades the quality of Chinese-Malay machine translation. To address the unsatisfactory performance of Chinese-Malay low-resource machine translation, this paper proposes a low-resource NMT method based on deep encoded attention and progressive unfreezing. First, the method reconstructs the encoder with the XLNet pre-trained model, replacing the output of the traditional encoding layers with an XLNet dynamic aggregation module, which effectively compensates for the bottleneck caused by the scarce Chinese-Malay corpus. Second, it improves the traditional encoder-decoder attention with a parallel cross-attention module in the decoder, strengthening the ability to capture latent relations between source and target words. Finally, it trains the proposed model with a progressive unfreezing strategy to release the model's full performance. Experimental results show that the method achieves a significant performance gain on a small-scale Chinese-Malay dataset, confirming its effectiveness. Compared with other low-resource NMT methods, it has a simpler structure, improves both the encoder and the decoder, and delivers a markedly larger improvement in translation quality, offering effective strategies and insights for low-resource machine translation.
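The three mechanisms named in the abstract are described only at a high level on this page; the sketches below give one plausible reading of each, not the paper's verified implementation. All class and parameter names, the checkpoint string, and the fusion choices are our own assumptions.

A minimal sketch of the XLNet dynamic aggregation module, assuming it behaves like ELMo-style layer aggregation (a learned softmax-weighted sum over the hidden states of every XLNet layer) and assuming the Hugging Face transformers library:

```python
# Hypothetical sketch: instead of emitting only the top XLNet layer, learn one
# softmax-normalized weight per layer and aggregate all hidden states.
import torch
import torch.nn as nn
from transformers import XLNetModel  # assumed dependency

class XLNetDynamicAggregationEncoder(nn.Module):
    def __init__(self, model_name="hfl/chinese-xlnet-base"):  # checkpoint assumed
        super().__init__()
        self.xlnet = XLNetModel.from_pretrained(model_name, output_hidden_states=True)
        n_states = self.xlnet.config.n_layer + 1          # +1 for the embedding output
        self.layer_logits = nn.Parameter(torch.zeros(n_states))

    def forward(self, input_ids, attention_mask=None):
        out = self.xlnet(input_ids=input_ids, attention_mask=attention_mask)
        stacked = torch.stack(out.hidden_states, dim=0)   # (L, B, T, D)
        weights = torch.softmax(self.layer_logits, dim=0) # (L,)
        # Dynamic aggregation: a convex combination of every layer's output.
        return torch.einsum("l,lbtd->btd", weights, stacked)
```

One common reading of "parallel cross-attention" is that the decoder self-attention and the encoder-decoder cross-attention both read the same layer input and their outputs are fused, rather than being stacked sequentially as in the vanilla Transformer; a sketch under that assumption:

```python
import torch.nn as nn

class ParallelCrossAttentionLayer(nn.Module):
    """Hypothetical decoder sub-layer; the paper's exact wiring may differ."""
    def __init__(self, d_model=768, n_heads=12, dropout=0.1):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, n_heads, dropout=dropout,
                                               batch_first=True)
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, dropout=dropout,
                                                batch_first=True)
        self.norm = nn.LayerNorm(d_model)
        self.dropout = nn.Dropout(dropout)

    def forward(self, tgt, memory, tgt_mask=None):
        # Both branches attend from the same normalized decoder state; the
        # caller supplies a causal tgt_mask for the self-attention branch.
        x = self.norm(tgt)
        self_out, _ = self.self_attn(x, x, x, attn_mask=tgt_mask)
        cross_out, _ = self.cross_attn(x, memory, memory)
        # Fuse by summation; a learned gate or concat-plus-projection would be
        # equally consistent with the abstract's wording.
        return tgt + self.dropout(self_out + cross_out)
```

Progressive unfreezing, in the spirit of ULMFiT: keep the pre-trained encoder frozen at first, then make it trainable group by group (top layers first) as training proceeds. The per-epoch schedule here is illustrative only:

```python
def progressive_unfreeze(groups, epoch):
    """groups: lists of parameters ordered from the top layer group down.
    Hypothetical schedule: one more group becomes trainable each epoch."""
    for i, params in enumerate(groups):
        for p in params:
            p.requires_grad = epoch >= i

# Usage inside a standard training loop (optimizer/dataloader assumed):
# for epoch in range(num_epochs):
#     progressive_unfreeze(encoder_groups, epoch)
#     train_one_epoch(model, dataloader, optimizer)
```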

Keywords: neural network  Chinese-Malay machine translation  low resource  progressive unfreezing  pre-training
Received: 2023-08-03
Revised: 2024-02-08
