Mongolian-Chinese Machine Translation Research Based on Part of Speech Tagging with Gated Unit Neural Network
Citation: LIU Wanwan, SU Yila, Wunier, Renqingdaoerji. Mongolian-Chinese Machine Translation Research Based on Part of Speech Tagging with Gated Unit Neural Network[J]. Journal of Chinese Information Processing, 2018, 32(8): 68-74
Authors: LIU Wanwan, SU Yila, Wunier, Renqingdaoerji
Affiliation: College of Information Engineering, Inner Mongolia University of Technology, Hohhot, Inner Mongolia 010080, China
Funding: National Natural Science Foundation of China (61363052, 61502255); Natural Science Foundation of Inner Mongolia Autonomous Region (2016MS0605); Ethnic Affairs Commission of Inner Mongolia Autonomous Region (MW-2017-MGYWXXH-03)
Abstract: Statistical machine translation can predict target words by statistical methods, but it does not fully capture the semantic relations of the source text, so the quality of its translations is limited. To address this problem, a Mongolian-Chinese neural machine translation system is modeled with a recurrent neural network built from gated units, and a global attention mechanism is introduced to obtain alignment information between words of the two languages. During dictionary construction, the bilingual words are annotated with part-of-speech tags to strengthen their semantic information, which alleviates mistranslation caused by under-training. Experimental results show that the method improves the BLEU score over both an RNN baseline system and a traditional statistical machine translation method.
Keywords: machine translation; gated unit neural network; attention mechanism; alignment
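
The abstract says that words are annotated with part-of-speech tags while the dictionary is built, but it does not say how the tag is attached to a word. The following is a minimal sketch under one assumption: each token is joined with its tag as `word|TAG`, so the same surface form with different tags gets separate dictionary entries. The function names (`annotate`, `build_vocab`, `encode`), the separator, the special symbols, and the toy English sentences are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def annotate(tokens, tags, sep="|"):
    """Join each token with its POS tag, e.g. 'book' + 'NN' -> 'book|NN'.

    The 'word|TAG' format is an assumption made for illustration.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")
    return [f"{tok}{sep}{tag}" for tok, tag in zip(tokens, tags)]


def build_vocab(sentences, min_count=1,
                specials=("<pad>", "<unk>", "<s>", "</s>")):
    """Map each annotated token to an integer id.

    Special symbols come first; remaining tokens are ordered by
    descending frequency, with ties broken alphabetically.
    """
    counts = Counter(tok for sent in sentences for tok in sent)
    vocab = {tok: i for i, tok in enumerate(specials)}
    for tok, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if n >= min_count and tok not in vocab:
            vocab[tok] = len(vocab)
    return vocab


def encode(sentence, vocab):
    """Convert annotated tokens to ids, falling back to <unk>."""
    unk = vocab["<unk>"]
    return [vocab.get(tok, unk) for tok in sentence]


# Toy, hand-tagged sentences (hypothetical; not from the paper's corpus).
tagged = [
    annotate(["I", "book", "a", "flight"], ["PRP", "VBP", "DT", "NN"]),
    annotate(["I", "read", "a", "book"], ["PRP", "VBP", "DT", "NN"]),
]
vocab = build_vocab(tagged)
ids = encode(tagged[0], vocab)
```

In this sketch the verb `book|VBP` and the noun `book|NN` receive different ids, so a downstream encoder would learn separate embeddings for the two uses. That is one plausible reading of how tagging could "strengthen semantic information"; the paper's actual scheme may differ.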