
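Chinese-Vietnamese Convolutional Neural Machine Translation with Incorporating Syntactic Parsing Tree
Cite this article: WANG Zhen-Han, HE Jian-Ya-Lin, YU Zheng-Tao, WEN Yong-Hua, GUO Jun-Jun, GAO Sheng-Xiang. Chinese-Vietnamese Convolutional Neural Machine Translation with Incorporating Syntactic Parsing Tree[J]. Journal of Software, 2020, 31(12): 3797-3807.
Authors: WANG Zhen-Han  HE Jian-Ya-Lin  YU Zheng-Tao  WEN Yong-Hua  GUO Jun-Jun  GAO Sheng-Xiang
Affiliations: School of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, Yunnan, China; Yunnan Key Laboratory of Artificial Intelligence (Kunming University of Science and Technology), Kunming 650500, Yunnan, China
Funding: National Natural Science Foundation of China (61732005, 61672271, 61761026, 61866020); Natural Science Foundation of Yunnan Province (2018FB04); Yunnan Provincial Talent Training Program (KKSY201703005, KKSY201703015)
Abstract: Neural machine translation is currently the most widely used machine translation method and performs well on language pairs with abundant corpus resources. It performs poorly, however, on pairs such as Chinese-Vietnamese, for which bilingual data is scarce. Considering the differences in grammatical structure between Chinese and Vietnamese, this paper proposes a Chinese-Vietnamese neural machine translation method that incorporates the syntactic parse tree of the source language. A depth-first traversal yields a vectorized representation of the source-language parse tree, and the resulting syntax vectors are added to the source-language word embeddings to form the input on which the translation model is trained. In experiments on the Chinese-Vietnamese language pair, the method improves on the baseline system by 0.6 BLEU points. The results indicate that incorporating the syntactic parse tree can effectively improve machine translation performance when resources are scarce.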

Keywords: neural machine translation  low-resource  syntactic parse tree
Received: 2019-04-24
Revised: 2019-07-20

Chinese-Vietnamese Convolutional Neural Machine Translation with Incorporating Syntactic Parsing Tree
WANG Zhen-Han, HE Jian-Ya-Lin, YU Zheng-Tao, WEN Yong-Hua, GUO Jun-Jun, GAO Sheng-Xiang. Chinese-Vietnamese Convolutional Neural Machine Translation with Incorporating Syntactic Parsing Tree[J]. Journal of Software, 2020, 31(12): 3797-3807.
Authors: WANG Zhen-Han  HE Jian-Ya-Lin  YU Zheng-Tao  WEN Yong-Hua  GUO Jun-Jun  GAO Sheng-Xiang
Affiliation: School of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China; Yunnan Key Laboratory of Artificial Intelligence (Kunming University of Science and Technology), Kunming 650500, China
Abstract: Neural machine translation is the most widely used machine translation method at present and performs well on language pairs with rich corpus resources. However, it does not work well on language pairs that lack bilingual data, such as Chinese-Vietnamese. Taking the differences in grammatical structure between Chinese and Vietnamese into consideration, this study proposes a neural machine translation method that incorporates the syntactic parse tree of the source language. In this method, a depth-first traversal is used to obtain a vectorized representation of the source-language parse tree, and the translation model is trained on inputs formed by adding the resulting syntax vectors to the source-language word embeddings. The method is evaluated on the Chinese-Vietnamese language pair and achieves an improvement of 0.6 BLEU points over the baseline system. The experiment shows that incorporating the syntactic parse tree can effectively improve the performance of a machine translation model in low-resource settings.
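The fusion step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The toy tree, the names `dfs_leaves`, `make_table`, and `fuse`, the random stand-in embedding tables, and the choice to average label embeddings along the root-to-word path are all assumptions made for illustration. The abstract states only that a depth-first traversal vectorizes the parse tree and that the syntax vectors are added to the word embeddings.

```python
import random

# Toy constituency tree as nested tuples: (label, child, ...); leaves are words.
TREE = ("S",
        ("NP", ("PN", "我")),
        ("VP", ("VV", "喜欢"), ("NP", ("NN", "翻译"))))

def dfs_leaves(node, path=()):
    """Depth-first, left-to-right traversal yielding (word, root-to-word label path)."""
    label, *children = node
    path = path + (label,)
    for child in children:
        if isinstance(child, str):       # leaf: a surface word
            yield child, path
        else:                            # internal node: recurse
            yield from dfs_leaves(child, path)

def make_table(keys, dim, seed):
    """Hypothetical stand-in for a learned embedding table (random values)."""
    rng = random.Random(seed)
    return {k: [rng.uniform(-1, 1) for _ in range(dim)] for k in keys}

def fuse(tree, dim=4):
    """Return the DFS (word, path) pairs and one fused input vector per word."""
    pairs = list(dfs_leaves(tree))
    words = sorted({w for w, _ in pairs})
    labels = sorted({l for _, p in pairs for l in p})
    word_emb = make_table(words, dim, seed=0)
    label_emb = make_table(labels, dim, seed=1)
    fused = []
    for word, path in pairs:
        # Assumed syntax vector: mean of the label embeddings on the path.
        syn = [sum(label_emb[l][i] for l in path) / len(path) for i in range(dim)]
        # Element-wise sum of word embedding and syntax vector forms the input.
        fused.append([w + s for w, s in zip(word_emb[word], syn)])
    return pairs, fused

pairs, fused = fuse(TREE)
for (word, path), vec in zip(pairs, fused):
    print(word, "/".join(path), [round(x, 3) for x in vec])
```

Because the two vectors are summed element-wise, the encoder input keeps the same dimensionality as the plain word embeddings, so a sketch like this would not require changing the input shape of the downstream model.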
Keywords:neural machine translation  low-resource  syntactic parse tree