Research on Word Alignment in Mongolian-Chinese Statistical Machine Translation

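Cite this article:	SU Yila, ZHAO Yaping, NIU Xianghua. Research on Word Alignment in Mongolian-Chinese Statistical Machine Translation[J]. Journal of Chinese Information Processing, 2018, 32(6): 44-51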

Authors:	SU Yila, ZHAO Yaping, NIU Xianghua

Affiliation:	College of Information Engineering, Inner Mongolia University of Technology, Hohhot, Inner Mongolia 010080, China

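Funding:	National Natural Science Foundation of China (61363052, 61502255); Natural Science Foundation of Inner Mongolia Autonomous Region (2016MS0605); Foundation of the Ethnic Affairs Commission of Inner Mongolia Autonomous Region (MW-2017-MGYWXXH-03)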

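Abstract:	Mongolian is a lesser-used language, and research on Mongolian-to-Chinese machine translation has progressed slowly. Achieving high-quality Mongolian-Chinese machine translation is therefore of great significance to the development of information technology in China's ethnic minority regions. Word alignment plays a crucial role in machine translation quality. This paper proposes a word alignment method for Mongolian-Chinese machine translation that uses the stems and affixes obtained by segmenting Mongolian as the basic units. The method segments Mongolian sentences into stems and affixes using a stem-affix table and the reverse maximum matching algorithm. Experimental results show that stem-affix segmentation of Mongolian significantly improves the alignment quality of the log-linear word alignment model.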
Keywords:	word alignment; IBM models; stem-affix segmentation; log-linear model
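
For orientation, a log-linear word alignment model is commonly written in the following general form. This is the standard textbook formulation, not a formula taken from the paper. Here f is the segmented Mongolian sentence, e is the Chinese sentence, a is a candidate alignment, h_m are feature functions, and λ_m are their weights. Which features the authors use is not stated in this record; scores from the IBM models are one common choice and are only an assumption here.

```latex
\Pr(a \mid f, e) =
  \frac{\exp\left[\sum_{m=1}^{M} \lambda_m h_m(a, f, e)\right]}
       {\sum_{a'} \exp\left[\sum_{m=1}^{M} \lambda_m h_m(a', f, e)\right]},
\qquad
\hat{a} = \operatorname*{arg\,max}_{a} \sum_{m=1}^{M} \lambda_m h_m(a, f, e)
```

Because the denominator does not depend on a, the search for the best alignment only needs the weighted feature sum in the numerator.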
Research on Word Alignment in Mongolian-Chinese Statistical Machine Translation

SU Yila,ZHAO Yaping,NIU Xianghua. Research on Word Alignment in Mongolian-Chinese Statistical Machine Translation[J]. Journal of Chinese Information Processing, 2018, 32(6): 44-51

Authors:	SU Yila ZHAO Yaping NIU Xianghua

Affiliation:	College of Information Engineering, Inner Mongolia University of Technology, Hohhot, Inner Mongolia 010080, China

Abstract:	High-quality Mongolian-to-Chinese machine translation is of great significance to the development of information technology in China's ethnic minority regions. To address word alignment, a key issue in statistical machine translation (SMT), this paper proposes a Mongolian-Chinese word alignment method that uses Mongolian stems and affixes as the basic units. Mongolian sentences are segmented into stems and affixes using a stem-affix table and the reverse maximum matching algorithm. Experimental results indicate that this segmentation significantly improves the alignment quality of the log-linear word alignment model.

Keywords:	word alignment; IBM models; stem-affix segmentation; log-linear model
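
The reverse maximum matching step named in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the single combined lexicon of stems and affixes, the single-character fallback for unmatched text, and the Latin-transliterated toy entries are all assumptions made for the example. The actual stem-affix table and its script are not given in this record.

```python
def reverse_maximum_match(word, lexicon, max_len=None):
    """Segment `word` from right to left, taking the longest lexicon entry each time.

    `lexicon` is a set holding both stems and affixes (an assumption of this sketch).
    """
    if max_len is None:
        max_len = max((len(unit) for unit in lexicon), default=1)
    units = []
    end = len(word)
    while end > 0:
        # Try the longest candidate ending at `end` first, then shorter ones.
        for size in range(min(max_len, end), 0, -1):
            candidate = word[end - size:end]
            # A single character is accepted as a fallback so the scan always advances.
            if candidate in lexicon or size == 1:
                units.append(candidate)
                end -= size
                break
    units.reverse()  # units were collected right to left
    return units


def segment_sentence(sentence, lexicon):
    """Split on whitespace, then segment each word into stem and affix units."""
    return [unit
            for word in sentence.split()
            for unit in reverse_maximum_match(word, lexicon)]


# Hypothetical toy lexicon in Latin transliteration, for illustration only.
TOY_LEXICON = {"bagsh", "nom", "uud", "aas"}

print(segment_sentence("bagsh nomuudaas", TOY_LEXICON))
# prints ['bagsh', 'nom', 'uud', 'aas']
```

Scanning from the right suits a suffixing language, since affixes are peeled off the end of the word before the stem is reached. The resulting stem and affix units, rather than whole words, would then be the tokens passed to the word aligner.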
