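Research on Reordering Methods for Chinese-Mongolian Statistical Machine Translation with Limited Corpora
Cite this article: CHEN Lei, LI Miao, ZHANG Jian, ZENG Weihui. Research on Reordering Methods for Chinese-Mongolian Statistical Machine Translation with Limited Corpora[J]. Journal of Chinese Information Processing, 2013, 27(5): 198-205
Authors: CHEN Lei  LI Miao  ZHANG Jian  ZENG Weihui
Affiliation: Institute of Intelligent Machines, Chinese Academy of Sciences, Hefei, Anhui 230031
Funding: Chinese Academy of Sciences Informatization Special Project; National Natural Science Foundation of China
Abstract: Since the emergence of statistical machine translation, reordering has been a key problem in translation systems for language pairs with markedly different word orders, and reordering methods trained on large-scale corpora have been widely studied. Chinese-Mongolian bilingual corpus resources are currently very limited, so existing reordering methods that depend on large-scale corpora and linguistic knowledge struggle to achieve good results. This paper analyzes the existing related research and proposes a reordering method for Chinese-Mongolian statistical machine translation under limited-corpus conditions. The method uses linguistic knowledge to identify the phrase types that significantly affect the word order of the translation, studies reordering schemes for these phrase types, and integrates them into an existing reordering model to optimize reordering. Experiments show that the method yields a significant improvement under limited-corpus conditions.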

Keywords: statistical machine translation; reordering; verb phrase; limited corpus

Reordering for Chinese-Mongolian SMT Based on Small Parallel Corpus
CHEN Lei, LI Miao, ZHANG Jian, ZENG Weihui. Reordering for Chinese-Mongolian SMT Based on Small Parallel Corpus[J]. Journal of Chinese Information Processing, 2013, 27(5): 198-205
Authors:CHEN Lei    LI Miao    ZHANG Jian    ZENG Weihui
Affiliation:Institute of Intelligent Machines, Chinese Academy of Sciences, Hefei, Anhui 230031, China
Abstract: Reordering models play a significant role in statistical machine translation (SMT) by reducing word-order differences between the two languages of a pair. Most reordering approaches require a large parallel corpus. Resources for China's minority languages, however, are very scarce and unlikely to grow substantially in the short term, so current reordering approaches perform poorly in translation between Chinese and these languages. After analyzing related studies, this paper proposes a source-side reordering method based on a small parallel corpus. Drawing on linguistic knowledge, we analyze both the corpus and the translations to identify the verb phrases that noticeably affect the word order of the translations. We then study reordering rules for these verb phrases, including both manually written and automatically extracted rules. Experiments show that the method improves the performance of state-of-the-art phrase-based translation models.
Keywords: statistical machine translation; reordering; verb phrase; small parallel corpus
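
To make the idea of source-side reordering of verb phrases concrete, here is a minimal Python sketch. It is not the paper's rule set: the abstract does not list the actual rules, so the single verb-object swap below is an illustrative assumption, motivated only by the fact that Chinese is typically verb-medial while Mongolian is verb-final. The names `Chunk`, `reorder_verb_object`, and `flatten`, and the chunk labels `"V"` and `"NP"`, are all hypothetical.

```python
from typing import List, Tuple

# A chunk is (label, tokens). The labels "V" and "NP" are hypothetical
# placeholders for whatever a real chunker or parser would produce.
Chunk = Tuple[str, List[str]]


def reorder_verb_object(chunks: List[Chunk]) -> List[Chunk]:
    """Apply one illustrative hand-written rule: V NP -> NP V.

    This moves a verb behind the noun phrase that immediately follows it,
    so the source sentence looks more like a verb-final target sentence.
    """
    out = list(chunks)
    i = 0
    while i < len(out) - 1:
        if out[i][0] == "V" and out[i + 1][0] == "NP":
            out[i], out[i + 1] = out[i + 1], out[i]
            i += 2  # skip the swapped pair so the verb is not moved again
        else:
            i += 1
    return out


def flatten(chunks: List[Chunk]) -> List[str]:
    """Concatenate the tokens of all chunks in order."""
    return [tok for _, toks in chunks for tok in toks]


# Usage: "我 读 书" (I read book) becomes "我 书 读" (I book read).
sentence = [("NP", ["我"]), ("V", ["读"]), ("NP", ["书"])]
print(" ".join(flatten(reorder_verb_object(sentence))))  # 我 书 读
```

In a pre-reordering setup of this kind, the rewritten source sentences would be used for both training and decoding, so that a standard phrase-based system sees source text whose order is already closer to the target. How the paper combines its rules with the existing reordering model is not specified in the abstract.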
This article is indexed in Wanfang Data and other databases.