
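Research on a Statistical Translation Model Based on Non-contiguous Phrases
Cite this article: ZHANG Da-kun, ZHANG Wei, FENG Yuan-yong, SUN Le. Research on a Statistical Translation Model Based on Non-contiguous Phrases[J]. Journal of Chinese Information Processing, 2007, 21(1): 101-108.
Authors: ZHANG Da-kun  ZHANG Wei  FENG Yuan-yong  SUN Le
Affiliation: Chinese Information Processing Center, Institute of Software, Chinese Academy of Sciences, Beijing 100080
Abstract: The mainstream approach to statistical machine translation is still the phrase-based translation model. However, this model does not handle non-contiguous phrases. This paper proposes a statistical translation model based on non-contiguous phrases, which extends the basic unit of translation from contiguous phrases to non-contiguous phrases with gaps, so as to better handle context dependence in word translation. In addition, because the method extracts fewer phrases, decoding efficiency also improves. Experiments show that, with improved efficiency, the non-contiguous phrase model achieves translation results comparable to those of the hierarchical phrase model.

Keywords: artificial intelligence  machine translation  non-contiguous phrase  statistical machine translation  phrase-based model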
Article ID: 1003-0077(2007)01-00101-08
Received: 2006-07-28
Revised: 2006-10-20

Research on Non-contiguous Phrase-based Model for Statistical Machine Translation
ZHANG Da-kun, ZHANG Wei, FENG Yuan-yong, SUN Le. Research on Non-contiguous Phrase-based Model for Statistical Machine Translation[J]. Journal of Chinese Information Processing, 2007, 21(1): 101-108.
Authors:ZHANG Da-kun  ZHANG Wei  FENG Yuan-yong  SUN Le
Affiliation:Chinese Information Processing Center, Institute of Software, Chinese Academy of Sciences, Beijing 100080, China
Abstract: The phrase-based model remains the most widely used approach to statistical machine translation. However, it does not take non-contiguous phrases into account. This paper proposes a statistical machine translation model based on non-contiguous phrases. The units of translation are extended from contiguous phrases to phrases with gaps in order to better handle context dependence in word translation. Because fewer phrases are extracted, the efficiency of the decoder is also improved. Experiments show that, with better efficiency, the non-contiguous phrase-based model achieves translation results comparable to those of the hierarchical phrase-based model.
Keywords:artificial intelligence  machine translation  non-contiguous phrase  statistical machine translation  phrase-based model
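
To make the notion of a "phrase with gaps" concrete, the following is a minimal Python sketch of matching a non-contiguous source phrase against a tokenized sentence. It is an illustration only and is not taken from the paper: the function name `find_gapped`, the `max_gap` limit, the representation of a phrase as a list of contiguous segments, and the French "ne ... pas" example are all assumptions chosen for clarity.

```python
from typing import List, Sequence, Tuple


def find_gapped(
    segments: Sequence[Tuple[str, ...]],
    tokens: Sequence[str],
    max_gap: int = 3,
) -> List[Tuple[int, ...]]:
    """Find every occurrence of a gapped phrase in `tokens`.

    `segments` holds the contiguous pieces of the phrase, in order.
    Consecutive pieces must be separated by 1..max_gap intervening words
    (a hypothetical constraint, not one stated in the paper).
    Each match is returned as the tuple of start indices of its pieces.
    """
    matches: List[Tuple[int, ...]] = []

    def extend(seg_idx: int, prev_end: int, starts: List[int]) -> None:
        if seg_idx == len(segments):
            matches.append(tuple(starts))
            return
        seg = tuple(segments[seg_idx])
        last_start = len(tokens) - len(seg)
        if seg_idx == 0:
            # The first piece may start anywhere in the sentence.
            candidates = range(0, last_start + 1)
        else:
            # Later pieces start after a gap of 1..max_gap words.
            candidates = range(prev_end + 1, min(prev_end + max_gap, last_start) + 1)
        for start in candidates:
            if tuple(tokens[start:start + len(seg)]) == seg:
                extend(seg_idx + 1, start + len(seg), starts + [start])

    extend(0, 0, [])
    return matches


# Illustrative example: the French negation "ne ... pas" surrounds the verb.
sentence = "je ne veux pas partir".split()
print(find_gapped([("ne",), ("pas",)], sentence))
```

In this sketch the gap is required to be non-empty, so an adjacent "ne pas" is treated as a contiguous phrase and is not matched; whether the paper's model imposes such a constraint, or a maximum gap length, is not stated in the abstract.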
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.