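Phrase Table Filtration in Phrase-Based Statistical Machine Translation
Cite this article: Di Ping, Zhou Youliang, Gong Zhengxian, Zhou Guodong. Phrase Table Filtration in Phrase-Based Statistical Machine Translation[J]. Computer Applications and Software (计算机应用与软件), 2011, 28(5).
Authors: Di Ping, Zhou Youliang, Gong Zhengxian, Zhou Guodong
Affiliation: School of Computer Science and Technology, Soochow University, Suzhou 215006, Jiangsu, China
Funding: National Natural Science Foundation of China (60673041)
Abstract: Most phrase-based statistical machine translation systems treat any contiguous word sequence as a phrase, without considering whether that phrase is well formed. This paper uses two methods, C-value and phrase cohesion, to filter the phrase table effectively, shrinking the search space while also improving translation quality. Experiments show that the phrase table can be reduced to 78% of its original size while the BLEU score of the translation output rises by 0.02, and that when the phrase table is reduced to 47.5% of its original size the BLEU score still rises by 0.0158.

Keywords: statistical machine translation; phrase table filtering; C-value; phrase cohesion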

PHRASE TABLE FILTRATION IN PHRASE-BASED STATISTICAL MACHINE TRANSLATION
Di Ping, Zhou Youliang, Gong Zhengxian, Zhou Guodong. PHRASE TABLE FILTRATION IN PHRASE-BASED STATISTICAL MACHINE TRANSLATION[J]. Computer Applications and Software, 2011, 28(5).
Authors: Di Ping, Zhou Youliang, Gong Zhengxian, Zhou Guodong
Affiliation: School of Computer Science and Technology, Soochow University, Suzhou 215006, Jiangsu, China
Abstract: Most phrase-based statistical machine translation systems treat any contiguous sequence of words as a phrase, without considering whether it is a reasonable phrase. The paper adopts two methods, C-value and phrase cohesion value, to filter the phrase table effectively, reducing the search space while also improving translation performance. Experiments show that the phrase table can be reduced to 78% of its original size with a 0.02 rise in BLEU score, or to 47.5% of its original size with a 0.0158 rise in BLEU score.
Keywords: Statistical machine translation; Phrase table filtration; C-value; Phrase cohesion value
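
The abstract names C-value as one of the two filtering scores but does not give its formula. As a rough illustration only, the sketch below uses the standard C-value definition from the term-extraction literature: a phrase's frequency is discounted by the average frequency of the longer candidate phrases that contain it, then weighted by the base-2 logarithm of its length. How the authors apply the score to a phrase table is not stated on this page, so the function names `c_values` and `keep_phrases`, the threshold, and the choice to always keep one-word phrases (which score 0 under this formula) are assumptions made for the example. Phrase cohesion is not covered here.

```python
import math
from collections import defaultdict


def is_sublist(a, b):
    """Return True if token tuple a occurs contiguously inside token tuple b."""
    return any(b[i:i + len(a)] == a for i in range(len(b) - len(a) + 1))


def c_values(freq):
    """Compute standard C-value scores.

    freq maps each candidate phrase (a tuple of tokens) to its corpus frequency.
    """
    phrases = list(freq)
    # For each phrase, collect the longer candidate phrases that contain it.
    containers = defaultdict(list)
    for a in phrases:
        for b in phrases:
            if len(b) > len(a) and is_sublist(a, b):
                containers[a].append(b)

    scores = {}
    for a in phrases:
        f = float(freq[a])
        longer = containers[a]
        if longer:
            # Discount by the mean frequency of the containing phrases.
            f -= sum(freq[b] for b in longer) / len(longer)
        scores[a] = math.log2(len(a)) * f
    return scores


def keep_phrases(freq, threshold):
    """Hypothetical filter: keep one-word phrases and phrases scoring >= threshold."""
    scores = c_values(freq)
    return {p for p in freq if len(p) == 1 or scores[p] >= threshold}


# Toy counts, invented for illustration (not data from the paper).
freq = {
    ("machine",): 20,
    ("machine", "translation"): 10,
    ("statistical", "machine", "translation"): 4,
}
scores = c_values(freq)
kept = keep_phrases(freq, threshold=6.1)
```

In this toy example "machine translation" is penalized for appearing inside "statistical machine translation", so with a threshold of 6.1 the filter drops it while the three-word phrase survives. The pairwise containment check is quadratic in the number of phrases and would need an index to scale to a real phrase table.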
This article is indexed in CNKI, Wanfang Data, and other databases.