
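Automatic construction of the template library in template-based machine translation systems
Cite this article: LIN Xian-ming, LI Tang-qiu, SHI Xiao-dong. Automatic construction of the template library in template-based machine translation systems[J]. Journal of Computer Applications, 2004, 24(9): 127-128, 135.
Authors: LIN Xian-ming  LI Tang-qiu  SHI Xiao-dong
Affiliation: Department of Computer Science, Xiamen University, Xiamen, Fujian 361005, China
Funding: National 863 Program (2001AA114110); Fujian Provincial Key Science and Technology Project (2001H023)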
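Abstract: A template-based machine translation (TBMT) system needs a large template library with broad coverage of sentence patterns. Such a library cannot be built by hand alone and has to be constructed automatically. This paper proposes a method for extracting a template library from a sentence-aligned bilingual corpus, using a similarity model based on dynamic programming and a classification model based on hierarchical clustering. The method first applies hierarchical clustering to the sentence-aligned corpus, so that sentence pairs containing the same template are grouped into one class. It then computes the similarity between sentences according to the sentence similarity model, extracts the templates from each subclass, and in this way builds the whole template library.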
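The pipeline described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the actual similarity formula, linkage rule, or extraction procedure, so everything concrete here is an assumption made for illustration: the similarity is a longest-common-subsequence (LCS) score computed by dynamic programming, the clustering is average-linkage agglomerative clustering with a hypothetical stopping threshold of 0.5, and a template is formed by keeping the tokens shared by a class and replacing each differing run with a hypothetical slot marker `<X>`. All function names and the toy English-French corpus are invented.

```python
from functools import reduce
from itertools import combinations


def lcs_table(a, b):
    """Dynamic-programming table: t[i][j] is the LCS length of a[:i] and b[:j]."""
    t = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            t[i][j] = t[i - 1][j - 1] + 1 if x == y else max(t[i - 1][j], t[i][j - 1])
    return t


def similarity(a, b):
    """Assumed similarity model: 2 * LCS / (len(a) + len(b)), a value in [0, 1]."""
    if not a and not b:
        return 1.0
    return 2 * lcs_table(a, b)[-1][-1] / (len(a) + len(b))


def lcs_template(a, b, slot="<X>"):
    """Keep the tokens shared by a and b; replace each differing run with one slot."""
    t = lcs_table(a, b)
    i, j = len(a), len(b)
    out = []  # built back to front while tracing the table
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        else:
            if not out or out[-1] != slot:
                out.append(slot)
            if t[i - 1][j] >= t[i][j - 1]:
                i -= 1
            else:
                j -= 1
    if (i > 0 or j > 0) and (not out or out[-1] != slot):
        out.append(slot)  # unmatched prefix on either side
    return out[::-1]


def pair_similarity(p, q):
    """Similarity of two sentence pairs: mean of source-side and target-side scores."""
    return (similarity(p[0], q[0]) + similarity(p[1], q[1])) / 2


def cluster(pairs, threshold=0.5):
    """Average-linkage agglomerative clustering over indices into `pairs`."""
    clusters = [[k] for k in range(len(pairs))]

    def link(c1, c2):
        total = sum(pair_similarity(pairs[i], pairs[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        best, x, y = max(
            (link(c1, c2), x, y)
            for (x, c1), (y, c2) in combinations(enumerate(clusters), 2)
        )
        if best < threshold:  # stop once the closest classes are too dissimilar
            break
        merged = clusters.pop(y)  # y > x, so index x stays valid
        clusters[x].extend(merged)
    return clusters


def extract_templates(pairs, threshold=0.5):
    """Return one (source template, target template) per class with >= 2 pairs."""
    library = []
    for c in cluster(pairs, threshold):
        if len(c) < 2:
            continue
        src = reduce(lcs_template, (pairs[k][0] for k in c))
        tgt = reduce(lcs_template, (pairs[k][1] for k in c))
        library.append((src, tgt))
    return library


# Toy sentence-aligned corpus (invented for illustration).
corpus = [
    ("I like red apples".split(), "j' aime les pommes rouges".split()),
    ("I like green pears".split(), "j' aime les poires vertes".split()),
    ("the train is late".split(), "le train est en retard".split()),
    ("the bus is late".split(), "le bus est en retard".split()),
]

for src, tgt in extract_templates(corpus):
    print(" ".join(src), "|||", " ".join(tgt))
```

In this sketch the similarity score drives the clustering itself and the LCS alignment is reused for extraction, which is only one possible reading of the two-stage description; the source and target slots are also not linked to each other, which a usable template library would need.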

Keywords: template library  automatic extraction  hierarchical clustering  similarity model  template-based machine translation
Article ID: 1001-9081(2004)09-0127-02

Auto-extraction of template library in template based machine translation (TBMT) system
LIN Xian-ming, LI Tang-qiu, SHI Xiao-dong. Auto-extraction of template library in template based machine translation (TBMT) system[J]. Journal of Computer Applications, 2004, 24(9): 127-128, 135.
Authors:LIN Xian-ming  LI Tang-qiu  SHI Xiao-dong
Abstract: A TBMT system needs a large template library, which has to be built automatically. We present a method for building the template library from a sentence-aligned parallel corpus. First, we group the sentences into several sets by hierarchical clustering, so that the sentences in the same class share the same template. Then we calculate the similarity of every two sentences according to the similarity model. Finally, we extract the templates from each set, and the template library is thus built up.
Keywords: template library  auto-extraction  hierarchical clustering  similarity model  TBMT
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.