
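Bilingual Word-Alignment Algorithm Based on Anchor Word Pairs
Cite this article: ZHANG Xiao-fei, CHEN Zhao-xiong, HUANG He-yan, WANG Jian-de. Bilingual Word-Alignment Algorithm Based on Anchor Word Pairs[J]. 小型微型计算机系统 (Mini-micro Systems), 2006, 27(2): 330-334
Authors: ZHANG Xiao-fei  CHEN Zhao-xiong  HUANG He-yan  WANG Jian-de
Affiliation: Research Center of Computer and Language Information Engineering, Chinese Academy of Sciences, Beijing 100083
Funding: project funded by the Chinese Academy of Sciences; National Key Science and Technology Project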
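Abstract: Bilingual word alignment is the task of finding word-level translation correspondences between a source text and its translation. It is a very useful yet rather difficult research topic in natural language processing, involving morphology, syntax, semantics, the inherent differences between English and Chinese, translation habits, and many other issues. Building on lexical analysis and using limited language resources (mainly a single bilingual dictionary), this paper adopts multi-level matching and disambiguation strategies and recasts word alignment as an iterative process of solving for anchor word pairs, so that alignment achieves both relatively high precision and relatively high recall. In tests on real corpora, word-alignment precision reached 93.0%, recall reached 77.3%, and the F-score reached 84.2%, which largely meets the practical requirements of the relevant applications.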

Keywords: natural language processing  bilingual word alignment  corpus  anchor word pair
Article ID: 1000-1220(2006)02-0330-05
Received: 2004-09-02
Revised: 2004-09-02

Word-Alignment Algorithm Based on Anchor Word-Pair
ZHANG Xiao-fei,CHEN Zhao-xiong,HUANG He-yan,WANG Jian-de. Word-Alignment Algorithm Based on Anchor Word-Pair[J]. Mini-micro Systems, 2006, 27(2): 330-334
Authors: ZHANG Xiao-fei  CHEN Zhao-xiong  HUANG He-yan  WANG Jian-de
Abstract: Word alignment is the task of finding the corresponding translations of words between a source-language sentence and a target-language sentence. It is a very useful yet difficult task that involves many problems, such as morphology, syntax, semantics, the inherent differences between English and Chinese, and human translation habits. In this paper, a new algorithm based on lexical analysis is proposed: using only a bilingual dictionary, the word-alignment problem is transformed into an iterative search for anchor word pairs by means of multi-level matching and disambiguation. Experimental results show that word-alignment precision is 93.0%, recall is 77.3%, and F-score is 84.2%.
Keywords: NLP  bilingual word-alignment  corpora  anchor word-pair
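
The abstract describes alignment as an iterative search for anchor word pairs driven by a bilingual dictionary, but it does not give the matching levels or the disambiguation rules. The following Python sketch is therefore only an illustration of the general idea under assumptions of our own, not the authors' algorithm: the function name `align_with_anchors`, the toy `lexicon`, and the two scores `ambiguity` and `distortion` are all hypothetical. At each step it fixes the least ambiguous dictionary-licensed link as a new anchor, using the anchors found so far to break ties between competing links.

```python
def align_with_anchors(src, tgt, lexicon):
    """Greedy dictionary-only alignment; returns sorted (src_idx, tgt_idx) links.

    Illustrative sketch only: the scoring rules below are assumptions,
    not the method of the paper.
    """
    # Candidate links: every position pair that the dictionary licenses.
    cands = {(i, j)
             for i, s in enumerate(src)
             for j, t in enumerate(tgt)
             if t in lexicon.get(s, ())}
    anchors = []

    def ambiguity(link):
        i, j = link
        # Competing candidates that share exactly one position with this link.
        return sum(1 for (a, b) in cands if (a == i) != (b == j))

    def distortion(link):
        i, j = link
        if anchors:
            # Smallest offset mismatch relative to any anchor fixed so far.
            return min(abs((i - a) - (j - b)) for a, b in anchors)
        # No anchors yet: fall back to relative sentence position.
        return abs(i / len(src) - j / len(tgt))

    while cands:
        # Fix the least ambiguous, least distorted candidate as a new anchor.
        best = min(cands, key=lambda l: (ambiguity(l), distortion(l), l))
        anchors.append(best)
        # Each word is linked at most once: drop candidates that reuse it.
        cands = {(a, b) for (a, b) in cands
                 if a != best[0] and b != best[1]}
    return sorted(anchors)


# Toy data (hypothetical): the repeated "the" is ambiguous until anchors exist.
src = ["I", "like", "the", "book", "and", "the", "film"]
tgt = ["我", "喜欢", "这", "本", "书", "和", "这", "部", "电影"]
lexicon = {
    "I": {"我"}, "like": {"喜欢", "像"}, "the": {"这"},
    "book": {"书"}, "and": {"和"}, "film": {"电影", "胶片"},
}
links = align_with_anchors(src, tgt, lexicon)
print([(src[i], tgt[j]) for i, j in links])
```

In this toy run the unambiguous words are anchored first, and the two occurrences of "the" are then resolved by their offsets from those anchors. Target words with no dictionary entry (here the measure words) stay unaligned, which is one way a dictionary-only method can trade recall for precision.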
This article is indexed in CNKI, 维普 (VIP), 万方数据 (Wanfang Data), and other databases.