Word Alignment Between Chinese and Japanese Based on Weighted Bipartite Graph

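Cite this article:	WU Hong-lin, LIU Shao-ming, YU Ge. Word Alignment Between Chinese and Japanese Based on Weighted Bipartite Graph[J]. Journal of Chinese Information Processing, 2007, 21(5): 101-106.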

Authors:	WU Hong-lin, LIU Shao-ming, YU Ge

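Affiliation:	1. Institute of Computer Software and Theory, School of Information, Northeastern University, Shenyang, Liaoning 110004, China; 2. Fuji Xerox Co., Ltd., Kanagawa 2590157, Japan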

Funding:	Fuji Xerox Visiting Researcher Program

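Abstract:	Efficient automatic word alignment is key to building word-aligned corpora. Many current word alignment methods share three shortcomings: the unknown-word problem, the flexible-translation problem, and the global optimal matching problem. To address them, this paper proposes a word alignment model based on maximum matching on a weighted bipartite graph. A bilingual sentence pair is modeled as a bipartite graph, the similarity between words is computed from word form, semantics, part of speech, and co-occurrence information, and the final alignment is obtained by maximum matching on the weighted bipartite graph. Experiments on Chinese-Japanese word alignment show that the method resolves these three shortcomings to some extent, reaching an F-score of 80%, compared with 72% for GIZA++.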
Keywords:	computer application; Chinese information processing; word alignment; bipartite graph; matching
Article ID:	1003-0077(2007)05-0101-06
Received:	2007-04-03
Revised:	2007-04-03; 2007-06-29
Word Alignment Between Chinese and Japanese Based on Weighted Bipartite Graph

WU Hong-lin, LIU Shao-ming, YU Ge. Word Alignment Between Chinese and Japanese Based on Weighted Bipartite Graph[J]. Journal of Chinese Information Processing, 2007, 21(5): 101-106.

Authors:	WU Hong-lin, LIU Shao-ming, YU Ge

Affiliation:	1. Institute of Computer Software and Theory, Northeastern University, Shenyang, Liaoning 110004, China; 2. Corporate Research Group, Fuji Xerox Co., Ltd., Kanagawa 2590157, Japan

Abstract:	This paper proposes a word alignment model that aligns words by maximum matching on a weighted bipartite graph and measures word similarity in terms of morphological similarity, semantic distance, part of speech, and co-occurrence. Experiments on Chinese-Japanese word alignment show that the model can partly solve some problems of existing word alignment methods, such as the unknown word problem, the synonym problem, and the global optimization problem. In the experiments, the F-score of the method is 80%, better than the 72% F-score of GIZA++.

Keywords:	computer application; Chinese information processing; word alignment; bipartite graph; matching
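
The matching step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the similarity formula or the matching procedure, so this sketch assumes the per-pair similarity scores are already combined into one non-negative number each, and it uses the Kuhn-Munkres (Hungarian) algorithm as one standard way to find a maximum-weight bipartite matching. The function name `max_weight_alignment`, the `threshold` parameter, and all scores in `sim` are hypothetical.

```python
def max_weight_alignment(sim, threshold=0.0):
    """Return sorted (source_index, target_index) pairs that maximize the
    total similarity, keeping only pairs scoring above `threshold`.

    sim[i][j] is an assumed, already-combined, non-negative similarity
    between source word i and target word j (hypothetical values).
    """
    n_src, n_tgt = len(sim), len(sim[0])
    # The assignment routine below needs rows <= columns, so transpose if needed.
    transposed = n_src > n_tgt
    if transposed:
        cost = [[-sim[i][j] for i in range(n_src)] for j in range(n_tgt)]
    else:
        cost = [[-s for s in row] for row in sim]  # maximize sim = minimize -sim
    n, m = len(cost), len(cost[0])

    inf = float("inf")
    u = [0.0] * (n + 1)   # row potentials (1-indexed)
    v = [0.0] * (m + 1)   # column potentials (1-indexed)
    p = [0] * (m + 1)     # p[j] = row currently matched to column j (0 = none)
    way = [0] * (m + 1)   # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        # Flip the matching along the augmenting path.
        while j0 != 0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1

    pairs = []
    for j in range(1, m + 1):
        if p[j]:
            r, c = p[j] - 1, j - 1
            i_src, j_tgt = (c, r) if transposed else (r, c)
            if sim[i_src][j_tgt] > threshold:  # weak links count as unaligned
                pairs.append((i_src, j_tgt))
    return sorted(pairs)


# Made-up scores: rows are source words, columns are target words.
sim = [
    [0.90, 0.80, 0.00],
    [0.85, 0.10, 0.00],
    [0.00, 0.20, 0.70],
]
alignment = max_weight_alignment(sim)
```

On these made-up scores, a greedy strategy that always takes the highest remaining score would start with the pair (0, 0) and finish with a total of 1.7, while the global matching pairs (0, 1), (1, 0), and (2, 2) for a total of 2.35. This is the kind of case the global optimal matching problem refers to.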
This article is indexed in databases including CNKI, VIP, and Wanfang Data.