
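Research on Supervised Word Sense Disambiguation Based on Context Translation

Cite this article:	YANG Zhi-zhuo. Research on Supervised Word Sense Disambiguation Based on Context Translation[J]. Computer Science, 2017, 44(4): 252-255, 280.

Author:	YANG Zhi-zhuo

Affiliation:	School of Computer and Information Technology, Shanxi University, Taiyuan 030006

Funding:	This work was supported by the National Natural Science Foundation of China (61502287), the Shanxi Province University Science and Technology Innovation Project (2015105), and the National 863 Program of China (2015AA015407).

Abstract:	To address the data sparseness problem of current supervised word sense disambiguation (WSD) methods, a WSD method based on context translation is proposed. The method assumes that the context formed by the translations of an ambiguous word's context is similar in meaning to the original context. Under this assumption, first, a large pseudo training corpus is generated from the contexts formed by the translations; then, a Bayesian disambiguation model is trained on both the real and the pseudo training corpora; finally, the model is used to decide the sense of the ambiguous word. Experimental results show that, compared with traditional disambiguation methods, the proposed method improves disambiguation accuracy by 4.35% and outperforms the best supervised disambiguation system that participated in the SemEval-2007 evaluation.
Keywords:	Word sense disambiguation; context expansion; machine translation; pseudo training corpus; Bayesian model
Received:	2016/3/30
Revised:	2016/6/21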
Supervised WSD Method Based on Context Translation

YANG Zhi-zhuo. Supervised WSD Method Based on Context Translation[J]. Computer Science, 2017, 44(4): 252-255, 280.

Authors:	YANG Zhi-zhuo

Affiliation:	School of Computer & Information Technology, Shanxi University, Taiyuan 030006, China

Abstract:	To overcome the data sparseness problem of supervised WSD methods, this paper presents a WSD method based on context translation. The method assumes that the context formed by the translations of an ambiguous word's context has a meaning similar to that of the original context. Under this assumption, a large amount of pseudo training data is first generated from the translated contexts. A Bayesian model is then trained on both the authentic and the pseudo training data. Finally, the trained Bayesian model is used to decide the sense of the ambiguous word. Experimental results show that the proposed method improves accuracy by 4.35% over traditional WSD methods and outperforms the best participating supervised system in the SemEval-2007 evaluation.

Keywords:	Data sparseness; Context expansion; Machine translation; Pseudo training data; Bayesian model
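
The three steps named in the abstract (generate pseudo training data from translated contexts, train a Bayesian model on authentic plus pseudo data, then decide the sense) can be illustrated with a minimal sketch. It is not the paper's implementation: the lookup table `TOY_TRANSLATIONS` is a hypothetical stand-in for a machine translation system, the sense labels and training instances are invented toy data, and the classifier is a plain naive Bayes model with add-one smoothing, chosen here only because the abstract mentions a Bayesian model.

```python
import math
from collections import Counter, defaultdict

# Hypothetical stand-in for a machine translation system: maps a context
# word to alternative renderings assumed to have a similar meaning.
TOY_TRANSLATIONS = {
    "money": ["cash", "funds"],
    "deposit": ["savings"],
    "river": ["stream"],
    "water": ["flow"],
}


def pseudo_instances(context, sense, table=TOY_TRANSLATIONS):
    """Build one pseudo instance per available translation of a context word."""
    out = []
    for i, word in enumerate(context):
        for alt in table.get(word, []):
            out.append((context[:i] + [alt] + context[i + 1:], sense))
    return out


class NaiveBayesWSD:
    """Bag-of-words naive Bayes sense classifier with additive smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.sense_counts = Counter()
        self.word_counts = defaultdict(Counter)
        self.vocab = set()

    def fit(self, instances):
        for context, sense in instances:
            self.sense_counts[sense] += 1
            for w in context:
                self.word_counts[sense][w] += 1
                self.vocab.add(w)
        return self

    def predict(self, context):
        total = sum(self.sense_counts.values())
        best, best_score = None, -math.inf
        for sense, n in self.sense_counts.items():
            score = math.log(n / total)  # log prior of the sense
            denom = (sum(self.word_counts[sense].values())
                     + self.alpha * len(self.vocab))
            for w in context:
                if w in self.vocab:  # words never seen in training are skipped
                    count = self.word_counts[sense][w]
                    score += math.log((count + self.alpha) / denom)
            if score > best_score:
                best, best_score = sense, score
        return best


# Invented toy training data for the ambiguous word "bank".
real = [
    (["money", "deposit"], "bank/finance"),
    (["river", "water"], "bank/shore"),
]
pseudo = [p for ctx, s in real for p in pseudo_instances(ctx, s)]
model = NaiveBayesWSD().fit(real + pseudo)
```

In this toy setting, a test context such as `["stream", "flow"]` shares no word with the authentic training data, so a model trained on `real` alone can only fall back on the sense prior. After the pseudo instances are added, those words carry sense evidence, which is the data sparseness effect the abstract describes.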
