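Supervised Word Sense Disambiguation Based on Context Translation
Cite this article: YANG Zhi-zhuo. Supervised word sense disambiguation based on context translation [J]. Computer Science, 2017, 44(4): 252-255, 280.
Author: YANG Zhi-zhuo
Affiliation: School of Computer and Information Technology, Shanxi University, Taiyuan 030006
Funding: This work was supported by the National Natural Science Foundation of China (61502287), the Shanxi Province University Science and Technology Innovation Project (2015105), and the National 863 Program of China (2015AA015407).
Abstract: To address the data sparseness problem in current supervised word sense disambiguation methods, a disambiguation method based on context translation is proposed. The method assumes that the context formed by the translations of an ambiguous word's context expresses a meaning similar to that of the original context. Under this assumption, a large pseudo training corpus is first generated from the contexts formed by the translations. A Bayesian disambiguation model is then trained on both the real training corpus and the pseudo training corpus. Finally, the model is used to decide the sense of the ambiguous word. Experimental results show that, compared with traditional disambiguation methods, the proposed method improves disambiguation accuracy by 4.35% and surpasses the best supervised disambiguation system that took part in the SemEval-2007 evaluation.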

Keywords: Word sense disambiguation  Context expansion  Machine translation  Pseudo training corpus  Bayesian model
Received: 2016/3/30
Revised: 2016/6/21

Supervised WSD Method Based on Context Translation
YANG Zhi-zhuo. Supervised WSD Method Based on Context Translation [J]. Computer Science, 2017, 44(4): 252-255, 280.
Authors: YANG Zhi-zhuo
Affiliation: School of Computer & Information Technology, Shanxi University, Taiyuan 030006, China
Abstract: To overcome the data sparseness problem in supervised word sense disambiguation (WSD), this paper presents a WSD method based on context translation. The method assumes that the context formed by the translations of an ambiguous word's context has a meaning similar to that of the original context. Under this assumption, a large amount of pseudo training data is first generated from the translated contexts. A Bayesian model is then trained on both the authentic and the pseudo training data. Finally, the trained Bayesian model is used to determine the sense of the ambiguous word. Experimental results show that the proposed method improves accuracy by 4.35% over traditional WSD methods and outperforms the best supervised system that participated in the SemEval-2007 evaluation.
Keywords:Data sparseness  Context expansion  Machine translation  Pseudo training data  Bayesian model
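
The three steps named in the abstract (generate pseudo training data from translated contexts, train a Bayesian model on real plus pseudo data, then pick a sense) can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not say how translations are obtained or how pseudo instances are built, so `TOY_TRANSLATION_TABLE`, `make_pseudo_instance`, `NaiveBayesWSD`, the sense labels, and the tiny data set are all hypothetical, and a naive Bayes classifier with add-one smoothing is assumed as the Bayesian model.

```python
import math
from collections import Counter, defaultdict

# Hypothetical stand-in for the machine translation step: each context word
# maps to the words recovered from its translation. A real system would query
# an MT engine here; this table exists only for illustration.
TOY_TRANSLATION_TABLE = {
    "money": ["cash", "funds"],
    "deposit": ["savings"],
    "river": ["stream"],
    "water": ["stream"],
}


def make_pseudo_instance(context):
    """Build a pseudo context from the translations of each context word."""
    pseudo = []
    for word in context:
        # Words without an entry are kept unchanged.
        pseudo.extend(TOY_TRANSLATION_TABLE.get(word, [word]))
    return pseudo


class NaiveBayesWSD:
    """Bag-of-words naive Bayes sense classifier with add-one smoothing."""

    def __init__(self):
        self.sense_counts = Counter()
        self.word_counts = defaultdict(Counter)
        self.vocab = set()

    def train(self, instances):
        for context, sense in instances:
            self.sense_counts[sense] += 1
            for word in context:
                self.word_counts[sense][word] += 1
                self.vocab.add(word)

    def predict(self, context):
        total = sum(self.sense_counts.values())
        best_sense, best_score = None, float("-inf")
        for sense, count in self.sense_counts.items():
            # log P(sense) + sum of log P(word | sense)
            score = math.log(count / total)
            denom = sum(self.word_counts[sense].values()) + len(self.vocab)
            for word in context:
                score += math.log((self.word_counts[sense][word] + 1) / denom)
            if score > best_score:
                best_sense, best_score = sense, score
        return best_sense


# Real (hand-labeled) training instances for the ambiguous word "bank".
real_data = [
    (["money", "deposit"], "finance"),
    (["river", "water"], "shore"),
]

# Step 1: pseudo instances inherit the sense label of their source instance.
pseudo_data = [(make_pseudo_instance(ctx), sense) for ctx, sense in real_data]

# Step 2: train on real and pseudo data together.
model = NaiveBayesWSD()
model.train(real_data + pseudo_data)

# Step 3: disambiguate a context whose words never occur in the real data.
predicted = model.predict(["stream"])
```

In this toy setting the word "stream" appears only in the pseudo data, so the augmented model has evidence for it that a model trained on `real_data` alone would lack. That is the sparseness problem the abstract targets, shown here at a scale far too small to say anything about the reported accuracy gain.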