
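Chinese Word Sense Disambiguation Based on the AdaBoost.MH Algorithm
Cite this article: LIU Feng-cheng, HUANG De-gen, JIANG Peng. Chinese Word Sense Disambiguation Based on the AdaBoost.MH Algorithm[J]. Journal of Chinese Information Processing (中文信息学报), 2006, 20(3): 8-15.
Authors: LIU Feng-cheng (刘风成), HUANG De-gen (黄德根), JIANG Peng (姜鹏)
Affiliation: Department of Computer Science and Technology, Dalian University of Technology (大连理工大学)
Abstract: This paper presents a supervised method for Chinese word sense disambiguation based on the AdaBoost.MH algorithm. The method uses AdaBoost.MH to boost the weak rules produced by decision trees and, after a number of iterations, obtains a more accurate classification rule. A simple method for terminating the iterations is also given. To capture knowledge from the context of a polysemous word, a new knowledge source, semantic category, is introduced in addition to traditional knowledge sources such as part-of-speech tags and local collocation sequences; it improves both the learning efficiency of the algorithm and the accuracy of disambiguation. In word sense disambiguation experiments on 6 typical polysemous words and 20 polysemous words from the SENSEVAL3 Chinese corpus, the AdaBoost.MH algorithm achieved a relatively high open-test accuracy (85.75%).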

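Keywords: artificial intelligence; natural language processing; word sense disambiguation; AdaBoost.MH algorithm; multiple knowledge sources
Article ID: 1003-0077(2006)03-0006-08
Received: 2005-05-26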
Revised: 2005-10-26

Chinese Word Sense Disambiguation with AdaBoost.MH Algorithm
LIU Feng-cheng, HUANG De-gen, JIANG Peng. Chinese Word Sense Disambiguation with AdaBoost.MH Algorithm[J]. Journal of Chinese Information Processing, 2006, 20(3): 8-15.
Authors:LIU Feng-cheng  HUANG De-gen  JIANG Peng
Affiliation: Department of Computer Science, Dalian University of Technology
Abstract: An approach to Chinese word sense disambiguation based on the supervised AdaBoost.MH learning algorithm is presented. AdaBoost.MH is used to boost the accuracy of weak rules generated by decision trees: it repeatedly calls a weak learner and finally produces a more accurate classification rule. A simple stopping criterion for the iterations is also presented. To extract more contextual information, a new knowledge source, semantic category, is introduced in addition to two classical knowledge sources, the part-of-speech of neighboring words and local collocations; it improves both the learning efficiency of the algorithm and the accuracy of disambiguation. Using these knowledge sources, AdaBoost.MH achieves 85.75% disambiguation accuracy in an open test on 6 typical polysemous words and 20 polysemous words from the SENSEVAL3 Chinese corpus.
Keywords: artificial intelligence; natural language processing; word sense disambiguation; AdaBoost.MH algorithm; multiple knowledge sources
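
The boosting loop summarized in the abstract can be illustrated with a minimal sketch of discrete AdaBoost.MH. This is not the authors' implementation: the weak learner here is a single-feature stump rather than rules derived from decision trees, the number of rounds is fixed instead of using the paper's stopping criterion (which this record does not describe), and the three binary context features in the demo (a collocation flag, a part-of-speech flag, and a semantic-category flag) are hypothetical placeholders.

```python
import math


def train_adaboost_mh(X, y, n_labels, n_rounds=10):
    """Discrete AdaBoost.MH with single-feature stumps.

    X: list of feature vectors whose entries are the ints 0 or 1.
    y: list of sense indices in range(n_labels).
    Returns a list of (alpha, feature_index, votes) triples.
    """
    n, d = len(X), len(X[0])
    # Y[i][l] is +1 if sense l is the correct sense of example i, else -1.
    Y = [[1 if y[i] == l else -1 for l in range(n_labels)] for i in range(n)]
    # Start from uniform weights over all (example, sense) pairs.
    D = [[1.0 / (n * n_labels)] * n_labels for _ in range(n)]
    model = []
    for _ in range(n_rounds):
        best = None
        for j in range(d):
            # s[v][l]: weighted agreement with sense l among examples
            # whose feature j has value v.
            s = [[0.0] * n_labels, [0.0] * n_labels]
            for i in range(n):
                for l in range(n_labels):
                    s[X[i][j]][l] += D[i][l] * Y[i][l]
            # Edge of the best stump on feature j (each vote takes the
            # sign of its weighted agreement).
            r = sum(abs(v) for side in s for v in side)
            if best is None or r > best[0]:
                votes = [[1 if v >= 0 else -1 for v in side] for side in s]
                best = (r, j, votes)
        r, j, votes = best
        r = min(r, 1 - 1e-10)  # keep alpha finite for a perfect stump
        alpha = 0.5 * math.log((1 + r) / (1 - r))
        model.append((alpha, j, votes))
        # Increase the weight of misclassified pairs, then renormalize.
        z = 0.0
        for i in range(n):
            for l in range(n_labels):
                D[i][l] *= math.exp(-alpha * Y[i][l] * votes[X[i][j]][l])
                z += D[i][l]
        for i in range(n):
            for l in range(n_labels):
                D[i][l] /= z
    return model


def predict(model, x, n_labels):
    """Return the sense with the highest weighted vote."""
    scores = [sum(a * votes[x[j]][l] for a, j, votes in model)
              for l in range(n_labels)]
    return max(range(n_labels), key=lambda l: scores[l])


# Toy data (hypothetical): [collocation_flag, pos_flag, category_flag].
X = [[1, 0, 1], [1, 1, 0], [0, 0, 1], [0, 1, 0]]
y = [0, 0, 1, 1]
model = train_adaboost_mh(X, y, n_labels=2, n_rounds=3)
print([predict(model, x, 2) for x in X])  # prints [0, 0, 1, 1]
```

In this toy set the first feature alone determines the sense, so every round selects it; on real data the reweighting step is what pushes later rounds toward examples the earlier rules got wrong.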
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.