
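Coarse-Grained Word Sense Disambiguation Using Features Described in the Lexicon (基于词典属性特征的粗粒度词义消歧)
Cite this article: WU Yun-fang, JIN Peng, GUO Tao. Coarse-Grained Word Sense Disambiguation Using Features Described in the Lexicon [J]. Journal of Chinese Information Processing, 2007, 21(2): 1-8.
Authors: WU Yun-fang  JIN Peng  GUO Tao
Affiliation: Institute of Computational Linguistics, Peking University, Beijing 100871
Funding: National Basic Research Program of China (973 Program)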
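Abstract: Drawing on the attribute features that the Grammatical Knowledge-base of Contemporary Chinese (《现代汉语语法信息词典》) uses to describe polysemous words, this paper reports a coarse-grained automatic word sense disambiguation experiment on 4,996 homograph instances of 155 words in the People's Daily corpus, with a Naive Bayes classifier tested for comparison. At the homograph level, the method based on lexicon attribute features reaches a precision of 90%, but its recall is low. The method has two advantages: 1) it is not affected by the size of the sense-tagged corpus; 2) for certain word senses it can reach a disambiguation precision of 100%. The paper also discusses which disambiguation features suit different parts of speech.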

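Keywords: artificial intelligence  natural language processing  feature  word sense  word sense disambiguation  Naive Bayes classifier
Article ID: 1003-0077(2007)02-0003-06
Received: 2005-11-04
Revised: 2006-12-20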

Coarse-Grained Word Sense Disambiguation Using Features Described in the Lexicon
WU Yun-fang, JIN Peng, GUO Tao. Coarse-Grained Word Sense Disambiguation Using Features Described in the Lexicon [J]. Journal of Chinese Information Processing, 2007, 21(2): 1-8.
Authors: WU Yun-fang  JIN Peng  GUO Tao
Affiliation: Institute of Computational Linguistics, Peking University, Beijing 100871, China
Abstract: This paper presents a simple but effective feature-based approach to Chinese word sense disambiguation (WSD) using the distributional features available from the Grammatical Knowledge-base of Contemporary Chinese. The test data is the sense-tagged corpus of People's Daily. A Naive Bayes (NB) classifier is also tested as a statistical method for comparison. The feature-based approach achieves a precision of 90%, which is comparable to that of the NB classifier. The striking advantages of the feature-based approach are that 1) it is not influenced by the data size, and 2) it can disambiguate some specific words with a precision of 100%. The features appropriate for different parts of speech in Chinese WSD are also discussed. This paper demonstrates that sense features described in the lexicon are worth including in WSD.
Keywords: artificial intelligence  natural language processing  feature  word sense  word sense disambiguation  Naive Bayes classifier
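
Note on precision and recall: a rule-based disambiguator that answers only when a lexicon feature matches can score high precision while leaving many instances unanswered, which lowers recall. The following is a minimal sketch of that effect, not the authors' implementation. The lexicon rules, the single feature (part of speech of the left neighbour), the sense labels, and the test instances are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Instance:
    word: str       # ambiguous target word
    left_pos: str   # POS tag of the left neighbour (hypothetical feature)
    gold: str       # gold-standard sense label


# Hypothetical lexicon: (word, left-neighbour POS) -> sense.
# Real lexicon entries would describe far richer attribute features.
LEXICON_RULES = {
    ("w", "NUM"): "w/noun-sense",
    ("w", "ADV"): "w/verb-sense",
}


def disambiguate(inst: Instance) -> Optional[str]:
    """Return a sense if a lexicon rule matches, otherwise abstain (None)."""
    return LEXICON_RULES.get((inst.word, inst.left_pos))


def precision_recall(instances: list[Instance]) -> tuple[float, float]:
    """Precision counts only answered instances; recall counts all of them."""
    answered = 0
    correct = 0
    for inst in instances:
        predicted = disambiguate(inst)
        if predicted is None:
            continue  # abstention: no rule fired for this context
        answered += 1
        if predicted == inst.gold:
            correct += 1
    precision = correct / answered if answered else 0.0
    recall = correct / len(instances) if instances else 0.0
    return precision, recall


# Toy data: two instances are covered by a rule, two are not.
toy = [
    Instance("w", "NUM", "w/noun-sense"),
    Instance("w", "ADV", "w/verb-sense"),
    Instance("w", "PUNCT", "w/noun-sense"),
    Instance("w", "PRON", "w/verb-sense"),
]
precision, recall = precision_recall(toy)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy setup every answered instance is correct, but half of the instances receive no answer, so recall is half of precision. A statistical classifier such as Naive Bayes always outputs a sense, so it does not abstain in this way.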
This article is indexed in VIP (维普), Wanfang Data (万方数据), and other databases.