
Unsupervised Word Sense Disambiguation Based on k-means Clustering (基于k-means聚类的无导词义消歧)

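Cite this article:	CHEN Hao, HE Ting-ting, JI Dong-hong. Unsupervised Word Sense Disambiguation Based on k-means Clustering [J]. Journal of Chinese Information Processing (中文信息学报), 2005, 19(4): 11-17.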

Authors:	CHEN Hao (陈浩), HE Ting-ting (何婷婷), JI Dong-hong (姬东鸿)

Affiliation:	1. Department of Computer Science, Central China Normal University, Wuhan, Hubei 430079; 2. Institute for Infocomm Research, Singapore 119613

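Funding:	Research project on applied language and script of the State Language Commission; Natural Science Foundation of Hubei Province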

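Abstract:	Unsupervised word sense disambiguation avoids the heavy workload of manual sense annotation, scales to large-scale disambiguation of polysemous words, and has broad application prospects. This paper proposes an unsupervised word sense disambiguation method that constructs context vectors from second-order context, clusters them with the k-means algorithm, and finally disambiguates word senses by computing similarity. The experiments were carried out on the basis of extracted terms. In two sets of tests on several high-frequency polysemous Chinese words, the method achieved good results, with average accuracies of 82.67% and 80.87%.
Keywords:	computer application; Chinese information processing; word sense disambiguation; HowNet; second-order context; k-means clustering
Article ID:	1003-0077(2005)04-0010-07
Revised:	July 7, 2004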
An Unsupervised Approach to Word Sense Disambiguation Based on HowNet

CHEN Hao, HE Ting-ting, JI Dong-hong. An Unsupervised Approach to Word Sense Disambiguation Based on HowNet [J]. Journal of Chinese Information Processing, 2005, 19(4): 11-17.

Authors:	CHEN Hao, HE Ting-ting, JI Dong-hong

Affiliation:	1. Department of Computer Science, Central China Normal University, Wuhan, Hubei 430079, China; 2. Institute for Infocomm Research, 21 Heng Mui Keng Terrace, Singapore 119613

Abstract:	Unsupervised word sense disambiguation (WSD) avoids the heavy cost of manual sense annotation and scales to large collections of polysemous words, so it has broad application prospects. This paper presents an unsupervised approach that constructs context vectors from second-order context, clusters them with the k-means algorithm, and disambiguates by computing similarity. The experiments build on term extraction. In open tests on 8 ambiguous words, the method achieves average accuracies of 82.62% and 80.87%.

Keywords:	HowNet
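The pipeline named in the abstract (second-order context vectors, k-means clustering, similarity-based assignment) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy English corpus, the target word `bank`, the farthest-first initialisation, and the helper names (`build_cooccurrence`, `second_order_vector`, `kmeans`, `disambiguate`) are all assumptions made for illustration. It assumes the common definition of a second-order context vector as the sum of the first-order co-occurrence vectors of the context words, and it does not model HowNet or the term-extraction step.

```python
from collections import Counter, defaultdict
import math

TARGET = "bank"  # hypothetical ambiguous word, chosen for this toy example

# Made-up toy corpus (the paper itself works on Chinese text).
CORPUS = [
    ["bank", "loan", "money"],
    ["bank", "money", "deposit"],
    ["bank", "loan", "deposit"],
    ["bank", "river", "water"],
    ["bank", "water", "fishing"],
    ["bank", "river", "fishing"],
    ["loan", "money", "interest"],
    ["river", "water", "boat"],
]

def build_cooccurrence(corpus, target):
    """First-order vectors: for each word, counts of words sharing a sentence."""
    cooc = defaultdict(Counter)
    for sentence in corpus:
        words = [w for w in sentence if w != target]
        for w in words:
            for other in words:
                if other != w:
                    cooc[w][other] += 1
    return cooc

def second_order_vector(context, cooc, vocab, target):
    """Sum the first-order vectors of the context words, then length-normalise."""
    total = Counter()
    for w in context:
        if w != target and w in cooc:
            total.update(cooc[w])
    vec = [float(total[v]) for v in vocab]
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centroids):
    return min(range(len(centroids)), key=lambda j: sq_dist(point, centroids[j]))

def kmeans(points, k, iterations=20):
    """Plain k-means with deterministic farthest-first initialisation (an assumption)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    for _ in range(iterations):
        labels = [nearest(p, centroids) for p in points]
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append([sum(col) / len(members) for col in zip(*members)])
            else:
                updated.append(centroids[j])  # keep an empty cluster's centroid
        if updated == centroids:
            break
        centroids = updated
    return [nearest(p, centroids) for p in points], centroids

cooc = build_cooccurrence(CORPUS, TARGET)
vocab = sorted(cooc)
occurrences = [s for s in CORPUS if TARGET in s]
points = [second_order_vector(s, cooc, vocab, TARGET) for s in occurrences]
labels, centroids = kmeans(points, k=2)

def disambiguate(context):
    """Assign a new occurrence of TARGET to the nearest cluster centroid."""
    return nearest(second_order_vector(context, cooc, vocab, TARGET), centroids)

print(labels)
print(disambiguate(["bank", "interest", "money"]))
```

In this sketch the clusters are anonymous sense groups. How the paper maps clusters to sense labels, and which similarity measure it uses in the final step, is not stated in the abstract, so the nearest-centroid assignment here is only a stand-in.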
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.