
An Improved HowNet-Based Method for Computing Word Similarity

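Cite this article:	LIN Li, XUE Fang, REN Zhong-sheng. An improved HowNet-based method for computing word similarity [J]. Journal of Computer Applications (计算机应用), 2009, 29(1): 217-220

Authors:	LIN Li (林丽), XUE Fang (薛方), REN Zhong-sheng (任仲晟)

Affiliations:	1. College of Computer Engineering, Jimei University, Xiamen, Fujian 361021; 2. Center of Computer Laboratory, Fujian Normal University, Fuzhou 350007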

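Abstract:	HowNet (《知网》) is a fairly detailed Chinese semantic knowledge dictionary that describes words using a total of 1618 sememes, so related words share sememes when they are described with HowNet concepts. Building on this regularity and combining it with current word similarity computation methods, the paper proposes an improved method for computing the similarity of related word pairs. It also introduces the notion of weak sememes and excludes their interference from the similarity computation. Experiments show that the improved method agrees better with human intuition and is better suited to text mining.
Keywords:	HowNet; word similarity; related word pairs; weak sememe
Received:	2008-07-16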
Modified word similarity computation approach based on HowNet

LIN Li, XUE Fang, REN Zhong-sheng. Modified word similarity computation approach based on HowNet[J]. Journal of Computer Applications, 2009, 29(1): 217-220

Authors:	LIN Li, XUE Fang, REN Zhong-sheng

Affiliation:	1. College of Computer Engineering, Jimei University, Xiamen, Fujian 361021, China; 2. Center of Computer Laboratory, Fujian Normal University, Fuzhou, Fujian 350007, China

Abstract:	HowNet is a lexical knowledge base with rich semantic information that uses 1618 sememes to describe words. Related words therefore share sememes when they are described in HowNet. Combining this observation with current word similarity algorithms, the paper proposes an improved algorithm for computing the similarity between related words. It also introduces the concept of weak sememes and excludes their interference from the similarity computation. Experiments show that the improved computation agrees better with human intuition and is better suited to text mining.

Keywords:	HowNet; word similarity; related word; weak sememe
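
The abstract rests on two ideas: related words share sememes, and weak sememes should be kept out of the similarity computation. The sketch below only illustrates those two ideas and is not the paper's algorithm, whose formula the abstract does not give. It swaps in a plain Jaccard overlap of sememe sets, and it assumes, for illustration only, that a weak sememe is one too general to signal relatedness. The function name `sememe_overlap_similarity` and the toy sememe sets are hypothetical and are not taken from HowNet.

```python
def sememe_overlap_similarity(sememes_a, sememes_b, weak_sememes=frozenset()):
    """Jaccard overlap of two sememe sets after dropping weak sememes.

    Illustrative stand-in only; not the formula used in the paper.
    """
    a = set(sememes_a) - set(weak_sememes)
    b = set(sememes_b) - set(weak_sememes)
    if not a and not b:
        # Nothing left to compare, so report no similarity.
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical toy annotations, invented for this example.
doctor = {"human", "occupation", "medical", "cure"}
patient = {"human", "medical", "suffer", "disease"}
thief = {"human", "steal", "crime"}

# Assumption: "human" is treated as a weak (overly general) sememe.
weak = {"human"}

print(round(sememe_overlap_similarity(doctor, patient), 3))        # 0.333
print(round(sememe_overlap_similarity(doctor, patient, weak), 3))  # 0.2
print(round(sememe_overlap_similarity(doctor, thief), 3))          # 0.167
print(round(sememe_overlap_similarity(doctor, thief, weak), 3))    # 0.0
```

In this toy data, dropping the assumed weak sememe removes the spurious similarity between "doctor" and "thief" while "doctor" and "patient" still score above zero through the shared sememe "medical".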
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.