
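Improved word similarity algorithm based on HowNet
Cite this article: WANG Xiao-lin, WANG Yi. Improved word similarity algorithm based on HowNet[J]. Journal of Computer Applications, 2011, 31(11): 3075-3077.
Authors: WANG Xiao-lin  WANG Yi
Affiliation: 1. School of Computer, Anhui University of Technology, Maanshan, Anhui 243002, China; 2. Information Department, Zhoucun District People's Hospital, Zibo, Shandong 255300, China
Funding: National Natural Science Foundation of China; Anhui Provincial Natural Science Foundation for Universities
Abstract: Word similarity computation is widely used in text classification, question answering, machine translation, and text clustering. Research on word similarity generally works at the level of HowNet sememes, computing similarity from the distance between sememes and the depth of each sememe in the hierarchy. Building on that work, this paper proposes an improved word similarity algorithm. First, because the number of sememes of each type differs from one sense to another, a new variable-coefficient method for computing sense similarity is proposed. Second, from the part-of-speech perspective, senses with different parts of speech are taken to contribute differently to word similarity, so combinations of senses with different parts of speech are discarded. Experimental results show that the improved algorithm gives better results than the original, substantially reduces the complexity of the similarity computation, and improves efficiency.

Keywords: word similarity; HowNet; sememe; sense; part of speech
Received: 2011-05-10
Revised: 2011-06-26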

Improved word similarity algorithm based on HowNet
WANG Xiao-lin, WANG Yi. Improved word similarity algorithm based on HowNet[J]. Journal of Computer Applications, 2011, 31(11): 3075-3077.
Authors: WANG Xiao-lin  WANG Yi
Affiliation: 1. School of Computer, Anhui University of Technology, Maanshan, Anhui 243002, China; 2. Information Department, Zhoucun People's Hospital of Zibo City of Shandong Province, Zibo, Shandong 255300, China
Abstract: Word similarity computation is widely used in text classification, question answering, machine translation, and text clustering. Research on this computation is generally based on HowNet sememes and uses the distance between sememes and their depth in the hierarchy. Building on this, an improved method of word similarity computation is proposed. First, a new variable-coefficient method for computing sense similarity is proposed, based on the number of sememes of each type in a sense. Second, part of speech is taken into account: senses with different parts of speech are held to contribute differently to word similarity, so combinations of senses with different parts of speech are removed. The experimental results show that the improved method gives better results with lower computational complexity and higher efficiency.
Keywords: word similarity; HowNet; sememe; sense; part of speech
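
The two ideas in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the paper: the toy sememe tree `PARENT`, the constant `ALPHA`, the distance-based formula in `sememe_sim`, the category names, and the rule that weights each sememe category by how many sememes it contains (standing in for the paper's variable coefficients, whose exact form the abstract does not give).

```python
from dataclasses import dataclass
from itertools import product

ALPHA = 1.6  # smoothing constant; the value is an assumption

# Toy sememe tree (child -> parent). Hypothetical, not real HowNet data.
PARENT = {"thing": None, "animate": "thing", "human": "animate",
          "animal": "animate", "artifact": "thing"}


def path_to_root(sememe):
    """Return the list of nodes from `sememe` up to the root."""
    path = []
    while sememe is not None:
        path.append(sememe)
        sememe = PARENT[sememe]
    return path


def sememe_sim(a, b):
    """Distance-based similarity: ALPHA / (distance + ALPHA)."""
    pa, pb = path_to_root(a), path_to_root(b)
    common = set(pa) & set(pb)
    if not common:
        return 0.0
    # Distance = steps from both nodes up to their nearest common ancestor.
    d = min(pa.index(c) + pb.index(c) for c in common)
    return ALPHA / (d + ALPHA)


@dataclass
class Sense:
    pos: str       # part of speech, e.g. "N" or "V"
    sememes: dict  # hypothetical layout: category name -> list of sememes


def category_sim(xs, ys):
    """Symmetric average of best matches between two sememe lists."""
    if not xs or not ys:
        return 0.0
    fwd = sum(max(sememe_sim(x, y) for y in ys) for x in xs) / len(xs)
    bwd = sum(max(sememe_sim(x, y) for x in xs) for y in ys) / len(ys)
    return (fwd + bwd) / 2


def sense_sim(s1, s2):
    """Weight each category by its share of sememes (assumed rule)."""
    cats = set(s1.sememes) | set(s2.sememes)
    counts = {c: len(s1.sememes.get(c, [])) + len(s2.sememes.get(c, []))
              for c in cats}
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(counts[c] / total
               * category_sim(s1.sememes.get(c, []), s2.sememes.get(c, []))
               for c in cats)


def word_sim(senses1, senses2, same_pos_only=True):
    """Best sense-pair similarity, plus the number of pairs compared."""
    pairs = [(a, b) for a, b in product(senses1, senses2)
             if not same_pos_only or a.pos == b.pos]
    best = max((sense_sim(a, b) for a, b in pairs), default=0.0)
    return best, len(pairs)


w1 = [Sense("N", {"primary": ["human"]}), Sense("V", {"primary": ["artifact"]})]
w2 = [Sense("N", {"primary": ["animal"]})]
filtered = word_sim(w1, w2)                          # same-POS pairs only
unfiltered = word_sim(w1, w2, same_pos_only=False)   # all sense pairs
```

In this toy case the part-of-speech filter halves the number of sense pairs that are compared while leaving the best score unchanged, which is the kind of saving the abstract attributes to dropping mixed part-of-speech combinations.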
This article is indexed in CNKI, Wanfang Data, and other databases.