
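Word Semantic Similarity Computation Based on HowNet and CiLin
Cite this article: ZHU Xinhua, MA Runcong, SUN Liu, CHEN Hongchao. Word Semantic Similarity Computation Based on HowNet and CiLin [J]. Journal of Chinese Information Processing (中文信息学报), 2016, 30(4): 29-36.
Authors: ZHU Xinhua, MA Runcong, SUN Liu, CHEN Hongchao
Affiliation: College of Computer Science & Information Technology, Guangxi Normal University, Guilin, Guangxi 541004, China
Funding: National Natural Science Foundation of China (61363036)
Abstract: This paper proposes a method for computing word semantic similarity that combines HowNet and Tongyici Cilin (CiLin). For HowNet, based on the characteristics of the sememe hierarchy, it adopts an edge-weighting strategy in which weights decrease monotonically along a curve that is flat at the top and steep at the bottom, improving on existing sememe similarity algorithms. For CiLin, it takes word distance as the main factor and uses the number of branch nodes and the branch interval as fine-tuning parameters, improving on existing CiLin word similarity algorithms. Then, according to how the words are distributed across the two resources, a dynamic weighting strategy that considers both HowNet and CiLin computes the final word semantic similarity. The method makes full use of the semantic information that words carry in HowNet and CiLin, greatly expands the range of words for which similarity can be computed, and also improves the accuracy of word similarity computation.

Keywords: semantic similarity; HowNet; Tongyici Cilin (CiLin); semantic distance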

Word Semantic Similarity Computation Based on HowNet and CiLin
ZHU Xinhua, MA Runcong, SUN Liu, CHEN Hongchao. Word Semantic Similarity Computation Based on HowNet and CiLin [J]. Journal of Chinese Information Processing, 2016, 30(4): 29-36.
Authors: ZHU Xinhua, MA Runcong, SUN Liu, CHEN Hongchao
Affiliation: College of Computer Science & Information Technology, Guangxi Normal University, Guilin, Guangxi 541004, China
Abstract: This paper proposes a word semantic similarity computation method based on HowNet and CiLin. In the HowNet part, following the characteristics of the sememe hierarchy, edge weights decrease monotonically along a curve that is flat at the top and steep at the bottom. In the CiLin part, the distance between words is the main factor, and the number of branch nodes and the branch interval serve as fine-tuning parameters. Then, according to the distribution of the words, a dynamic weighting strategy that considers both HowNet and CiLin is used to calculate the final similarity, which greatly expands the range of computable words and improves the accuracy of word similarity computation.
Keywords:semantic similarity  HowNet  CiLin  semantic distance  
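
The abstract names two ideas without giving formulas: an edge weight that falls slowly near the top of the sememe hierarchy and quickly near the bottom, and a combination rule that depends on which resource contains the words. The sketch below is only an illustration of those two ideas under stated assumptions, not the paper's algorithm. The quadratic curve, the parameters `w_max`, `w_min`, and `alpha`, and the function names `edge_weight` and `combined_similarity` are all hypothetical; the paper's actual formulas and weights are not given in this record.

```python
from typing import Callable, Optional, Set

SimFn = Callable[[str, str], float]


def edge_weight(depth: int, max_depth: int,
                w_max: float = 1.0, w_min: float = 0.1) -> float:
    """Hypothetical edge weight for an edge at a given depth.

    Assumed quadratic curve: monotonically decreasing in depth,
    nearly flat near the root and steepest near the leaves.
    """
    ratio = depth / max_depth
    return w_max - (w_max - w_min) * ratio ** 2


def combined_similarity(w1: str, w2: str,
                        hownet_vocab: Set[str], cilin_vocab: Set[str],
                        hownet_sim: SimFn, cilin_sim: SimFn,
                        alpha: float = 0.5) -> Optional[float]:
    """Combine two similarity scores according to word coverage.

    Assumed rule: if both words are in both resources, take a weighted
    average; if only one resource covers both words, use that resource
    alone; otherwise no score can be computed.
    """
    in_hownet = w1 in hownet_vocab and w2 in hownet_vocab
    in_cilin = w1 in cilin_vocab and w2 in cilin_vocab
    if in_hownet and in_cilin:
        return alpha * hownet_sim(w1, w2) + (1 - alpha) * cilin_sim(w1, w2)
    if in_hownet:
        return hownet_sim(w1, w2)
    if in_cilin:
        return cilin_sim(w1, w2)
    return None
```

Falling back to a single resource when the other does not cover the word pair is what lets a combined method score more word pairs than either resource alone, which matches the abstract's claim about expanding the computable range.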