
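Similarity Calculation between Dependency Relation and Tongyici Cilin
Cite this article: FU Peng-bin, CHEN Shuai-shuai, YANG Hui-rong, LI Jian-jun. Similarity Calculation between Dependency Relation and Tongyici Cilin[J]. Computer Technology and Development, 2020(1): 13-18.
Authors: FU Peng-bin, CHEN Shuai-shuai, YANG Hui-rong, LI Jian-jun
Affiliation: Faculty of Information Technology, Beijing University of Technology
Funding: Beijing Natural Science Foundation project (4153058)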
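Abstract: This paper designs a semantic similarity calculation method that combines dependency relations with Tongyici Cilin (a Chinese synonym thesaurus). The method extracts the relationship paths of each of two texts from their dependency relations and computes the semantic similarity between the two texts' relationship paths based on Tongyici Cilin. To compute the semantic similarity between two texts, the Language Technology Platform (LTP) is used to segment the Chinese text and obtain its dependency graph, from which relationship paths are extracted; the relationship paths and Tongyici Cilin are then combined to calculate the semantic similarity of the two texts. The experiments yield an average deviation rate of 13.83%. The results show that the combined method is somewhat more accurate than semantic similarity based on Tongyici Cilin alone or on dependency relations alone.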

Keywords: dependency relation; Tongyici Cilin; semantic similarity; relationship path; average deviation rate

Similarity Calculation between Dependency Relation and Tongyici Cilin
FU Peng-bin, CHEN Shuai-shuai, YANG Hui-rong, LI Jian-jun. Similarity Calculation between Dependency Relation and Tongyici Cilin[J]. Computer Technology and Development, 2020(1): 13-18.
Authors: FU Peng-bin, CHEN Shuai-shuai, YANG Hui-rong, LI Jian-jun
Affiliation: Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
Abstract: We present a method for calculating semantic similarity that combines dependency relations with Tongyici Cilin. The method extracts the relationship paths of two texts from their dependency relations and calculates the semantic similarity between those paths based on Tongyici Cilin. To compute the semantic similarity between two texts, we use the Language Technology Platform (LTP) to segment the Chinese text and obtain its dependency graph, extract relationship paths from the graph, and then combine the relationship paths with Tongyici Cilin to calculate the similarity. In the experiment, the average deviation rate is 13.83%, which shows that the combined method is more accurate than methods based on Tongyici Cilin alone or on dependency relations alone.
Keywords: dependency relation; Tongyici Cilin; semantic similarity; relationship path; average deviation rate
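
The abstract does not give the paper's formulas, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each text has already been parsed into dependency arcs (written by hand here instead of calling LTP), scores word pairs by how many levels of their five-level Tongyici Cilin codes they share, and averages the best-matching arcs between the two texts. The lexicon `TOY_CILIN`, its codes, and the functions `word_sim`, `arc_sim`, and `text_sim` are all hypothetical names and data introduced for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Prefix lengths that close each of the five levels of an 8-character
# Cilin-style code (e.g. "Aa01A01=": A | a | 01 | A | 01, plus a flag).
LEVEL_ENDS = (1, 2, 4, 5, 7)

# Hypothetical toy lexicon: the codes are illustrative, NOT real Cilin entries.
TOY_CILIN: Dict[str, str] = {
    "喜欢": "Gb09A01=",
    "热爱": "Gb09A02=",
    "苹果": "Bh07A01=",
    "香蕉": "Bh07A02=",
    "我": "Aa02A01=",
    "他": "Aa03A01=",
}

# A dependency arc: (head word, relation label, dependent word).
Arc = Tuple[str, str, str]


def word_sim(w1: str, w2: str, lexicon: Optional[Dict[str, str]] = None) -> float:
    """Fraction of code levels shared by two words (assumed scoring rule)."""
    lexicon = TOY_CILIN if lexicon is None else lexicon
    if w1 == w2:
        return 1.0
    c1, c2 = lexicon.get(w1), lexicon.get(w2)
    if c1 is None or c2 is None:
        return 0.0  # out-of-lexicon words are treated as unrelated
    shared = 0
    for end in LEVEL_ENDS:
        if c1[:end] != c2[:end]:
            break
        shared += 1
    return shared / len(LEVEL_ENDS)


def arc_sim(a: Arc, b: Arc) -> float:
    """Arcs are comparable only when their relation labels match."""
    if a[1] != b[1]:
        return 0.0
    return (word_sim(a[0], b[0]) + word_sim(a[2], b[2])) / 2


def text_sim(arcs_a: List[Arc], arcs_b: List[Arc]) -> float:
    """Symmetric average of each arc's best match in the other text."""
    if not arcs_a or not arcs_b:
        return 0.0

    def directed(src: List[Arc], dst: List[Arc]) -> float:
        return sum(max(arc_sim(s, d) for d in dst) for s in src) / len(src)

    return (directed(arcs_a, arcs_b) + directed(arcs_b, arcs_a)) / 2


# Hand-written arcs for "我喜欢苹果" and "他热爱香蕉"; SBV and VOB are
# LTP's subject-verb and verb-object relation labels.
text_a: List[Arc] = [("喜欢", "SBV", "我"), ("喜欢", "VOB", "苹果")]
text_b: List[Arc] = [("热爱", "SBV", "他"), ("热爱", "VOB", "香蕉")]

print(round(text_sim(text_a, text_b), 2))
```

Requiring matching relation labels is what lets the dependency structure influence the score: two texts that share vocabulary but use it in different grammatical roles score lower than they would under a purely word-based comparison.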
This article is indexed in VIP (维普) and other databases.