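Research on Sentence Semantic Similarity Based on DTW and Improved Hungarian Algorithm
Cite this article: NIU Yan, LI Xing, LI Jun, LIU Yuqiang, Jepkemei Judith. Research on Sentence Semantic Similarity Based on DTW and Improved Hungarian Algorithm[J]. Computer and Digital Engineering, 2021, 49(2): 242-247.
Authors: NIU Yan  LI Xing  LI Jun  LIU Yuqiang  Jepkemei Judith
Affiliation: Hubei University of Technology, Wuhan 430068 (all five authors)
Abstract: Sentence semantic similarity plays an important role in natural language processing and related fields. To address two problems in existing work on Chinese sentence similarity, namely that semantic features are hard to analyze and that word order affects the result, this paper proposes a sentence semantic similarity model that combines DTW with the Hungarian algorithm. The model first trains a Word2vec deep learning model on a Baidu news corpus to obtain a dictionary of 200-dimensional word vectors that carry semantic features, and builds a word vector space. Each sentence is treated as a curve in this multi-dimensional space formed by its word vectors, and sentence semantic similarity is expressed through the distance and complexity of transforming one sentence curve into the other. The model uses a DTW matrix and an improved Hungarian algorithm, and performs shortest-path planning on the DTW matrix. Experimental results show that, compared with existing sentence similarity measures such as cosine similarity, the method still yields fairly accurate similarity values when word order is scrambled but the meaning is similar.

Keywords: word vector  DTW  Hungarian algorithm  semantic similarity  semantic feature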

Research on Sentence Semantic Similarity Based on DTW and Improved Hungarian Algorithm
NIU Yan, LI Xing, LI Jun, LIU Yuqiang, Jepkemei Judith. Research on Sentence Semantic Similarity Based on DTW and Improved Hungarian Algorithm[J]. Computer and Digital Engineering, 2021, 49(2): 242-247.
Authors:NIU Yan  LI Xing  LI Jun  LIU Yuqiang  Jepkemei Judith
Affiliation:(Hubei University of Technology,Wuhan 430068)
Abstract: The study of sentence semantic similarity plays an important role in natural language processing. To address the difficulty of analyzing semantic features and the influence of word order in existing work on Chinese sentence similarity, a sentence semantic similarity model based on DTW and the Hungarian algorithm is proposed. The model first uses the Word2vec deep learning model to train on a Baidu news corpus, obtains a 200-dimensional word vector dictionary containing semantic features, and establishes a word vector space. Treating each sentence as a multi-dimensional space curve composed of word vectors, it represents sentence semantic similarity by the distance and complexity of converting one sentence curve into the other. The model uses the DTW matrix and an improved Hungarian algorithm, and performs shortest-path planning on the DTW matrix. The experimental results show that, compared with existing sentence similarity calculation methods such as cosine similarity, the method obtains more accurate similarity values when the word order is scrambled but the semantics are similar.
Keywords:word vector  DTW  Hungarian algorithm  semantic similarity  semantic feature
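
The abstract names two building blocks, DTW over sequences of word vectors and an assignment step solved with the Hungarian algorithm, but it does not say how the authors improve the Hungarian algorithm or how the two scores are combined. The sketch below is therefore only an illustration under stated assumptions, not the paper's method: it uses hypothetical 2-dimensional toy vectors in place of the 200-dimensional Word2vec vectors, cosine distance as the word-to-word cost, textbook DTW, and a brute-force minimum-cost matching that returns the same optimum the Hungarian algorithm would find on small inputs.

```python
import math
from itertools import permutations


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0  # assumption: treat a zero vector as maximally distant
    return 1.0 - dot / (norm_u * norm_v)


def dtw_distance(seq_a, seq_b, dist=cosine_distance):
    """Textbook DTW: cost of the cheapest order-preserving alignment."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = best alignment cost of seq_a[:i] with seq_b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # advance in seq_a only
                                 cost[i][j - 1],      # advance in seq_b only
                                 cost[i - 1][j - 1])  # advance in both
    return cost[n][m]


def assignment_distance(seq_a, seq_b, dist=cosine_distance):
    """Minimum-cost one-to-one word matching that ignores word order.

    Brute force over permutations, used here as a stand-in for the
    Hungarian algorithm; only practical for very short sentences.
    """
    short, long_ = sorted((seq_a, seq_b), key=len)
    best = float("inf")
    for perm in permutations(range(len(long_)), len(short)):
        total = sum(dist(short[i], long_[j]) for i, j in enumerate(perm))
        best = min(best, total)
    return best


if __name__ == "__main__":
    # Hypothetical toy "sentences": the same three word vectors in reverse order.
    sentence_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    sentence_b = list(reversed(sentence_a))
    print("DTW cost:", dtw_distance(sentence_a, sentence_b))
    print("assignment cost:", assignment_distance(sentence_a, sentence_b))
```

On this toy input the order-preserving DTW cost is positive while the order-free assignment cost is approximately zero, which illustrates why an assignment step can help when word order is scrambled but the words themselves match.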
This article is indexed in databases including VIP (维普) and Wanfang Data (万方数据).