Chinese Short Text Similarity Calculation Based on TextRank Algorithm
Citation: LU Jiawei, CHEN Wei, YIN Zhong. Chinese Short Text Similarity Calculation Based on TextRank Algorithm[J]. Electronic Science and Technology, 2020, 33(10): 51-56. DOI: 10.16180/j.cnki.issn1007-7820.2020.10.009
Authors: LU Jiawei, CHEN Wei, YIN Zhong
Affiliation: School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
Funding: National Natural Science Foundation of China (61703277)
Abstract: The traditional vector space model (VSM) ignores text semantics, and the text feature matrix it constructs is sparse. Building on deep-learning word vector techniques, this paper proposes a similarity calculation method that incorporates an improved TextRank algorithm. The method uses word embeddings to build the text vector space, so that the resulting vector space model carries semantic relatedness. At the same time, the improved TextRank algorithm extracts text keywords, which strengthens the expression of text features, eliminates a large amount of redundant information, and reduces the sparsity of the text feature matrix, making the text similarity calculation more efficient. Simulation experiments with different models show that combining the improved TextRank algorithm with BERT word vectors gives better text similarity calculation performance.

Keywords: text similarity, extraction, TextRank algorithm, BERT, word vector technique, vector space model
Received: 2019-07-21
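
The pipeline outlined in the abstract (extract keywords with TextRank, represent each text by word vectors of those keywords, then compare the vectors) can be illustrated with a minimal Python sketch. This is not the authors' implementation: it uses plain TextRank on a co-occurrence graph rather than the improved variant, a hypothetical hand-made `TOY_EMBEDDINGS` table in place of BERT vectors, and it assumes the Chinese texts have already been segmented into tokens. All function names are illustrative.

```python
import math
from collections import defaultdict


def textrank_keywords(tokens, top_k=3, window=2, damping=0.85, iterations=50):
    """Rank tokens with plain TextRank on an undirected co-occurrence graph."""
    neighbors = defaultdict(set)
    for i, word in enumerate(tokens):
        # Link each token to the distinct tokens that follow it within the window.
        for other in tokens[i + 1 : i + 1 + window]:
            if other != word:
                neighbors[word].add(other)
                neighbors[other].add(word)
    scores = {word: 1.0 for word in neighbors}
    for _ in range(iterations):
        # Each node collects an equal share of every neighbor's current score.
        scores = {
            word: (1 - damping)
            + damping
            * sum(scores[n] / len(neighbors[n]) for n in sorted(neighbors[word]))
            for word in neighbors
        }
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:top_k]


def text_vector(words, embeddings):
    """Average the embeddings of the words that have one (empty list if none)."""
    known = [embeddings[w] for w in words if w in embeddings]
    if not known:
        return []
    dim = len(known[0])
    return [sum(vec[j] for vec in known) / len(known) for j in range(dim)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 for empty or zero-length vectors."""
    if not u or not v:
        return 0.0
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def keyword_similarity(tokens_a, tokens_b, embeddings, top_k=3):
    """Compare two tokenized texts through the mean vector of their keywords."""
    vec_a = text_vector(textrank_keywords(tokens_a, top_k), embeddings)
    vec_b = text_vector(textrank_keywords(tokens_b, top_k), embeddings)
    return cosine(vec_a, vec_b)


# Hypothetical 3-dimensional embeddings, invented for illustration only.
TOY_EMBEDDINGS = {
    "手机": [0.9, 0.1, 0.0],
    "电话": [0.8, 0.2, 0.1],
    "价格": [0.1, 0.9, 0.0],
    "便宜": [0.2, 0.8, 0.1],
    "质量": [0.3, 0.3, 0.6],
    "天气": [0.0, 0.1, 0.9],
    "晴朗": [0.1, 0.0, 0.8],
}

text_a = ["手机", "价格", "便宜", "手机", "质量"]
text_b = ["电话", "价格", "便宜", "电话", "质量"]
text_c = ["天气", "晴朗", "天气", "晴朗"]

print(keyword_similarity(text_a, text_b, TOY_EMBEDDINGS))
print(keyword_similarity(text_a, text_c, TOY_EMBEDDINGS))
```

Restricting each text to its top-ranked keywords before averaging is what keeps the representation compact in this sketch; a real system would swap the toy table for vectors from a pretrained model and add a word segmentation step.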