TextRank Keyword Extraction Algorithm Based on Rough Data-Deduction
Original title (Chinese):基于粗糙数据推理的TextRank关键词提取算法
Cite as:ZHOU Ning, SHI Wenqian, ZHU Zhaozhao. TextRank Keyword Extraction Algorithm Based on Rough Data-Deduction[J]. Journal of Chinese Information Processing (中文信息学报), 2020, 34(9): 44-52.
Authors:ZHOU Ning (周宁)  SHI Wenqian (石雯茜)  ZHU Zhaozhao (朱昭昭)
Affiliation:School of Electronics and Information Engineering, Lanzhou Jiaotong University, Lanzhou, Gansu 730070, China
Funding:National Natural Science Foundation of China (61650207, 61841303); Humanities and Social Sciences Fund of the Ministry of Education (19YJC760012)
Abstract:TextRank, a graph-based algorithm, is an effective keyword extraction method that achieves fairly high accuracy. However, when it constructs the edges of the graph, the co-occurrence window rule it uses considers only associations between nearby words, and the choice of edges is therefore rather arbitrary and uncertain. To address this problem, this paper proposes an improved TextRank keyword extraction algorithm based on rough data-deduction theory. Rough data-deduction widens the scope of association and adds associated data, so the results are more comprehensive. Drawing on the association rules of rough data-deduction theory, the proposed algorithm makes the following improvements: candidate keywords are first partitioned according to word meaning, and the associations between candidate words in different classes are then deduced by rough data-deduction. Experimental results show that, compared with the traditional TextRank algorithm, the improved algorithm achieves clearly higher extraction precision, demonstrating that rough data-deduction can effectively improve keyword extraction performance.
Keywords:rough data-deduction  keyword extraction  association rule  TextRank algorithm
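
For orientation, the baseline that the abstract criticizes can be sketched as follows. This is a minimal illustration of conventional TextRank with a co-occurrence window, not the authors' improved algorithm: the rough data-deduction step is not specified in the abstract and is not reproduced here. The function names (build_cooccurrence_graph, textrank, top_keywords), the unweighted undirected graph, the window convention, and the damping factor 0.85 are all assumptions made for this example.

```python
from collections import defaultdict


def build_cooccurrence_graph(words, window=2):
    """Link two distinct words when they occur within `window` positions.

    Assumed convention: window=1 links only adjacent words.
    """
    graph = defaultdict(set)
    for i, word in enumerate(words):
        graph[word]  # make sure every word is a node, even without edges
        for j in range(i + 1, min(i + window + 1, len(words))):
            other = words[j]
            if other != word:
                graph[word].add(other)
                graph[other].add(word)
    return graph


def textrank(graph, damping=0.85, max_iter=100, tol=1e-8):
    """Iterate the unweighted TextRank update until scores stop changing."""
    scores = {node: 1.0 for node in graph}
    for _ in range(max_iter):
        new_scores = {}
        for node in graph:
            # Each neighbour shares its score equally among its own neighbours.
            incoming = sum(scores[nb] / len(graph[nb]) for nb in graph[node])
            new_scores[node] = (1.0 - damping) + damping * incoming
        delta = max(abs(new_scores[n] - scores[n]) for n in graph)
        scores = new_scores
        if delta < tol:
            break
    return scores


def top_keywords(words, k=3, window=2):
    """Return the k highest-scoring words as candidate keywords."""
    scores = textrank(build_cooccurrence_graph(words, window))
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

With the token list ["rough", "data", "deduction", "rough", "data", "keyword"] and window=1, "keyword" is linked only to "data" and never to "rough" or "deduction", because edges depend purely on position. This is the local-association limitation that the abstract says the improved algorithm targets by deducing associations between candidate words in different meaning classes.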