首页 | 本学科首页   官方微博 | 高级检索  
     

知识图谱划分算法研究综述
引用本文:王鑫,陈蔚雪,杨雅君,张小旺,冯志勇. 知识图谱划分算法研究综述[J]. 计算机学报, 2021, 44(1): 235-260. DOI: 10.11897/SP.J.1016.2021.00235
作者姓名:王鑫  陈蔚雪  杨雅君  张小旺  冯志勇
作者单位:天津大学智能与计算学部 天津300354;天津市认知计算与应用重点实验室 天津300354
基金项目:国家自然科学基金;本课题得到国家重点研发计划项目
摘    要:知识图谱是人工智能的重要基石,因其包含丰富的图结构和属性信息而受到广泛关注.知识图谱可以精确语义描述现实世界中的各种实体及其联系,其中顶点表示实体,边表示实体间的联系.知识图谱划分是大规模知识图谱分布式处理的首要工作,对知识图谱分布式存储、查询、推理和挖掘起基础支撑作用.随着知识图谱数据规模及分布式处理需求的不断增长,...

关 键 词:知识图谱  图划分  多级划分  流划分  分布式

Research on Knowledge Graph Partitioning Algorithms:A Survey
WANG Xin,CHEN Wei-Xue,YANG Ya-Jun,ZHANG Xiao-Wang,FENG Zhi-Yong. Research on Knowledge Graph Partitioning Algorithms:A Survey[J]. Chinese Journal of Computers, 2021, 44(1): 235-260. DOI: 10.11897/SP.J.1016.2021.00235
Authors:WANG Xin  CHEN Wei-Xue  YANG Ya-Jun  ZHANG Xiao-Wang  FENG Zhi-Yong
Affiliation:(College of Intelligence and Computing,Tianjin University,Tianjin 300354;Tianjin Key Laboratory of Cognitive Computing and Application,Tianjin 300354)
Abstract:Knowledge graphs are important cornerstones of artificial intelligence,which not only have the graph structure of the general graph model but also contain rich attribute information of vertices and edges.Therefore,knowledge graphs have received widespread attention in practical tasks.Knowledge graphs can use accurate semantics to describe the various entities and their connections in the real world,where the vertices represent entities,and the edges represent connections among these entities.Knowledge graph partitioning is a fundamental task for distributed processing of large-scale knowledge graphs,and it is the essential support for a series of graph processing operations including distributed storage,query processing,reasoning,and data mining.With the increasing scale of knowledge graphs and more requirements of distributed processing,it is difficult to partition large-scale knowledge graphs,which has become a hot issue in the current research on knowledge graphs.Starting from the definitions of knowledge graphs and graph partitioning,we systematically introduce various algorithms currently available for knowledge graph partitioning,including basic graph partitioning algorithms,multilevel graph partitioning algorithms,streaming graph partitioning algorithms,distributed graph partitioning algorithms,and other types of graph partitioning algorithms.First,four basic graph partitioning algorithms are reviewed,including spectral partitioning algorithms,geometric partitioning algorithms,branch and bound algorithms,KL and its derivative algorithms.This type of algorithms usually works on small-scale graphs or as subroutines for more sophisticated algorithms.Second,two multilevel graph partitioning algorithms are introduced,one of which uses the matching-based algorithm during the coarsening phase,and the other uses the aggregation-based algorithm.After coarsening,they partition the much smaller graph and then project the result back to the original graph.Third,streaming graph partitioning algorithms are described,which load a graph as a sequence of vertices or edges and assign each element to the corresponding partitions.The streaming graph partitioning algorithms can be further divided into three subcategories:the hash algorithm,the greedy algorithm,the Fennel algorithm,and their derivative algorithms.Fourth,we introduce distributed graph partitioning algorithms for partitioning large-scale graphs in distributed environments.We choose three representative algorithms,namely the KaPPa algorithm,the JA-BE-JA algorithm,the lightweight repartitioning algorithm,and then describe derivative algorithms of these distributed algorithms.Meanwhile,as the other types of graph partitioning algorithms,we present two recent graph partitioning algorithms,one is the label propagation algorithm,and the other is the query workload-based algorithm.By conducting extensive experiments on both synthetic and real knowledge graph datasets,we compare the performance of five representative knowledge graph partitioning algorithms in partitioning effects,query processing,and graph data mining.We analyze the experimental results in detail and draw general conclusions,and extend them to the reasoning tasks.The performance evaluation conclusions of knowledge graph partitioning algorithms are obtained based on these experiments and analyses.Finally,based on the analysis and comparison of existing methods,we summarize the main challenges in the current research on knowledge graph partitioning,propose the corresponding issues that can be investigated,and look forward to the future research directions on this topic.
Keywords:knowledge graph  graph partitioning  multilevel partitioning  streaming partitioning  distribution
本文献已被 维普 万方数据 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号