首页 | 本学科首页   官方微博 | 高级检索  
     

三角形的并行枚举算法
引用本文:王卓,索勃,潘巍. 三角形的并行枚举算法[J]. 计算机应用, 2017, 37(12): 3397-3400. DOI: 10.11772/j.issn.1001-9081.2017.12.3397
作者姓名:王卓  索勃  潘巍
作者单位:西北工业大学 计算机学院, 西安 710129
基金项目:中国科技部国家重点研发计划项目(2016YFB1000703);国家863重大项目(2015AA015307);国家自然科学基金重点项目(61332006);国家自然科学基金面上项目(61672432,61472321);国家自然科学基金青年项目(61502390)。
摘    要:经典GT算法是三角形并行枚举算法的MapReduce实现,然而该算法只能枚举全图的三角形结构,对部分顶点构成的三角形结构无法直接进行枚举。针对此问题,提出一种直接枚举部分顶点构成三角形结构的并行算法。首先,通过分析被选点的分布,给出被选点构成三角形的所有组合集合;然后,通过对该集合的筛选,实现对部分点构成三角形结构的直接枚举;最后,将该算法在Spark系统实现,以实现该算法的高效性和广泛性。在人工生成数据集和真实数据集上与GT算法进行对比实验,实验结果表明,所提改进算法的运行时间只有GT算法运行时间的1/3,在Spark上的运行时间仅是Hadoop上运行时间的1/7。该算法可用于更高效地直接生成图中任意点所构成的三角形数据集。

关 键 词:三角形枚举  大规模图数据  MapReduce  部分点枚举  Spark  
收稿时间:2017-07-04
修稿时间:2017-09-05

Parallel algorithm for triangle enumeration
WANG Zhuo,SUO Bo,PAN Wei. Parallel algorithm for triangle enumeration[J]. Journal of Computer Applications, 2017, 37(12): 3397-3400. DOI: 10.11772/j.issn.1001-9081.2017.12.3397
Authors:WANG Zhuo  SUO Bo  PAN Wei
Affiliation:School of Computer Science, Northwestern Polytechnical University, Xi'an Shaanxi 710129, China
Abstract:The classical Graph Twiddling (GT) algorithm is the MapReduce implementation of triangle parallel enumeration algorithm. However, the GT algorithm can only enumerate the triangle structure of whole graph and can not enumerate the triangle structure of candidate vertexes directly. To solve the problem, a parallel algorithm was proposed for directly enumerating the triangle structure of candidate vertexes. Firstly, the set of all the combinations of candidate vertexes for forming triangle was given by analyzing the distribution of candidate vertexes. Then, through the screening of the set, the triangle structure of candidate vertexes was directly enumerated. Finally, the proposed algorithm was implemented on Spark to achieve high efficiency and popularity. The contrast experiment was completed on artificial datasets and real datasets. The experimental results show that, compared with the GT algorithm, the running time of the proposed algorithm is only 1/3 of the running time of GT algorithm, and the running time on Spark is only 1/7 of the running time on Hadoop. The proposed algorithm can be used to generate the triangle dataset of any candidate vertex directly and efficiently.
Keywords:triangle enumeration   large-scale graph data   MapReduce   candidate vertex enumeration   Spark
点击此处可从《计算机应用》浏览原始摘要信息
点击此处可从《计算机应用》下载全文
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号