首页 | 本学科首页   官方微博 | 高级检索  
     

基于MapReduce的图结构聚类算法
引用本文:张伟鹏,李振军,李荣华,刘宇鸿,毛睿,乔少杰.基于MapReduce的图结构聚类算法[J].软件学报,2018,29(3):627-641.
作者姓名:张伟鹏  李振军  李荣华  刘宇鸿  毛睿  乔少杰
作者单位:深圳大学计算机软件学院, 广东深圳 518060,深圳大学计算机软件学院, 广东深圳 518060,深圳大学计算机软件学院, 广东深圳 518060,深圳大学计算机软件学院, 广东深圳 518060,深圳大学计算机软件学院, 广东深圳 518060,成都信息工程大学网络空间安全学院, 四川成都 61022
基金项目:国家自然科学基金项目(61402292,61772091);国家自然科学基金广东省联合基金项目(U1301252);教育部人文社会科学研究规划基金(15YJAZH058)
摘    要:图结构聚类(SCAN)是一种著名的基于密度的图聚类算法。该算法不仅能够找到图中的聚类结构,而且还能发现图中的Hub节点和离群节点。然而,随着图数据规模越来越大,传统的SCAN算法的复杂度为O(m1.5)(m为图中边的条数),因此很难处理大规模的图数据。为了解决SCAN算法的可扩展性问题,本文提出了一种新颖的基于MapReduce的海量图结构聚类算法MRSCAN。具体地,我们提出了一种计算核心节点,以及两种合并聚类的MapReduce算法。最后,在多个真实的大规模图数据集上进行实验测试,实验结果验证了算法的准确性、有效性,以及可扩展性。

关 键 词:图数据  并行计算模型  MapReduce  图结构聚类
收稿时间:2017/8/3 0:00:00
修稿时间:2017/9/5 0:00:00

MapReduce-Based Graph Structural Clustering Algorithm
ZHANG Wei-Peng,LI Zhen-Jun,LI Rong-Hu,LIU Yu-Hong,MAO Rui and QIAO Shao-Jie.MapReduce-Based Graph Structural Clustering Algorithm[J].Journal of Software,2018,29(3):627-641.
Authors:ZHANG Wei-Peng  LI Zhen-Jun  LI Rong-Hu  LIU Yu-Hong  MAO Rui and QIAO Shao-Jie
Affiliation:College of Computer Science & Software Engineering, Shenzhen University, Shenzhen,College of Computer Science & Software Engineering, Shenzhen University, Shenzhen,College of Computer Science & Software Engineering, Shenzhen University, Shenzhen,College of Computer Science & Software Engineering, Shenzhen University, Shenzhen,College of Computer Science & Software Engineering, Shenzhen University, Shenzhen and Chengdu University of Information Technology, Chengdu
Abstract:Graph Clustering is a fundamental task for graph mining, which has been widely used in social network analysis related applications.Graph structural clustering (SCAN) is a well-known density-based graph clustering algorithm. The SCAN algorithm can not only find the clusters in a graph, but it also able to identify hub nodes and outliers. However, with the growing graph size, the traditional SCAN algorithm is very hard to handle massive graph data, as its time complexity is O(m1.5) (m is the number of edges in the graph). To overcome the scalability issue of the SCAN algorithm, we propose a MapReduce-based graph structural clustering algorithm, called MRSCAN. Specifically, we develop a MapReduce-based similarity computation, core node computation, as well as two clustering merging algorithms. Finally, we conduct extensive experiments over serval real-world graph datasets, and results demonstrate the accuracy, effectiveness, and scalability of our algorithm.
Keywords:Graph Data  Parallel Computing Model  MapReduce  Structural Graph Clustering
点击此处可从《软件学报》浏览原始摘要信息
点击此处可从《软件学报》下载全文
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号