首页 | 本学科首页   官方微博 | 高级检索  
     


Efficient decomposition of strongly connected components on GPUs
Affiliation:1. School of Computer Science & Technology, Huazhong University of Science & Technology, China;2. School of Mathematics & Computer Science, Wuhan Polytechnic University, China;1. Aix Marseille Université, CNRS, Centrale Marseille, I2M, UMR 7373, 13453 Marseille, France;2. Laboratoire d’Informatique, École polytechnique, France;3. University of Wisconsin-Madison, Department of Mathematics, USA;4. University of Birmingham, School of Mathematics, UK;1. Department of Computer Science and Communication Engineering, Providence University, Taichung, Taiwan;2. Department of Computer Science and Information Engineering, Chang Gung University, Taoyuan, Taiwan;3. Department of Computer Science and Information Management, Providence University, Taichung, Taiwan
Abstract:The GPU (Graphics Processing Unit) has recently become one of the most power efficient processors in embedded and many other environments, and has been integrated into more and more SoCs (System on Chip). Thus modern GPUs play a very important role in power aware computing. Strongly Connected Component (SCC) decomposition is a fundamental graph algorithm which has wide applications in model checking, electronic design automation, social network analysis and other fields. GPUs have been shown to have great potential in accelerating many types of computations including graph algorithms. Recent work have demonstrated the plausibility of GPU SCC decomposition, but the implementation is inefficient due to insufficient consideration of the distinguishing GPU programming model, which leads to poor performance on irregular and sparse graphs.This paper presents a new GPU SCC decomposition algorithm that focuses on full utilization of the contemporary embedded and desktop GPU architecture. In particular, a subgraph numbering scheme is proposed to facilitate the safe and efficient management of the subgraph IDs and to serve as the basis of efficient source selection. Furthermore, we adopt a multi-source partition procedure that greatly reduces the recursion depth and use a vertex labeling approach that can highly optimize the GPU memory access. The evaluation results show that the proposed approach achieves up to 41× speedup over Tarjan’s algorithm, one of the most efficient sequential SCC decomposition algorithms, and up to 3.8× speedup over the previous GPU algorithms.
Keywords:SCC decomposition  GPU  Power efficiency  Parallel algorithms
本文献已被 ScienceDirect 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号