首页 | 本学科首页   官方微博 | 高级检索  
     


Efficient breadth first search on multi-GPU systems
Authors:Enrico Mastrostefano  Massimo Bernaschi
Affiliation:1. Sapienza University, Rome, Italy;2. Istituto per le Applicazioni del Calcolo, IAC-CNR, Rome, Italy
Abstract:Simple algorithms for the execution of a Breadth First Search on large graphs lead, running on clusters of GPUs, to a situation of load unbalance among threads and un-coalesced memory accesses, resulting in pretty low performances. To obtain a significant improvement on a single GPU and to scale by using multiple GPUs, we resort to a suitable combination of operations to rearrange data before processing them. We propose a novel technique for mapping threads to data that achieves a perfect load balance by leveraging prefix-sum and binary search operations. To reduce the communication overhead, we perform a pruning operation on the set of edges that needs to be exchanged at each BFS level. The result is an algorithm that exploits at its best the parallelism available on a single GPU and minimizes communication among GPUs. We show that a cluster of GPUs can efficiently perform a distributed BFS on graphs with billions of nodes.
Keywords:GPU  CUDA  BFS  Distributed algorithm  Large graphs  Graph 500 benchmark
本文献已被 ScienceDirect 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号