Fast Four‐Way Parallel Radix Sorting on GPUs |
| |
Authors: | Linh Ha, Jens Krüger, Cláudio T. Silva |
| |
Affiliation: | Scientific Computing and Imaging Institute, University of Utah, Salt Lake City, UT, USA {lha@sci.utah.edu, jens@sci.utah.edu, csilva@sci.utah.edu} |
| |
Abstract: | Efficient sorting is a key requirement for many computer science algorithms. Accelerating existing techniques and developing new sorting approaches are crucial for many real‐time graphics scenarios, database systems, and numerical simulations, to name just a few. Sorting is one of the most fundamental operations for organizing and filtering the ever‐growing amounts of data gathered on a daily basis. While optimal sorting models for serial execution on a single processor exist, efficient parallel sorting remains a challenge. In this paper, we present a hardware‐optimized parallel implementation of the radix sort algorithm that yields a significant speedup over existing sorting implementations. We outperform all known Graphics Processing Unit (GPU) based sorting systems by about a factor of two and eliminate restrictions on the sorting key space. This makes our algorithm not only the fastest, but also the first general GPU sorting solution. |
| |
Keywords: | HPC, GPGPU, GPU sorting, collision detection, parallel sorting; I.3.1: Parallel processing on Graphics Processing Units; G.1.0: Parallel algorithms |
|
|