A load-balanced parallel sorting algorithm for shared-nothing architectures |
| |
Authors: | Anil Kumar Tony T Lee Vassilis J Tsotras |
| |
Affiliation: | (1) School of Electrical Engineering and Computer Science, Polytechnic University, 11201 Brooklyn, NY |
| |
Abstract: | With the popularity of parallel database machines based on the shared-nothing architecture, it has become important to find external sorting algorithms which lead to a load-balanced computation, i.e., balanced execution, communication and output. If during the course of the sorting algorithm each processor is equally loaded, parallelism is fully exploited. Similarly, balanced communication will not congest the network traffic. Since sorting can be used to support a number of other relational operations (joins, duplicate elimination, building indexes etc.) data skew produced by sorting can further lead to execution skew at later stages of these operations. In this paper we present a load-balanced parallel sorting algorithm for shared-nothing architectures. It is a multiple-input multiple-output algorithm with four stages, based on a generalization of Batcher's odd-even merge. At each stage then keys are evenly distributed among thep processors (i.e., there is no final sequential merge phase) and the distribution of keys between stages ensures against network congestion. There is no assumption made on the key distribution and the algorithm performs equally well in the presence of duplicate keys. Hence our approach always guarantees its performance, as long asn is greater thanp
3, which is the case of interest for sorting large relations. In addition, processors can be added incrementally.
Recommended by: Patrick Valduriez |
| |
Keywords: | Parallel sorting load balancing parallel database machines shared-nothing architectures |
本文献已被 SpringerLink 等数据库收录! |
|