OPTIMA: On-Line Partitioning Skew Mitigation for MapReduce with Resource Adjustment |
| |
Authors: | Zhihong Liu, Qi Zhang, Raouf Boutaba, Yaping Liu, Baosheng Wang |
| |
Affiliation: | 1. College of Computer, National University of Defense Technology, Changsha, China; 2. David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, Canada; 3. Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Changsha, China |
| |
Abstract: | Partitioning skew has been shown to be a major issue that can significantly prolong the execution time of MapReduce jobs. Most existing off-line heuristics for partitioning skew mitigation are inefficient because they must wait for all map tasks to complete. Some solutions tackle the problem on-line, but impose additional overhead by repartitioning the workload of overloaded tasks. In this paper, we present OPTIMA, an on-line partitioning skew mitigation technique for MapReduce. OPTIMA predicts the workload distribution of reduce tasks at run-time, uses deviation detection to identify overloaded tasks, and proactively adjusts the resource allocation of these tasks to reduce their execution time. We provide an upper bound on OPTIMA's time complexity while allowing it to run fully on-line. Through experiments with both real and synthetic workloads on an 11-node Hadoop cluster, we observe that OPTIMA effectively mitigates partitioning skew and improves job completion time by up to 36.73%. |
| |
Keywords: | |
This article is indexed in SpringerLink and other databases.
|
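
The pipeline outlined in the abstract (predict reduce-task workloads, flag the tasks that deviate, then give those tasks more resources) can be illustrated with a small sketch. The abstract does not state the detection rule or the allocation policy, so everything below is an assumption made for illustration: the threshold of mean plus `k` standard deviations, the proportional scaling, and the names `detect_overloaded`, `adjust_resources`, `base_units` and `max_units` are hypothetical and are not taken from OPTIMA.

```python
from statistics import mean, pstdev


def detect_overloaded(predicted_loads, k=1.0):
    """Return indices of reduce tasks whose predicted load exceeds
    mean + k * standard deviation (an assumed detection rule)."""
    mu = mean(predicted_loads)
    sigma = pstdev(predicted_loads)
    threshold = mu + k * sigma
    return [i for i, load in enumerate(predicted_loads) if load > threshold]


def adjust_resources(predicted_loads, base_units=1.0, k=1.0, max_units=4.0):
    """Give each overloaded task resources proportional to its load relative
    to the mean, capped at max_units (an assumed allocation policy).
    All other tasks keep base_units."""
    mu = mean(predicted_loads)
    overloaded = set(detect_overloaded(predicted_loads, k))
    allocation = []
    for i, load in enumerate(predicted_loads):
        if i in overloaded and mu > 0:
            units = min(max_units, base_units * load / mu)
        else:
            units = base_units
        allocation.append(units)
    return allocation


if __name__ == "__main__":
    # Hypothetical predicted input sizes (in MB) for five reduce tasks.
    loads = [100, 110, 90, 105, 400]
    print("overloaded tasks:", detect_overloaded(loads))
    print("resource units:", adjust_resources(loads))
```

In this sketch only the fifth task crosses the threshold, so it alone receives a larger allocation and no data is repartitioned, which matches the abstract's contrast with repartitioning-based on-line approaches.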