Flame-MR: An event-driven architecture for MapReduce applications
Affiliation: 1. North China Electric Power University, Beijing, China; 2. Shanghai University of Electric Power, Shanghai, China; 1. Faculty of Computing and Information Technology, University of Jeddah, Saudi Arabia, 285, Dhahban, 23881, Jeddah, Saudi Arabia; 2. Department of Science and Technology University of the Faroe Islands, Denmark; 1. College of Applied Computer Science, King Saud University, Saudi Arabia; 2. Computer Science Department, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah 21589, Makkah, Saudi Arabia; 3. Information Technology Department, Faculty of Computing and Information Technology, King AbdulAziz University, 80200 Jeddah, Saudi Arabia; 4. College of Computer and Information Sciences, King Saud University, Saudi Arabia; 1. Instituto de Instrumentación para Imagen Molecular (I3M), Centro mixto CSIC – Universitat Politècnica de València – CIEMAT, camino de Vera s/n, 46022 Valencia, Spain; 2. Departamento de Construcciones Arquitectónicas, Universitat Politècnica de València, camino de Vera s/n, 46022 Valencia, Spain
Abstract: Nowadays, many organizations analyze their data with the MapReduce paradigm, most of them using the popular Apache Hadoop framework. As the volume of data managed by MapReduce applications steadily increases, so does the need to improve Hadoop performance. Existing modifications of Hadoop (e.g., Mellanox Unstructured Data Accelerator) attempt to improve performance by changing some of its underlying subsystems. However, they cannot always address all of its performance bottlenecks, or they hinder its portability. Furthermore, new frameworks like Apache Spark or DataMPI can achieve good performance improvements, but they do not maintain compatibility with existing MapReduce applications. This paper proposes Flame-MR, a new event-driven MapReduce architecture that increases Hadoop performance by avoiding memory copies and pipelining data movements, without modifying the source code of the applications. The performance evaluation on two representative systems (an HPC cluster and a public cloud platform) provides experimental evidence of significant performance gains, reducing the execution time by up to 54% on the Amazon EC2 cloud.
Keywords: Big Data; MapReduce; Hadoop; Event-driven architecture; Cloud computing
This article is indexed in ScienceDirect and other databases.