Investigating Apache Hama: a bulk synchronous parallel computing framework |
| |
Authors: | Kamran Siddique, Zahid Akhtar, Yangwoo Kim, Young-Sik Jeong, Edward J. Yoon |
| |
Affiliation: | 1. Dongguk University, Seoul, South Korea; 2. University of Quebec, Montreal, Canada; 3. Samsung Electronics, Seoul, South Korea |
| |
Abstract: | The quantity of digital data is growing exponentially, and the task of efficiently processing such massive data is becoming increasingly challenging. Recently, academia and industry have recognized the limitations of the predominant Hadoop framework in several application domains, such as complex algorithmic computation, graph, and streaming data. Unfortunately, this widely known map-shuffle-reduce paradigm has become a bottleneck in addressing the challenges of big data trends. The demand for research and development of novel massive computing frameworks is increasing rapidly, and systematic illustration, analysis, and highlights of potential research areas are vital and very much in demand by researchers in the field. Therefore, we explore one of the emerging and promising distributed computing frameworks, Apache Hama. This is a top-level project under the Apache Software Foundation and a pure bulk synchronous parallel model for processing massive scientific computations, e.g., graph, matrix, and network algorithms. The objectives of this contribution are twofold. First, we outline the current state of the art, distinguish the challenges, and frame some research directions for researchers and application developers. Second, we present real-world use cases of Apache Hama to illustrate its potential specifically to the industrial community. |
| |
Keywords: | |
This article is indexed in SpringerLink and other databases. |
|