ActiveSort: Efficient external sorting using active SSDs in the MapReduce framework
Affiliation:1. RAND Corporation, Boston, MA, USA;2. Boston Medical Center, Boston University School of Medicine, Boston, MA, USA
Abstract: In recent decades, the volume of data processed by data-intensive computing applications has grown explosively. As a result, handling I/O operations efficiently has become an important challenge. SSDs (solid state drives) are an effective solution: they improve I/O throughput and, through the concept of active SSDs, can also reduce the amount of I/O transfer. An active SSD takes over part of the data-processing work usually performed on the host. Offloading these tasks removes extra data transfer and improves overall data-processing performance. In this work, we propose ActiveSort, a novel mechanism that improves the external sorting algorithm using the concept of active SSDs. External sorting is used extensively in data-intensive computing frameworks such as Hadoop. By performing merge operations on-the-fly within the SSD, ActiveSort reduces the amount of I/O transfer and improves the performance of external sorting in Hadoop. Our evaluation results on a real SSD platform indicate that Hadoop applications using ActiveSort outperform the original Hadoop by up to 36.1%. ActiveSort also reduces the amount of data written by up to 40.4%, thereby improving the lifetime of the SSD.
Keywords: Data-intensive computing; MapReduce; External sorting; Solid state drives; In-storage processing
This article is indexed in ScienceDirect and other databases.
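
For readers unfamiliar with the external sorting that the abstract refers to, the following is a minimal host-side sketch of the conventional two-phase approach: sorted runs are written to storage, then merged with a k-way merge. It is an illustration only and is not ActiveSort's in-SSD implementation; all function names (`write_sorted_runs`, `merge_runs`, `external_sort`) and the one-integer-per-line run format are assumptions made for this example. In a conventional host-side flow the merged result is typically written back to storage and read again later, and that is the transfer the abstract says ActiveSort avoids by merging inside the SSD.

```python
import heapq
import os
import tempfile


def _flush_run(chunk, tmp_dir):
    """Sort one in-memory chunk and write it to its own run file."""
    chunk.sort()
    fd, path = tempfile.mkstemp(dir=tmp_dir, suffix=".run")
    with os.fdopen(fd, "w") as f:
        for rec in chunk:
            f.write(f"{rec}\n")
    return path


def write_sorted_runs(records, run_size, tmp_dir):
    """Phase 1: split the input into sorted runs of at most run_size records."""
    run_paths = []
    chunk = []
    for rec in records:
        chunk.append(rec)
        if len(chunk) == run_size:
            run_paths.append(_flush_run(chunk, tmp_dir))
            chunk = []
    if chunk:
        run_paths.append(_flush_run(chunk, tmp_dir))
    return run_paths


def _read_run(path):
    """Stream one run file back as integers, one record per line."""
    with open(path) as f:
        for line in f:
            yield int(line)


def merge_runs(run_paths):
    """Phase 2: lazily k-way merge the sorted runs into one sorted stream."""
    return heapq.merge(*(_read_run(p) for p in run_paths))


def external_sort(records, run_size):
    """Sort records while holding at most run_size of them in memory per run."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        runs = write_sorted_runs(records, run_size, tmp_dir)
        # Materialized here only so the example can return a list.
        return list(merge_runs(runs))
```

Calling `external_sort(data, run_size)` with a small `run_size` forces several run files and exercises the merge phase. Because `merge_runs` produces its output lazily, the merge can in principle happen at the moment the sorted data is read, which is the general idea behind merging on-the-fly.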