Apollo: Rapidly Picking the Optimal Cloud Configurations for Big Data Analytics Using a Data-Driven Approach
Authors: Yue-Wen Wu, Yuan-Jia Xu, Heng Wu, Lin-Gang Su, Wen-Bo Zhang, Hua Zhong
Affiliation: University of Chinese Academy of Sciences, Beijing 100049, China; State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing 100190, China
Abstract: Big data analytics applications are increasingly deployed on cloud computing infrastructures, yet picking the optimal cloud configurations in a cost-effective way remains a major challenge. In this paper, we address this problem with high accuracy and low overhead. We propose Apollo, a data-driven approach that rapidly picks the optimal cloud configurations by reusing data from similar workloads. We first classify 12 typical workloads in BigDataBench by characterizing pairwise correlations in our offline benchmarks. When a new workload arrives, we run it with several small datasets to rank its key characteristics and identify its similar workloads. Based on the rank, we then limit the search space of cloud configurations through a classification mechanism. Finally, we leverage a hierarchical regression model to measure which cluster is more suitable and use a local search strategy to pick the optimal cloud configurations in a few extra tests. Our evaluation on 12 typical workloads in HiBench shows that, compared with state-of-the-art approaches, Apollo can improve search accuracy by up to 30% while reducing the overhead of picking the optimal cloud configurations by as much as 50%.
Keywords: big data analytics; cloud configuration; data driven
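
The abstract's step of finding workloads similar to a new one can be illustrated with a minimal sketch. This is not Apollo's actual procedure, which the abstract does not specify in detail: it assumes each workload is summarized by a small vector of normalized resource-usage characteristics and that similarity is measured with the Pearson correlation. The workload names, the profile values, and the helpers `pearson` and `most_similar` are all hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / sqrt(var_x * var_y)


def most_similar(new_profile, known_profiles, k=2):
    """Return the names of the k known workloads most correlated with new_profile."""
    scored = [(pearson(new_profile, profile), name)
              for name, profile in known_profiles.items()]
    scored.sort(reverse=True)  # highest correlation first
    return [name for _, name in scored[:k]]


# Hypothetical offline profiles: normalized (CPU, memory, disk I/O, network) usage.
KNOWN_PROFILES = {
    "sort": [0.2, 0.4, 0.9, 0.6],
    "kmeans": [0.9, 0.5, 0.1, 0.2],
    "pagerank": [0.7, 0.8, 0.2, 0.5],
}

# Hypothetical profile measured by running a new workload on small datasets.
new_workload = [0.8, 0.5, 0.2, 0.2]

print(most_similar(new_workload, KNOWN_PROFILES, k=2))
```

In a pipeline like the one the abstract describes, the configurations that worked well for the returned workloads would then seed a narrower search space for the new workload.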
This article is indexed in Wanfang Data (万方数据) and other databases.
Source: Journal of Computer Science and Technology (《计算机科学技术学报》)