
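Performance Prediction Model for Spark Based on Key Stages Analysis (基于关键阶段分析的Spark性能预测模型)
Cite this article: GE Qing-Bao, TAO Yao-Dong, GAO Cen, TIAN Yue, MENG Xiang-Ru. Performance Prediction Model for Spark Based on Key Stages Analysis[J]. Computer Systems & Applications, 2018, 27(8): 232-236.
Authors: GE Qing-Bao, TAO Yao-Dong, GAO Cen, TIAN Yue, MENG Xiang-Ru
Affiliations: University of Chinese Academy of Sciences, Beijing 100049, China; Shenyang Institute of Computing Technology, Chinese Academy of Sciences, Shenyang 110168, China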
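Abstract (translated from the Chinese): Spark is a widely used computing platform for big data processing, and allocating cluster resources sensibly plays an important role in optimizing the performance of Spark jobs. Performance prediction is the basis of, and the key to, optimizing cluster resource allocation, and on that basis this paper proposes a performance prediction model for Spark. Job execution time is chosen as the measure of Spark performance, and the concept of a key stage of a Spark job is introduced. By running small batches of data, the relationship between the running time of the key stages and the amount of job input data is obtained, and the prediction model is built from that relationship. Experimental results show that the model is fairly effective.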

Keywords: Spark; resource allocation; performance prediction; key stages
Received: 2017/12/8
Revised: 2018/1/4

Performance Prediction Model for Spark Based on Key Stages Analysis
GE Qing-Bao, TAO Yao-Dong, GAO Cen, TIAN Yue, MENG Xiang-Ru. Performance Prediction Model for Spark Based on Key Stages Analysis[J]. Computer Systems & Applications, 2018, 27(8): 232-236.
Authors: GE Qing-Bao, TAO Yao-Dong, GAO Cen, TIAN Yue, MENG Xiang-Ru
Affiliations: University of Chinese Academy of Sciences, Beijing 100049, China; Shenyang Institute of Computing Technology, Chinese Academy of Sciences, Shenyang 110168, China
Abstract: Spark is widely used as a computing platform for big data processing, and reasonable allocation of cluster resources plays an important role in optimizing the performance of Spark jobs. Performance prediction is the basis of, and the key to, optimizing cluster resource allocation, so this paper puts forward a Spark performance prediction model. The paper selects job execution time as the measure of Spark performance and puts forward the concept of a key stage of a Spark job. The model is then built by running small data sets and analyzing the relationship between the running time of the key stages and the amount of input data. The experimental results show that the model is effective.
Keywords: Spark; resource allocation; performance prediction; key stages
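
The modeling idea in the abstract (measure key-stage running times on small inputs, relate them to the input data size, then extrapolate) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the functional form, so a per-stage linear fit is assumed here; the job time is approximated as the sum of the key-stage times; and the stage names, function names, and measurements are all hypothetical.

```python
# Hypothetical sketch: the abstract does not specify the model form, so a
# per-stage linear fit (time = slope * size + intercept) is ASSUMED here.


def fit_line(sizes, times):
    """Ordinary least-squares fit of times = slope * sizes + intercept."""
    n = len(sizes)
    mean_x = sum(sizes) / n
    mean_y = sum(times) / n
    sxx = sum((x - mean_x) ** 2 for x in sizes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, times))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def build_model(sample_runs):
    """Fit one line per key stage from small-input sample runs.

    sample_runs maps a stage name to a list of (input_size_gb, seconds).
    """
    model = {}
    for stage, points in sample_runs.items():
        sizes = [p[0] for p in points]
        times = [p[1] for p in points]
        model[stage] = fit_line(sizes, times)
    return model


def predict_job_time(model, input_size_gb):
    """Estimate job time as the sum of predicted key-stage times (assumption)."""
    return sum(slope * input_size_gb + intercept
               for slope, intercept in model.values())


# Made-up measurements for illustration only; not data from the paper.
SAMPLE_RUNS = {
    "stage_map": [(1, 12.0), (2, 22.0), (4, 42.0)],
    "stage_reduce": [(1, 5.0), (2, 8.0), (4, 14.0)],
}

if __name__ == "__main__":
    model = build_model(SAMPLE_RUNS)
    print(predict_job_time(model, 10))
```

Fitting each key stage separately keeps the sketch close to the abstract's description, where the relationship is established per key stage rather than for the job as a whole. A stage whose time grows non-linearly with input size would need a different fitting function in place of `fit_line`.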