Load-Adaptive Feedback Scheduling Strategy for Heterogeneous Hadoop Clusters
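Cite this article: 潘佳艺, 王芳, 杨静怡, 谭支鹏. 异构Hadoop集群下的负载自适应反馈调度策略[J]. 计算机工程与科学, 2017, 39(3): 413-423.
Authors: 潘佳艺  王芳  杨静怡  谭支鹏
Affiliation: (1. Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan 430074, Hubei; 2. School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, Hubei; 3. Key Laboratory of Information Storage System, Ministry of Education, Huazhong University of Science and Technology, Wuhan 430074, Hubei, China)
Funding: National High Technology Research and Development Program of China (863 Program) (2013AA013203)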
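Abstract: As Hadoop-based big data technology continues to develop and see deeper practical use, the unsuitability of the Hadoop YARN resource scheduling policy for heterogeneous clusters has become increasingly evident. On the one hand, node resources cannot be allocated dynamically, which wastes the computing resources of stronger nodes and leaves system performance underused. On the other hand, the existing static resource allocation policy does not account for how a job's needs differ across its execution stages, and so tends to produce a large amount of resource fragmentation. To address these problems, a load-adaptive scheduling strategy is proposed. It monitors performance information of the cluster's worker nodes and of submitted jobs, uses the real-time monitoring data to model and quantify each node's overall computing capability, and combines node and job performance information to run a dynamic resource scheduling scheme based on similarity evaluation in the scheduler. The optimized system can effectively identify differences in execution capability among cluster nodes and perform fine-grained dynamic resource scheduling according to the real-time demands of job tasks. While refining YARN's existing scheduling semantics, it can also be deployed as a sub-level resource scheduling scheme beneath an upper-level scheduler. The strategy was implemented and tested on Hadoop 2.0. Experimental results show that adaptive resource scheduling for jobs significantly improves resource utilization, raising cluster concurrency by a factor of 2 to 3 and improving time performance by nearly 10%.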

Keywords: heterogeneous cluster  monitoring  computing capability  dynamic scheduling  load adaptive
Received: 2016-09-03
Revised: 2017-03-25

A load adaptive feedback scheduling strategy for heterogeneous Hadoop cluster
PAN Jia-yi, WANG Fang, YANG Jing-yi, TAN Zhi-peng. A load adaptive feedback scheduling strategy for heterogeneous Hadoop cluster[J]. Computer Engineering & Science, 2017, 39(3): 413-423.
Authors: PAN Jia-yi  WANG Fang  YANG Jing-yi  TAN Zhi-peng
Affiliation: (1. Wuhan National Lab for Optoelectronics, Huazhong University of Science and Technology, Wuhan 430074; 2. School of Computer Science & Technology, Huazhong University of Science and Technology, Wuhan 430074; 3. Key Laboratory of Information Storage System, Ministry of Education, Huazhong University of Science and Technology, Wuhan 430074, China)
Abstract: As big data technology on the Hadoop platform develops and is put into wider practice, the Hadoop YARN (Yet Another Resource Negotiator) scheduler is increasingly ill-suited to heterogeneous clusters. On the one hand, YARN cannot allocate node resources dynamically, which wastes the resources of more capable nodes and limits overall system performance. On the other hand, YARN's existing static resource allocation policy ignores the differences between a job's execution stages, which produces a large number of resource fragments. To address these problems, we propose a load-adaptive feedback scheduling strategy. The system monitors the performance of all nodes and jobs and uses the real-time monitoring data to evaluate the computing power of each node. The scheduler then combines the node and job performance information to run a dynamic resource scheduling scheme based on similarity assessment. The optimized system can distinguish the capabilities of heterogeneous nodes and allocate resources dynamically according to tasks' real-time needs. It refines YARN's scheduling semantics and can also serve as a secondary resource scheduling strategy beneath an upper-level scheduler. We implement and test the strategy on Hadoop 2.0. The experimental results show that it significantly improves resource utilization, raises cluster concurrency by 2 to 3 times, and improves time performance by nearly 10%.
Keywords:heterogeneous cluster  monitor  computing power  dynamic scheduling  load adaptive  
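
The abstract describes two steps: quantifying each node's computing capability from monitoring data, and matching tasks to nodes by similarity. It does not say which model or similarity measure the authors use. The sketch below is only an illustration under stated assumptions: a weighted sum of normalized metrics stands in for the capability model, cosine similarity stands in for the similarity evaluation, and every name in it (`capability_score`, `cosine_similarity`, `pick_node`, the metric keys) is hypothetical rather than taken from the paper or from the YARN API.

```python
import math


def capability_score(metrics, weights):
    """Weighted sum of normalized node metrics (each assumed to lie in [0, 1]).

    Assumption: the paper's capability model is not given in the abstract;
    a linear combination is used here purely for illustration.
    """
    return sum(weights[key] * metrics[key] for key in weights)


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length resource vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def pick_node(task_demand, free_resources):
    """Return the node whose free-resource vector best matches the task.

    task_demand: tuple such as (cpu_cores, memory_gb, io_share).
    free_resources: dict mapping node name to a tuple in the same order.
    Nodes that cannot fit the demand in some dimension are skipped.
    Assumption: cosine similarity is a stand-in for the paper's measure.
    """
    best_node, best_similarity = None, -1.0
    for name, free in free_resources.items():
        if any(f < d for f, d in zip(free, task_demand)):
            continue  # not enough free resources on this node
        similarity = cosine_similarity(task_demand, free)
        if similarity > best_similarity:
            best_node, best_similarity = name, similarity
    return best_node


if __name__ == "__main__":
    nodes = {
        "cpu-heavy": (8, 4, 1),     # plenty of cores, modest memory
        "memory-heavy": (2, 8, 1),  # few cores, plenty of memory
        "small": (1, 1, 1),         # too small for the task below
    }
    memory_bound_task = (1, 4, 0.5)
    print(pick_node(memory_bound_task, nodes))
```

Matching on the shape of the demand vector rather than on raw free capacity is one way to limit the resource fragments the abstract mentions: a memory-bound task lands on the node whose spare resources are mostly memory, leaving spare cores elsewhere for CPU-bound work.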