Low-latency cluster scheduling framework for large-scale short-time tasks
Cite this article: ZHAO Quan, TANG Xiaochun, ZHU Ziyu, MAO Anqi, LI Zhanhuai. Low-latency cluster scheduling framework for large-scale short-time tasks[J]. Journal of Computer Applications, 2021, 41(8): 2396-2405.
Authors: ZHAO Quan  TANG Xiaochun  ZHU Ziyu  MAO Anqi  LI Zhanhuai
Affiliation: School of Computer Science, Northwestern Polytechnical University, Xi'an, Shaanxi 710129, China
Funding: National Key Research and Development Program of China (2018YFB1003400).
Abstract: Large-scale data analysis environments often contain tasks with short durations and high degrees of parallelism. How to schedule these concurrent jobs under low-latency requirements is an active research topic. In existing cluster resource management frameworks, centralized schedulers cannot meet the low-latency requirement because the master node becomes a bottleneck, while some distributed schedulers achieve low-latency task scheduling but have shortcomings in optimal resource allocation and in handling resource allocation conflicts. Starting from the needs of large-scale real-time jobs, a distributed cluster resource scheduling framework was designed and implemented to meet the low-latency requirement of large-scale data processing. Firstly, a two-stage scheduling framework and an optimized two-stage multi-path scheduling framework were proposed. Secondly, to address the resource conflicts that arise in two-stage multi-path scheduling, a task transfer mechanism based on load balancing was proposed to resolve the load imbalance among computing nodes. Finally, the task scheduling framework for large-scale clusters was simulated and verified using a real workload and a simulated scheduler. For the real workload, the scheduling delay of the proposed framework is kept within 12% of that of ideal scheduling. In the simulated environment, the framework reduces the delay of short-time tasks by more than 40% compared with a centralized scheduler.
Keywords: low-latency  distributed scheduling  two-stage scheduling  load balancing  greedy scheduling
Received: 2020-10-12
Revised: 2020-12-11
This article is indexed by Wanfang Data and other databases.