Scheduling and optimization of multi-job execution in distributed environment

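Cite this article:	JI Hang-xu,JIANG Su,ZHAO Yu-hai,WU Gang,WANG Guo-ren.Scheduling and optimization of multi-job execution in distributed environment[J].Computer Engineering & Science,2021,43(6):951-961.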

Authors:	JI Hang-xu, JIANG Su, ZHAO Yu-hai, WU Gang, WANG Guo-ren

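Affiliation:	(1. School of Computer Science and Engineering, Northeastern University, Shenyang, Liaoning 110819; 2. School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081)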

Funding:	Key R&D Project of the Ministry of Science and Technology (2018YFB1004402)

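Abstract:	Distributed big data computing engines are indispensable tools for scientific research institutions, Internet companies, and government departments that process large-scale data. Their use and adoption have driven rapid development in many fields and contributed greatly to social progress. However, when processing multiple jobs, current mainstream big data computing engines still have many shortcomings in resource allocation and job scheduling: they usually divide memory resources equally among the jobs and schedule them in first-in-first-out (FIFO) order, and such a simple...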
Keywords:	distributed; job merging; clustering; polling scheduling; Flink
Received:	2020-10-03
Revised:	2020-12-30
Scheduling and optimization of multi-job execution in distributed environment

JI Hang-xu,JIANG Su,ZHAO Yu-hai,WU Gang,WANG Guo-ren.Scheduling and optimization of multi-job execution in distributed environment[J].Computer Engineering & Science,2021,43(6):951-961.

Authors:	JI Hang-xu, JIANG Su, ZHAO Yu-hai, WU Gang, WANG Guo-ren

Affiliation:	(1.School of Computer Science and Engineering,Northeastern University,Shenyang 110819; 2.School of Computer Science and Technology,Beijing Institute of Technology,Beijing 100081,China)

Abstract:	Distributed big data computing engines are indispensable tools for scientific research institutions, Internet companies, and government departments that process large-scale data. Their widespread adoption has driven rapid development in many fields and contributed greatly to social progress. However, when processing multiple jobs, current mainstream big data computing engines still have many shortcomings in resource allocation and job scheduling. They usually divide memory resources equally among the jobs and schedule them in first-in-first-out (FIFO) order; such a simple resource partitioning method and job scheduling mechanism cannot give full play to system performance. To address this problem, improvements are made at the job level of the computing engine: (1) for resource division, job features are extracted, the task amount of each job is estimated and compared with the job's pre-allocated resources, and jobs that waste a large share of cluster resources are merged so that computing resources are fully utilized; (2) for job scheduling, the features of the jobs in the job pool are extracted, the jobs are clustered with a multipath K-means algorithm, and a self-balancing polling scheduling algorithm then schedules the jobs based on the clustering results to achieve load balance. To verify the effectiveness of the proposed algorithms, comparative experiments were conducted in a distributed cluster environment using large-scale text data sets. The experimental results show that the proposed job merging algorithm and multi-job scheduling algorithm reduce job running time by 5% to 23%, improve system throughput by 7.5% to 29%, and reduce the number of threads started by 40% in the best case.
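
The cluster-then-poll idea in point (2) can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the abstract does not specify the job features, the multipath K-means variant, or the self-balancing rule. The sketch assumes a single scalar feature per job (an estimated load), swaps in plain Lloyd's k-means on scalars, and uses simple round-robin polling across the resulting clusters so that heavy and light jobs are interleaved. The names `kmeans_1d` and `polling_schedule` are hypothetical.

```python
from collections import deque


def kmeans_1d(values, k, iters=20):
    """Plain Lloyd's k-means on scalars (assumed stand-in, not multipath K-means)."""
    ordered = sorted(values)
    n = len(ordered)
    # Deterministic seeding: pick the midpoints of k equal-sized slices.
    centroids = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centroids[c]))
            groups[nearest].append(v)
        # Keep the old centroid when a cluster ends up empty.
        updated = [sum(g) / len(g) if g else centroids[i]
                   for i, g in enumerate(groups)]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def polling_schedule(jobs, k):
    """jobs maps job name -> estimated load; returns names in dispatch order."""
    centroids = kmeans_1d(list(jobs.values()), k)
    queues = [deque() for _ in range(k)]
    for name, load in jobs.items():
        nearest = min(range(k), key=lambda c: abs(load - centroids[c]))
        queues[nearest].append(name)
    order = []
    # Poll the clusters in turn, taking one job from each non-empty queue.
    while any(queues):
        for q in queues:
            if q:
                order.append(q.popleft())
    return order


if __name__ == "__main__":
    # Hypothetical job pool: three light jobs and two heavy ones.
    pool = {"a": 1, "b": 2, "c": 100, "d": 110, "e": 3}
    print(polling_schedule(pool, k=2))
```

Compared with FIFO, which would dispatch the jobs in arrival order, polling across clusters alternates between light and heavy jobs, which is the intuition behind using clustering results to balance load.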

Keywords:	distributed; job merging; clustering; polling scheduling; Flink
This article is indexed in databases including Wanfang Data.