On-line prediction of application runtimes using schedule historical data

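Cite this article:	Xu Lunfan (许伦凡), Xiong Ming (熊敏), Xiao Yonghao (肖永浩). On-line prediction of application runtimes using schedule historical data[J]. Application Research of Computers (计算机应用研究), 2020, 37(3): 763-767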

Author names:	Xu Lunfan (许伦凡), Xiong Ming (熊敏), Xiao Yonghao (肖永浩)

Author affiliation:	Institute of Computer Application, China Academy of Engineering Physics, Mianyang, Sichuan 621900 (all three authors)

Funding:	Supported by the National Key R&D Program of China

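Abstract (translated from the Chinese):	Runtimes estimated by users are usually inaccurate. This work combines classification with instance-based learning, and uses template similarity together with numerical similarity to find jobs in the scheduling history that are similar to the current job; the historical information of those jobs is then used to predict the runtime of the current job. Training and prediction use the following attributes from the scheduling history: user name, group name, queue name, application name, requested number of processors, requested (estimated) runtime, and requested memory. The parameters of the algorithm are determined by a genetic algorithm. Numerical experiments show that, compared with the existing literature, the method uses fewer parameters, reaches an underestimate rate close to the published results, and obtains a lower mean absolute error. On the HPC2N04 and HPC2N05 log datasets, the mean absolute error is reduced by 43% and 77%, respectively. The effect on job scheduling of replacing user estimates with online predictions is also studied; the results are given a preliminary analysis and directions for future improvement are pointed out.
Keywords (translated from the Chinese):	runtime prediction; job scheduling; genetic algorithm; K-nearest neighbor
Received:	2018-08-17
Revised:	2020-02-07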
On-line prediction of application runtimes using schedule historical data

Xu Lunfan, Xiong Ming and Xiao Yonghao. On-line prediction of application runtimes using schedule historical data[J]. Application Research of Computers, 2020, 37(3): 763-767

Authors:	Xu Lunfan, Xiong Ming, Xiao Yonghao

Affiliation:	Institute of Computer Application, China Academy of Engineering Physics, Mianyang, Sichuan 621900

Abstract:	Runtime estimates supplied by users are usually inaccurate. This paper combines classification with instance-based learning, using both template similarity and numerical similarity to find jobs in the scheduling history that resemble the current job, and uses the historical data of those similar jobs to predict the current job's runtime. Only seven job attributes are taken into account: user name, group name, queue name, application name, requested number of processors, requested runtime, and requested memory. The parameters of the algorithm are determined with a genetic algorithm. Experimental results show that, compared with the existing method, the proposed method achieves a similar underestimate rate while using fewer parameters, and obtains a lower mean absolute error. On the HPC2N04 and HPC2N05 datasets, the mean absolute error is reduced by 43% and 77%, respectively. The paper also studies how replacing user estimates with online predictions affects job scheduling, gives a preliminary analysis of the results, and points out directions for future improvement.

Keywords:	application runtime prediction; job scheduling; genetic algorithm; K-nearest neighbor
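
The abstract names the ingredients of the method (template similarity, numerical similarity, K-nearest neighbors, and two evaluation metrics) but not its details. The following is a minimal Python sketch of how such a predictor could be put together, not the authors' implementation. The `Job` class, the choice of `TEMPLATE` attributes, the relative-difference distance, the weights, and the fallback to the user estimate are all assumptions made for illustration. In the paper the parameters are tuned by a genetic algorithm, which is not shown here.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Job:
    """One scheduling record with the seven attributes listed in the abstract."""
    user: str
    group: str
    queue: str
    app: str
    req_procs: int
    req_time: float       # user-requested (estimated) runtime, seconds
    req_mem: float
    runtime: float = 0.0  # actual runtime, known only for finished jobs


# Assumption: categorical attributes that must match exactly (the "template").
TEMPLATE = ("user", "app")
# Assumption: numeric attributes and weights; the paper tunes its parameters
# with a genetic algorithm, here they are fixed by hand.
NUMERIC_WEIGHTS = {"req_procs": 1.0, "req_time": 1.0, "req_mem": 1.0}


def distance(a: Job, b: Job) -> float:
    """Weighted sum of relative differences over the numeric attributes."""
    total = 0.0
    for name, weight in NUMERIC_WEIGHTS.items():
        x, y = getattr(a, name), getattr(b, name)
        scale = max(abs(x), abs(y))
        if scale > 0:
            total += weight * abs(x - y) / scale
    return total


def predict_runtime(job: Job, history: list, k: int = 3) -> float:
    """Filter the history by template, then average the k numerically nearest jobs."""
    candidates = [
        h for h in history
        if all(getattr(h, f) == getattr(job, f) for f in TEMPLATE)
    ]
    if not candidates:
        return job.req_time  # assumption: fall back to the user's own estimate
    nearest = sorted(candidates, key=lambda h: distance(job, h))[:k]
    return mean(h.runtime for h in nearest)


def mean_absolute_error(predicted, actual) -> float:
    """Average of |prediction - actual runtime|."""
    return mean(abs(p - a) for p, a in zip(predicted, actual))


def underestimate_rate(predicted, actual) -> float:
    """Fraction of jobs whose predicted runtime is below the actual runtime."""
    return sum(p < a for p, a in zip(predicted, actual)) / len(actual)


history = [
    Job("alice", "g1", "batch", "cfd", 16, 3600.0, 8.0, runtime=1000.0),
    Job("alice", "g1", "batch", "cfd", 16, 3600.0, 8.0, runtime=1200.0),
    Job("alice", "g1", "batch", "cfd", 64, 7200.0, 32.0, runtime=5000.0),
    Job("bob", "g2", "batch", "md", 16, 3600.0, 8.0, runtime=300.0),
]
known = Job("alice", "g1", "batch", "cfd", 16, 3600.0, 8.0)
unknown = Job("carol", "g3", "batch", "fft", 4, 500.0, 2.0)
predictions = [predict_runtime(known, history, k=2),
               predict_runtime(unknown, history, k=2)]
```

The underestimate rate is tracked separately from the mean absolute error because, in many batch schedulers, a job that runs past its predicted runtime risks being terminated, so underestimates are typically more costly than overestimates.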