Similar Documents
A total of 20 similar documents were retrieved (search time: 140 ms).
1.
Classical job scheduling algorithms are simple to implement and efficient to execute in cluster applications, but in heterogeneous clusters they lack dynamic feedback on the runtime state of online nodes and lack load-balancing capability, which lowers computing resource utilization and system throughput. To address this, a job load-balancing scheduling algorithm based on host performance measurement (HPM) is designed for heterogeneous cluster environments. The algorithm collects the state information and job response times of online nodes to select a set of trusted nodes, computes an HPM value for each trusted node, applies load-balancing rules to generate a candidate set of nodes for job assignment, and finally dispatches jobs to compute nodes according to predefined priority rules while updating each node's runtime state. Experimental results show that when scheduling jobs of the same type in a heterogeneous cluster, the algorithm outperforms classical algorithms in total completion time and load-balancing metrics.
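
Below is a minimal sketch (Python, for illustration only) of this kind of host-performance-based dispatching; the HPM weighting of CPU load, memory load and observed response time, the trust threshold, and the node fields are hypothetical choices, not the authors' actual metric.

    from dataclasses import dataclass

    @dataclass
    class Node:
        name: str
        cpu_load: float        # 0.0 (idle) .. 1.0 (saturated)
        mem_load: float        # 0.0 .. 1.0
        avg_response_s: float  # observed job response time
        online: bool = True

    def hpm(node, w_cpu=0.4, w_mem=0.3, w_resp=0.3, resp_cap=60.0):
        """Hypothetical host-performance metric: higher means more spare capacity."""
        resp_penalty = min(node.avg_response_s / resp_cap, 1.0)
        return (w_cpu * (1 - node.cpu_load)
                + w_mem * (1 - node.mem_load)
                + w_resp * (1 - resp_penalty))

    def dispatch(jobs, nodes, trust_threshold=0.2):
        """Send each job to the trusted node with the highest HPM, then update its load."""
        plan = {}
        trusted = [n for n in nodes if n.online and hpm(n) >= trust_threshold]
        for job in sorted(jobs, key=lambda j: j["priority"]):  # lower value = higher priority
            best = max(trusted, key=hpm)
            plan[job["id"]] = best.name
            best.cpu_load = min(1.0, best.cpu_load + job["cpu_demand"])  # feedback: update node state
        return plan

    nodes = [Node("n1", 0.2, 0.3, 5.0), Node("n2", 0.7, 0.6, 20.0)]
    jobs = [{"id": "j1", "priority": 0, "cpu_demand": 0.1},
            {"id": "j2", "priority": 1, "cpu_demand": 0.2}]
    print(dispatch(jobs, nodes))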

2.
A dynamic real-time divisible-load scheduling method is proposed for heterogeneous bus networks. First, based on the optimality principle of divisible-load scheduling, the optimal order in which load should be distributed to processors and the number of processors participating in the computation are analyzed. Then, a dynamic load distribution algorithm is proposed for the deadline constraints of real-time tasks; the algorithm uses the minimum number of processors in the network while guaranteeing that each real-time task completes before its deadline. Both theoretical analysis and simulation verify the effectiveness of the proposed algorithm.
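
As a rough illustration of the divisible-load idea, the sketch below splits a load so that the chosen processors finish simultaneously and adds processors, fastest first, until a deadline is met; it ignores the bus communication delays that the paper's optimality analysis accounts for, so it is only a simplified approximation.

    def divisible_load_plan(total_load, speeds, deadline):
        """speeds[i]: units of load processor i computes per second (communication ignored).
        Returns (fractions, finish_time) using the fewest processors that meet the deadline,
        or None if even all processors together are too slow."""
        order = sorted(range(len(speeds)), key=lambda i: speeds[i], reverse=True)  # fastest first
        used = []
        for i in order:
            used.append(i)
            cap = sum(speeds[j] for j in used)
            finish = total_load / cap
            if finish <= deadline:
                # equal finish time => each processor gets a share proportional to its speed
                fractions = {j: speeds[j] / cap for j in used}
                return fractions, finish
        return None

    print(divisible_load_plan(100.0, [5.0, 3.0, 2.0], deadline=12.0))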

3.
刘芳芳  杨超  袁欣辉  吴长茂  敖玉龙 《软件学报》2018,29(12):3921-3932
Sunway TaihuLight, the first supercomputer in the world with a peak performance exceeding 100 PFlops, has been completed. It uses the domestically developed Sunway (ShenWei) heterogeneous many-core processor, which differs from existing pure-CPU, CPU-MIC and CPU-GPU architectures by adopting a master-slave core design; a single processor delivers a peak of 3 TFlops with a memory bandwidth of 130 GB/s. Sparse matrix-vector multiplication (SpMV) is a very important kernel in scientific and engineering computing; it is well known to be bandwidth-bound and to involve indirect memory accesses, so the Sunway processor poses great challenges for an efficient SpMV implementation. This paper proposes a general heterogeneous many-core parallel algorithm for CSR-format SpMV on the Sunway processor. The algorithm is carefully designed in terms of task partitioning and LDM (local data memory) space partitioning, introduces a dynamic/static buffer caching scheme to improve the hit rate of accesses to the vector x, and a dynamic/static task scheduling scheme to achieve load balance. Several key factors affecting SpMV performance are analyzed, and adaptive optimization further improves performance. Tests on 16 representative sparse matrices from the Matrix Market collection show up to roughly 10x speedup over the master-core-only version, with an average speedup of 6.51. Based on the memory traffic of the master-core CSR SpMV, the tested matrices reach up to 86% of the processor's measured bandwidth, and 47% on average.
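
For reference, a plain sequential CSR sparse matrix-vector multiply in its textbook form; this is only the baseline kernel that the paper parallelizes across slave cores with LDM buffering and dynamic/static task scheduling, not the authors' Sunway implementation.

    import numpy as np

    def spmv_csr(values, col_idx, row_ptr, x):
        """y = A @ x for a CSR matrix: values/col_idx hold nonzeros, row_ptr marks row boundaries."""
        n_rows = len(row_ptr) - 1
        y = np.zeros(n_rows)
        for i in range(n_rows):
            acc = 0.0
            for k in range(row_ptr[i], row_ptr[i + 1]):
                acc += values[k] * x[col_idx[k]]  # indirect access to x: the bandwidth-bound part
            y[i] = acc
        return y

    # 3x3 example: [[4, 0, 1], [0, 2, 0], [3, 0, 5]]
    values  = np.array([4.0, 1.0, 2.0, 3.0, 5.0])
    col_idx = np.array([0, 2, 1, 0, 2])
    row_ptr = np.array([0, 2, 3, 5])
    print(spmv_csr(values, col_idx, row_ptr, np.array([1.0, 1.0, 1.0])))  # -> [5. 2. 8.]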

4.
甘婕  舒坦  石慧  赵春晓 《控制与决策》2024,39(3):1003-1011
In production scheduling, a machine processes different jobs under different loads, i.e., heterogeneous loads, and as a result its degradation rate differs from job to job. This affects the arrangement of production schedules and maintenance plans, leading to idle resources and increased time costs. To address this problem, a joint strategy of single-machine scheduling and predictive maintenance under heterogeneous loads is proposed, and an integrated model is built with the objective of minimizing the total weighted expected completion time. For a machine affected by heterogeneous loads during single-machine scheduling, a Wiener-process-based degradation model is established, and the cumulative distribution function of the machine's remaining useful life is derived from its degradation law. Numerical experiments compare the optimization results of the integrated model under heterogeneous load and under average load, demonstrating the necessity of considering heterogeneous loads in the integrated model, and parameter sensitivity analysis verifies the effectiveness of the proposed model.
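
As a worked illustration of the remaining-life calculation under a Wiener degradation model, the snippet below evaluates the standard first-passage-time (inverse Gaussian) CDF for a linear-drift Wiener process crossing a failure threshold; the drift, diffusion and threshold values are made up, and job-dependent (heterogeneous-load) effects would enter by changing the drift per job. This is a textbook sketch, not the paper's model.

    import numpy as np
    from scipy.stats import norm

    def rul_cdf(t, drift, sigma, threshold):
        """P(remaining useful life <= t) when degradation X(t) = drift*t + sigma*W(t)
        first reaches 'threshold' (first-passage time of a Wiener process, inverse Gaussian CDF)."""
        t = np.asarray(t, dtype=float)
        s = sigma * np.sqrt(t)
        return (norm.cdf((drift * t - threshold) / s)
                + np.exp(2.0 * drift * threshold / sigma**2)
                * norm.cdf(-(drift * t + threshold) / s))

    # e.g. drift 1.0 per hour under some job's load, sigma 2.0, 10 units of margin to failure
    print(rul_cdf([5.0, 10.0, 20.0], drift=1.0, sigma=2.0, threshold=10.0))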

5.
To coordinate the fair sharing of heterogeneous resources among multiple users in grid computing and to satisfy different user requirements, this paper proposes an ECT-based, priority-constrained job scheduling strategy. The strategy takes the expected completion time of each job into account and assigns priority levels to users of different classes, so that jobs of high-priority users are executed first, the vast majority of jobs complete within their expected completion times, and the utilization of the various resources is balanced. The strategy resolves conflict-free resource sharing among different classes of users in a grid environment, improves user satisfaction, and achieves a reasonable match between jobs and heterogeneous resources.
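
A minimal sketch of priority-then-ECT ordering, assuming each job carries a user priority level and an expected completion time per resource; the two-level ordering and the greedy resource choice are illustrative, not the paper's exact policy.

    def schedule_ect(jobs, resources):
        """jobs: dicts with 'id', 'priority' (lower = more important) and
        'ect' (mapping resource -> expected completion time).
        Greedily assigns each job, high-priority first, to the resource that finishes it
        earliest, accounting for work already queued on that resource."""
        busy_until = {r: 0.0 for r in resources}
        assignment = {}
        for job in sorted(jobs, key=lambda j: j["priority"]):
            best = min(resources, key=lambda r: busy_until[r] + job["ect"][r])
            assignment[job["id"]] = best
            busy_until[best] += job["ect"][best]
        return assignment

    jobs = [
        {"id": "a", "priority": 0, "ect": {"r1": 4.0, "r2": 6.0}},
        {"id": "b", "priority": 1, "ect": {"r1": 3.0, "r2": 2.0}},
    ]
    print(schedule_ect(jobs, ["r1", "r2"]))  # e.g. {'a': 'r1', 'b': 'r2'}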

6.
With the spread of cloud computing, a large share of data processing is now done with cloud services. Existing algorithms rarely account for the differing computing capabilities of virtual machines in heterogeneous systems, so some tasks wait too long. An algorithm that adjusts virtual machine load in real time is proposed. Considering the resource virtualization characteristics of cloud computing, a method for evaluating virtual machine computing capability is given. Based on each virtual machine's capability and its runtime state changes, the amount of work assigned to it is adjusted adaptively to meet real-time requirements. Task scheduling coordinates task completion times, keeps the load on the virtual machines dynamically balanced, shortens the total execution time of long jobs, and improves system throughput, overall service capability and efficiency. Experimental results show that the algorithm adaptively adjusts task volume and schedules tasks so as to maintain virtual machine load balance.
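
As a rough illustration of capability-proportional task sizing, the snippet below gives each virtual machine a share of the pending tasks in proportion to an estimated capability score discounted by its current load; the scoring formula and the fields are assumptions, not the paper's evaluation method.

    def assign_shares(n_tasks, vms):
        """vms: dicts with 'name', 'capability' (relative speed) and 'load' (0..1 busy fraction).
        Give each VM a task count proportional to its currently usable capability."""
        usable = {vm["name"]: vm["capability"] * (1.0 - vm["load"]) for vm in vms}
        total = sum(usable.values())
        shares = {name: int(n_tasks * u / total) for name, u in usable.items()}
        # hand any rounding leftovers to the most capable VM
        leftover = n_tasks - sum(shares.values())
        best = max(usable, key=usable.get)
        shares[best] += leftover
        return shares

    vms = [{"name": "vm1", "capability": 4.0, "load": 0.5},
           {"name": "vm2", "capability": 2.0, "load": 0.0}]
    print(assign_shares(10, vms))  # equal shares here: both have usable capability 2.0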

7.
General-purpose GPUs used in high-performance computing offer massive parallel computing power, and data analysis systems that use GPUs as the main processors can deliver better performance than traditional databases. In big-data scenarios, how to distribute the workload between processors according to the resources of the CPU and GPU is an urgent problem. This paper proposes a load-balancing strategy for CPU-GPU heterogeneous data analysis systems. The strategy decomposes the workload with a pipeline model, builds a load-balancing model on top of the pipeline, assigns the workload to the heterogeneous processors appropriately, and reduces the total execution time of the system, thereby improving performance. Experimental results show that the pipeline-based load-balancing model adapts to different data volumes under different query requests and performs well.

8.
To remedy the shortcomings of the LATE algorithm in heterogeneous environments when selecting backup tasks and execution nodes, an improved IR-LATE scheduling algorithm is proposed. The algorithm launches backups for the slow tasks that most need them, namely those with the longest estimated remaining completion time, classifies nodes by load, and, combined with round-robin, assigns backup tasks to the node with the smallest load and a high success/load ratio. Experimental results show that, compared with LATE, the algorithm shortens job completion time by about 30%, improves execution efficiency, and promotes load balance in the system.
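
A toy sketch of the selection rule described above: pick the slow task with the longest estimated remaining time for backup, and place the backup on the least-loaded node among those with a high success/load ratio; the progress-rate estimate and the threshold are illustrative, not the exact IR-LATE formulas.

    def pick_backup(tasks, now):
        """tasks: dicts with 'id', 'progress' (0..1) and 'start'.
        Estimate remaining time from the observed progress rate; return the slowest task."""
        def remaining(t):
            elapsed = now - t["start"]
            rate = t["progress"] / elapsed if elapsed > 0 else float("inf")
            return (1.0 - t["progress"]) / rate if rate > 0 else float("inf")
        return max(tasks, key=remaining)

    def pick_node(nodes, min_ratio=1.0):
        """nodes: dicts with 'name', 'load' and 'successes'.
        Prefer nodes with a high success/load ratio, then take the least-loaded one."""
        good = [n for n in nodes if n["successes"] / max(n["load"], 1e-9) >= min_ratio]
        pool = good or nodes
        return min(pool, key=lambda n: n["load"])

    tasks = [{"id": "t1", "progress": 0.8, "start": 0.0},
             {"id": "t2", "progress": 0.2, "start": 0.0}]
    nodes = [{"name": "n1", "load": 2.0, "successes": 5},
             {"name": "n2", "load": 0.5, "successes": 1}]
    print(pick_backup(tasks, now=10.0)["id"], pick_node(nodes)["name"])  # t2 n2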

9.
Improving the efficiency of heterogeneous HPL (High-Performance Linpack) requires fully exploiting both the accelerators and the general-purpose CPUs: the accelerators integrate more compute cores and carry the bulk of the computation, while the CPUs handle task scheduling and also take part in the computation. Given a reasonable task partition and a balanced load, optimizing CPU-side computation is particularly important for overall efficiency. Optimizing the BLAS (Basic Linear Algebra Subprograms) routines for the specific platform architecture usually makes fuller use of the CPUs and improves overall system efficiency. BLIS (BLAS-like Library Instantiation Software) is an open-source BLAS framework that is easy to develop with, easy to port, and modular. Based on the architecture of the heterogeneous platform and the characteristics of the HPL algorithm, the BLAS routines called on the CPU side are optimized at every level using the L3 cache, vector instructions and multithreaded parallelism, and auto-tuning is applied to optimize the matrix blocking parameters, resulting in the HygonBLIS library. Compared with MKL, overall HPL performance in the heterogeneous environment is improved by 11.8%.

10.
Improving the efficiency of heterogeneous HPL (High-Performance Linpack) requires fully exploiting both the accelerators and the general-purpose CPUs: the accelerators integrate more compute cores and carry the bulk of the computation, while the CPUs handle task scheduling and also take part in the computation. Given a reasonable task partition and a balanced load, optimizing CPU-side computation is particularly important for overall efficiency. Optimizing the BLAS (Basic Linear Algebra Subprograms) routines for the specific platform architecture usually makes fuller use of the CPUs and improves overall system efficiency. BLIS (BLAS-like Library Instantiation Software) is an open-source BLAS framework that is easy to develop with, easy to port, and modular. This paper, based on the architecture of the heterogeneous platform and the characteristics of the HPL algorithm, optimizes the BLAS routines called on the CPU side at every level using the L3 cache, vector instructions and multithreaded parallelism, and applies auto-tuning to optimize the matrix blocking parameters, resulting in the HygonBLIS library. Compared with MKL, overall HPL performance in the heterogeneous environment is improved by 11.8%.

11.
In enterprise grid computing environments, users have access to multiple resources that may be distributed geographically. Resource allocation and scheduling is therefore a fundamental issue in achieving high performance on enterprise grids. Most current job scheduling systems for enterprise grid computing provide batch queuing support and focus solely on the allocation of processors to jobs. However, since I/O is also a critical resource for many jobs, the allocation of processor and I/O resources must be coordinated to allow the system to operate most effectively. To this end, we present a hierarchical scheduling policy that pays special attention to the I/O and service demands of parallel jobs in homogeneous and heterogeneous systems with background workload. The performance of the proposed scheduling policy is studied under various system and workload parameters through simulation. We also compare the performance of the proposed policy with a static space-time sharing policy. The results show that the proposed policy performs substantially better than the static space-time sharing policy.

12.
Models for two processor sharing policies, called task scheduling processor sharing and job scheduling processor sharing, are developed and analyzed. The first policy schedules each task independently and allows parallel execution of an individual program, whereas the second policy schedules each job as a unit, thereby not allowing parallel execution of an individual program. It is found that task scheduling performs better than job scheduling for most system parameter values. The performance of task scheduling processor sharing is compared to a first-come-first-served policy. First-come-first-served performs better than processor sharing over a wide range of system parameters; processor sharing performs best when task service time variability is high. The performance of processor sharing and first-come-first-served is also studied with two classes of jobs, and for the case in which a specific number of processors is statically assigned to each class.

13.
This paper is concerned with the design of online scheduling algorithms that exploit extra resources. In particular, it studies how to make use of multiple processors to counteract the lack of future information in online deadline scheduling. Our results extend previous work that is primarily based on using a faster processor to obtain a performance guarantee. The challenge arises from the fact that jobs are sequential in nature and cannot be executed on more than one processor at the same time; thus, a faster processor can speed up a job while multiple unit-speed processors cannot.

14.
Data analysis plays a major role in research applications that require large volumes of data. Cloud computing can provide computer processing resources and device-to-device data sharing based on user requirements. The main goal of cloud computing is to allow users and enterprises of varying capabilities to store and process data efficiently and to access and distribute resources. However, a crucial problem in cloud computing is job scheduling for numerous users. Prior to job scheduling, jobs must be categorized according to their degree of criticalness, privacy and required time. Based on the experimental results, the combination of tasks was successfully determined by the processor. In heterogeneous multiprocessor systems, customized job scheduling is highly critical for obtaining optimal job performance. In this paper, an evolutionary genetic algorithm is used to obtain better job schedules, thereby improving performance of the cloud system. The genetic algorithm-based job scheduling process minimizes the time invested through effective allocation of user requests, enhancing the overall efficiency of the system.
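
A compact genetic-algorithm sketch for mapping jobs to machines so as to minimize makespan; the encoding (one gene per job holding a machine index), the population size, and the mutation rate are generic illustrative choices, not the parameters used in the paper.

    import random

    def makespan(assign, job_times, n_machines):
        """Finish time of the busiest machine for a job -> machine assignment."""
        load = [0.0] * n_machines
        for job, m in enumerate(assign):
            load[m] += job_times[job]
        return max(load)

    def ga_schedule(job_times, n_machines, pop_size=30, generations=200, mut_rate=0.1, seed=1):
        rng = random.Random(seed)
        n_jobs = len(job_times)
        pop = [[rng.randrange(n_machines) for _ in range(n_jobs)] for _ in range(pop_size)]
        for _ in range(generations):
            pop.sort(key=lambda ind: makespan(ind, job_times, n_machines))
            survivors = pop[: pop_size // 2]              # elitist selection
            children = []
            while len(survivors) + len(children) < pop_size:
                a, b = rng.sample(survivors, 2)
                cut = rng.randrange(1, n_jobs)            # one-point crossover
                child = a[:cut] + b[cut:]
                for g in range(n_jobs):                   # random-reset mutation
                    if rng.random() < mut_rate:
                        child[g] = rng.randrange(n_machines)
                children.append(child)
            pop = survivors + children
        best = min(pop, key=lambda ind: makespan(ind, job_times, n_machines))
        return best, makespan(best, job_times, n_machines)

    print(ga_schedule([4, 2, 7, 3, 5, 1], n_machines=2))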

15.
Parallel task scheduling in network cluster computing systems
Based on a multiprocessor parallel-task scheduling model, this paper studies parallel task scheduling in network cluster computing systems. It first proves the approximation hardness of scheduling algorithms for general network cluster computing systems, and then proposes three heuristic algorithms: largest-length-first, largest-width-first and largest-area-first scheduling. Extensive simulation experiments compare these heuristics with scheduling algorithms previously proposed in the literature, and the results show that the proposed heuristics perform better.

16.
Two processors jointly provide a real-time service which can be completed by exactly one processor. Assuming each processor is allowed to announce only one bit of information in a distributed way to decide which one should process the job, some jobs will inevitably be lost if only classical resources are used. In this paper, we propose the distributed quantum entanglement sharing (DQES) model to share quantum entanglement among the processors. Assisted by the DQES model, not only can system dependability be enhanced, but the faulty processor can also be identified. We also present some possible applications, such as database consistency, job scheduling, system dependability, and reliable communication protocols.

17.
A main objective of scheduling independent jobs composed of multiple sequential tasks in shared-memory and distributed-memory multiprocessor computer systems is the assignment of these tasks to processors in a manner that ensures efficient operation of the system. Achieving this objective requires the analysis of a fundamental tradeoff between maximizing parallel execution, suggesting that the tasks of a job be spread across all system processors, and minimizing synchronization and communication overheads, suggesting that the job's tasks be executed on a single processor. The authors consider a class of scheduling policies that represent the essential aspects of this processor allocation tradeoff, and model the system as a distributed fork-join queueing system. They derive an approximation for the expected job response time, which includes the important effects of various parallel processing overheads (such as task synchronization and communication) induced by the processor allocation policy.

18.
Scheduling a batch processing machine with incompatible job families
The problem of scheduling batch processors is important in some industries and, at a more fundamental level, captures an element of complexity common to many practical scheduling problems. We describe a branch and bound procedure applicable to a batch processor model with incompatible job families. Jobs in a given family have identical job processing times, arbitrary job weights, and arbitrary job sizes. Batches are limited to jobs from the same family. The scheduling objective is to minimize total weighted completion time. We find that the procedure returns optimal solutions to problems of up to about 25 jobs in reasonable CPU time, and can be adapted for use as a heuristic for larger problems.

19.
A loosely coupled multiprocessor system contains multiple processors that have their own local memories. Balancing the load among the processors is of fundamental importance in enhancing the performance of such a system. Probabilistic load balancing in a heterogeneous multiprocessor system with many job classes is considered in this study. The load balancing scheme is formulated as a nonlinear programming problem with linear constraints, and an optimal probabilistic load balancing algorithm is proposed to solve it. The proposed load balancing method is proven globally optimal in the sense that it results in a minimum overall average job response time on a probabilistic basis.
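
A small sketch of this kind of probabilistic formulation: choose routing probabilities for a single job class to minimize mean response time over heterogeneous servers, here modeled with M/M/1 delays and solved as a constrained nonlinear program; the single-class M/M/1 model and the SLSQP solver are simplifying assumptions, not the paper's multi-class formulation or algorithm.

    import numpy as np
    from scipy.optimize import minimize

    def mean_response_time(p, arrival_rate, service_rates):
        """Mean M/M/1 response time when a fraction p[i] of jobs is routed to server i."""
        p = np.asarray(p)
        lam = arrival_rate * p
        slack = service_rates - lam
        if np.any(slack <= 0):
            return 1e6                       # penalize routing that overloads any server
        return float(np.sum(p / slack))

    def balance(arrival_rate, service_rates):
        """Routing probabilities minimizing mean response time (nonnegative, sum to 1)."""
        service_rates = np.asarray(service_rates, dtype=float)
        n = len(service_rates)
        x0 = np.full(n, 1.0 / n)
        constraints = [{"type": "eq", "fun": lambda p: np.sum(p) - 1.0}]
        res = minimize(mean_response_time, x0, args=(arrival_rate, service_rates),
                       method="SLSQP", bounds=[(0.0, 1.0)] * n, constraints=constraints)
        return res.x

    print(balance(arrival_rate=3.0, service_rates=[4.0, 2.0]))  # faster server gets the larger share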

20.
A dual-matching dynamic scheduling algorithm
支青  蒋昌俊 《信息与控制》2005,34(5):532-538
A dual-matching dynamic scheduling algorithm (the BM algorithm) is proposed for scheduling independent tasks in heterogeneous environments. The BM algorithm matches tasks and processors in both directions, so that most tasks are executed on the processor that gives both the shortest execution time and the earliest completion time. For tasks for which such a dual match cannot be achieved, the task with the minimum earliest completion time is scheduled first. The BM algorithm satisfies both load-balancing and high-throughput objectives. A comparison with the Min-min algorithm, which is commonly used as a benchmark, shows that the BM algorithm runs in far less time than Min-min and reduces the makespan by about 9%.
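
For reference, a sketch of the Min-min baseline mentioned above: repeatedly pick the unscheduled task whose best (earliest) completion time is smallest and assign it to that machine; the BM algorithm's dual execution-time/completion-time matching step is not reproduced here.

    def min_min(etc, n_machines):
        """etc[t][m]: estimated execution time of task t on machine m.
        Classic Min-min: schedule the task with the smallest earliest completion time first."""
        ready = [0.0] * n_machines             # when each machine becomes free
        unscheduled = set(range(len(etc)))
        order = []
        while unscheduled:
            best = None                        # (completion_time, task, machine)
            for t in unscheduled:
                for m in range(n_machines):
                    ct = ready[m] + etc[t][m]
                    if best is None or ct < best[0]:
                        best = (ct, t, m)
            ct, t, m = best
            ready[m] = ct
            unscheduled.remove(t)
            order.append((t, m, ct))
        return order, max(ready)               # schedule and makespan

    etc = [[3, 5], [2, 4], [6, 1]]             # 3 tasks, 2 machines
    print(min_min(etc, 2))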

