Similar literature
1.
汤小春, 赵全, 符莹, 朱紫钰, 丁朝, 胡小雪, 李战怀. Journal of Software (《软件学报》), 2022, 33(12): 4704-4726
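The Dataflow model unifies batch processing and stream processing in big-data computing. However, existing cluster resource scheduling frameworks for big-data computing target either stream processing or batch processing, and are ill-suited to batch and stream jobs sharing cluster resources. In addition, when GPUs are used for big-data analytics, the lack of an effective way to decouple CPU and GPU resources lowers resource utilization. Based on an analysis of existing cluster resource scheduling frameworks, this work designs and implements HRM, a hybrid resource scheduling framework that is aware of batch and stream applications. Built on a shared-state architecture, HRM combines optimistic and pessimistic locking protocols to meet the differing resource requirements of stream and batch jobs. On compute nodes, it provides flexible binding of CPU and GPU resources and uses a queue-stacking technique, which not only meets the real-time requirements of stream jobs but also reduces feedback latency and enables GPU sharing. In simulations of large-scale job scheduling, HRM's scheduling latency is only about 75% of that of a centralized scheduling framework. In tests with real workloads where batch and stream jobs share a cluster, HRM raises CPU utilization by more than 25%; with fine-grained job scheduling, GPU utilization improves by more than a factor of two and job completion time falls by about 50%.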

2.
Adaptive Executions of Multi-Physics Coupled Applications on Batch Grids   (Total citations: 1; self-citations: 0; citations by others: 1)
Long-running multi-physics coupled parallel applications have gained prominence in recent years. The high computational requirements and long durations of simulations of these applications necessitate the use of multiple systems of a Grid for execution. In this paper, we have built an adaptive middleware framework for execution of long-running multi-physics coupled applications across multiple batch systems of a Grid. Our framework, apart from coordinating the executions of the component jobs of an application on different batch systems, also automatically resubmits the jobs multiple times to the batch queues to continue and sustain long-running executions. As the set of active batch systems available for execution changes, our framework performs migration and rescheduling of components using a robust rescheduling decision algorithm. We have used our framework for improving the application throughput of a prominent long-running multi-component application for climate modeling, the Community Climate System Model (CCSM). Our real multi-site experiments with CCSM indicate that Grid executions can lead to improved application throughput for climate models.

3.
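To counter the adverse effect of network delay on active queue management (AQM) algorithms in network congestion control systems, an adaptive fuzzy AQM algorithm based on Smith prediction is proposed. The algorithm combines Smith predictive control with adaptive fuzzy control: a Smith predictor compensates for network delay, while fuzzy control mitigates, to some extent, the traditional Smith predictor's excessive sensitivity to the accuracy of the model structure and parameters and its poor robustness. Simulation results show that the method makes the queue length converge quickly to the set point while keeping queue oscillation small; in particular, under changing network conditions it outperforms traditional PI control, fuzzy control, and traditional sliding-mode control.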

4.
Allocating submeshes to jobs in mesh-connected multicomputers in an FCFS fashion leads to poor system performance because a large job at the head of the waiting queue can prevent the allocation of free submeshes to other smaller waiting jobs. However, serving jobs aggressively out-of-order can lead to excessive waiting delays for large jobs located at the head of the waiting queue. In this paper, we show that the ability of the job scheduling algorithm to bypass the head of the waiting queue should increase with the load, and we propose a scheduling scheme that can bypass the waiting queue head in a load-dependent adaptive fashion. Also, giving priority to large jobs because they are more difficult to accommodate is investigated. The performance of the proposed scheme has been compared to that of FCFS, aggressive out-of-order scheduling, and other previous job scheduling schemes. Extensive simulation results based on synthetic workloads and real workload traces indicate that our scheduling strategy is a good strategy when both average and maximum job waiting delays are considered. In particular, it is substantially superior to FCFS in terms of mean turnaround times, and to aggressive out-of-order scheduling in terms of maximum waiting delays.

5.
This paper considers a scheduling problem for parallel burn-in ovens in the semiconductor manufacturing industry. An oven is a batch processing machine with restricted capacity. The batch processing time is set by the longest processing time among those of all the jobs contained in the batch. All jobs are assumed to have the same due date. The objective is to minimize the sum of the absolute deviations of completion times from the due date (earliness–tardiness) of all jobs. We suggest three decomposition heuristics. The first heuristic applies the exact algorithm due to Emmons and Hall (for the nonbatching problem) in order to assign the jobs to separate early and tardy job sets for each of the parallel burn-in ovens. Then, we use job sequencing rules and dynamic programming in order to form batches for the early and tardy job sets and sequence them optimally. The second proposed heuristic is based on genetic algorithms. We use a genetic algorithm in order to assign jobs to each single burn-in oven. Then, after forming early and tardy job sets for each oven, we again apply sequencing rules and dynamic programming techniques to the early and tardy job sets on each single machine in order to form batches. The third heuristic assigns jobs to the m early job sets and m tardy job sets in the case of m burn-in ovens in parallel via a genetic algorithm and again applies dynamic programming and sequencing rules. We report on computational experiments based on generated test data and compare the results of the heuristics with known exact solutions for small test instances obtained from a branch and bound scheme.
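
The objective described in this abstract can be illustrated with a minimal sketch. It assumes a single oven that processes a given batch sequence back to back from time `start`; the function name and parameters are hypothetical and are not taken from the paper.

```python
from typing import List


def earliness_tardiness(batches: List[List[float]],
                        due_date: float,
                        start: float = 0.0) -> float:
    """Sum of |C_j - d| over all jobs on one oven (illustrative only).

    `batches` lists the processing times of the jobs in each batch,
    in the order the batches are processed.
    """
    t = start
    total = 0.0
    for batch in batches:
        # Batch processing time is the longest job in the batch.
        t += max(batch)
        # Every job in the batch completes when the batch completes.
        total += len(batch) * abs(t - due_date)
    return total


# Three batches, common due date 7: completions at 3, 7, and 12.
cost = earliness_tardiness([[2, 3], [4], [1, 5]], due_date=7)
```

Here the first batch is early by 4 for each of its two jobs, the second finishes exactly on the due date, and the third is tardy by 5 for each of its two jobs.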

6.
We study online adaptive scheduling for multiple sets of parallel jobs, where each set may contain one or more jobs with time-varying parallelism. This two-level scheduling scenario arises naturally when multiple parallel applications are submitted by different users or user groups in large parallel systems, where both user-level fairness and system-wide efficiency are important concerns. To achieve fairness, we use the well-known equi-partitioning algorithm to distribute the available processors among the active job sets at any time. For efficiency, we apply a feedback-driven adaptive scheduler that periodically adjusts the processor allocations within each set by consciously exploiting the jobs’ execution history. We show that our algorithm achieves asymptotically competitive performance with respect to the set response time, which incorporates two widely used performance metrics, namely, total response time and makespan, as special cases. Both theoretical analysis and simulation results demonstrate that our algorithm improves upon an existing scheduler that provides only fairness but lacks efficiency. Furthermore, we provide a generalized framework for analyzing a family of scheduling algorithms based on feedback-driven policies with provable efficiency. Finally, we consider an extended multi-level hierarchical scheduling model and present a fair and efficient solution that effectively reduces the problem to the two-level model.
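
The fairness step can be sketched as an even split of processors among the active job sets. This is a simplified illustration under stated assumptions: processors are indivisible, leftover processors go to the first sets in list order, and the function name is hypothetical. The feedback-driven allocation within each set is not modeled.

```python
from typing import Dict, List


def equi_partition(total_processors: int, job_sets: List[str]) -> Dict[str, int]:
    """Split processors as evenly as possible among active job sets."""
    if not job_sets:
        return {}
    base, extra = divmod(total_processors, len(job_sets))
    # The first `extra` sets receive one additional processor each.
    return {name: base + (1 if i < extra else 0)
            for i, name in enumerate(job_sets)}


shares = equi_partition(10, ["a", "b", "c"])
```

With 10 processors and three active sets, no two shares differ by more than one processor and all processors are allocated.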

7.
In this work, we evaluate the benefits of using Grids with multiple batch systems to improve the performance of multi-component and parameter sweep parallel applications by reduction in queue waiting times. Using different job traces of different loads, job distributions and queue waiting times corresponding to three different queuing policies (FCFS, conservative and EASY backfilling), we conducted a large number of experiments using simulators of two important classes of applications. The first simulator models Community Climate System Model (CCSM), a prominent multi-component application and the second simulator models parameter sweep applications. We compare the performance of the applications when executed on multiple batch systems and on a single batch system for different system and application configurations. We show that there are a large number of configurations for which application execution using multiple batch systems can give improved performance over execution on a single system.

8.
The utilization of parallel computers depends on how jobs are packed together: if the jobs are not packed tightly, resources are lost due to fragmentation. The problem is that the goal of high utilization may conflict with goals of fairness or even progress for all jobs. The common solution is to use backfilling, which combines a reservation for the first job in the interest of progress with packing of later jobs to fill in holes and increase utilization. However, backfilling considers the queued jobs one at a time, and thus might miss better packing opportunities. We propose the use of dynamic programming to find the best packing possible given the current composition of the queue, thus maximizing the utilization on every scheduling step. Simulations of this algorithm, called lookahead optimizing scheduler (LOS), using trace files from several IBM SP parallel systems, show that LOS indeed improves utilization, and thereby reduces the mean response time and mean slowdown of all jobs. Moreover, it is actually possible to limit the lookahead depth to about 50 jobs and still achieve essentially the same results. Finally, we experimented with selecting among alternative sets of jobs that achieve the same utilization. Surprising results indicate that choosing the set at the head of the queue does not necessarily guarantee best performance. Instead, repeatedly selecting the set with the maximal overall expected slowdown boosts performance when compared to all other alternatives checked.
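
The idea of packing by dynamic programming can be sketched as a subset-sum search over the queued job sizes. This is a simplified illustration, not the LOS algorithm itself: it ignores reservations, job durations, and the lookahead limit, and the function name is hypothetical.

```python
from typing import List, Optional


def best_packing(free: int, sizes: List[int]) -> List[int]:
    """Indices of queued jobs whose sizes sum to the largest value <= free."""
    # reachable[u] holds one set of job indices using exactly u processors.
    reachable: List[Optional[List[int]]] = [None] * (free + 1)
    reachable[0] = []
    for i, size in enumerate(sizes):
        # Iterate downward so each job is selected at most once.
        for used in range(free, size - 1, -1):
            if reachable[used] is None and reachable[used - size] is not None:
                reachable[used] = reachable[used - size] + [i]
    best = max(u for u in range(free + 1) if reachable[u] is not None)
    return reachable[best]


# Taking jobs one at a time in queue order would start the 7-processor
# job and leave 3 processors idle; the DP finds 4 + 6 = 10 instead.
chosen = best_packing(10, [7, 4, 6])
```

The table has `free + 1` entries, so the cost grows with the number of free processors times the number of queued jobs considered.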

9.
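FCFS with backfilling is the most widely used scheduling policy on supercomputers. To address its shortcomings in response time and system utilization, the DGA method is proposed to improve its performance. The method exploits the moldability of parallel jobs: at scheduling time it predicts the mean job response time to choose a suitable request size for each job, and it uses a genetic algorithm to search for the optimal job resource requests. Simulations of real job streams on a simulator show that the method significantly improves the scheduling results of FCFS with backfilling and also outperforms existing moldable-job scheduling policies.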

10.
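An intelligent PID active queue management (AQM) algorithm based on grey prediction, GI-PID, is proposed. The algorithm uses the GM(1,1) model to predict the router queue length online, compensating for lag to address the untimely feedback of network conditions. At the same time, based on the trend of the queue error, it applies expert knowledge to change the PID controller parameters dynamically, so that the parameters adapt in real time to the network environment and intelligent control is achieved. Simulation experiments show that, compared with the traditional PID algorithm, GI-PID greatly suppresses queue-length oscillation, the router queue converges to the desired value, and the packet drop probability is low.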
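
The GM(1,1) prediction step can be sketched as follows. This is the standard textbook formulation of the model, fitted by least squares on the accumulated series; it is an assumption that the paper uses this form, and the function name, the minimum sample count, and the sample values are illustrative only.

```python
import math
from typing import List


def gm11_forecast(x0: List[float], steps: int = 1) -> List[float]:
    """Forecast the next `steps` values of a positive series with GM(1,1)."""
    n = len(x0)
    if n < 4:
        raise ValueError("GM(1,1) is usually fitted on at least 4 samples")
    # Accumulated generating operation (running sum) of the raw series.
    x1, running = [], 0.0
    for v in x0:
        running += v
        x1.append(running)
    # Background values: means of consecutive accumulated values.
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, n)]
    y = x0[1:]
    m = n - 1
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(zi * yi for zi, yi in zip(z, y))
    # Least-squares fit of y = -a * z + b.
    slope = (m * szy - sz * sy) / (m * szz - sz * sz)
    a, b = -slope, (sy - slope * sz) / m
    if abs(a) < 1e-12:
        return [b] * steps  # degenerate case: the series is flat

    def x1_hat(k: int) -> float:
        # Time response of the accumulated series at 0-based index k.
        return (x0[0] - b / a) * math.exp(-a * k) + b / a

    # Difference the accumulated forecast to return to the raw series.
    return [x1_hat(k) - x1_hat(k - 1) for k in range(n, n + steps)]


# Hypothetical queue-length samples growing about 10% per interval.
next_len = gm11_forecast([100, 110, 121, 133.1, 146.41])[0]
```

On this sample series the one-step forecast is close to the next geometric term (about 161), which is the kind of look-ahead a controller could use to compensate for feedback lag.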

11.
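To jointly control the queue length of congested links and improve the AQM system's adaptability to dynamic network environments, an adaptive active queue management scheme for TCP networks is proposed, based on grey prediction and PID control with variable margins. First, a PID adaptive active queue (TCP/AQM) control-theoretic model is established in which the phase and gain margins are related to network parameters; the model changes the control parameters dynamically as network parameters vary, improving the AQM network's dynamic adaptability and the system's robustness. Second, grey prediction is introduced into the model to predict the router queue length ahead of time, compensating for the delay that an AQM algorithm with a PID feedback module imposes on the queue. Compared with simulation results of other algorithms, the proposed algorithm stabilizes the traffic near the desired queue-length threshold in a shorter time.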

12.
This paper addresses the problem of minimizing the scheduling length (makespan) of a batch of jobs with different arrival times. A job is described by a directed acyclic graph (DAG) of parallel tasks. The paper proposes a dynamic scheduling method that adapts the schedule when new jobs are submitted and that may change the processors assigned to a job during its execution. The scheduling method is divided into a scheduling strategy and a scheduling algorithm. We also propose an adaptation of the Heterogeneous Earliest-Finish-Time (HEFT) algorithm, called here P-HEFT, to handle parallel tasks in heterogeneous clusters with good efficiency without compromising the makespan. The results of a comparison of this algorithm with another DAG scheduler using a simulation of several machine configurations and job types show that P-HEFT gives a shorter makespan for a single DAG but scores worse for multiple DAGs. Finally, the results of the dynamic scheduling of a batch of jobs using the proposed scheduling method showed significant improvements for more heavily loaded machines when compared to the alternative resource reservation approach.

13.
A bulk arrival M^X/M/c queuing system is used to model a centralized parallel processing system with job splitting. In such a system, jobs wait in a central queue, which is accessible by all the processors, and are split into independent tasks that can be executed on separate processors. The job response-time consists of three components: queuing delay, service time, and synchronization delay. An expression for the mean job response-time is obtained for this centralized parallel-processing system. Centralized and distributed parallel-processing systems (with and without job-splitting) are considered and their performances compared. Furthermore, the effects of parallelism and overheads due to job-splitting are investigated.

14.
This research analyzes the problem of scheduling a set of n jobs with arbitrary job sizes and non-zero ready times on a set of m unrelated parallel batch processing machines so as to minimize the makespan. The unrelated parallel machine model is a generalization of identical parallel processing machines and is closer to real-world production systems. Each machine can accommodate and process several jobs simultaneously as a batch as long as the machine capacity is not exceeded. The batch processing time and the batch ready time are respectively equal to the largest processing time and the largest ready time among all the jobs in the batch. Motivated by the computational complexity and the practical relevance of the problem, we present several heuristics based on first-fit and best-fit earliest job ready time rules. We also present a mixed integer programming model for the problem and a lower bound to evaluate the quality of the heuristics. The small computational effort of deterministic heuristics, which is valuable in some practical applications, is also one of the reasons that motivate this study. The results show that the heuristic proposed in this paper has a superior performance compared to the heuristics based on ideas proposed in the literature.
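
The batch definitions in this abstract can be illustrated with a first-fit sketch that orders jobs by earliest ready time. It is a simplification under stated assumptions: a single machine rather than m unrelated machines, batches processed in the order they were opened, and hypothetical names (`Job`, `first_fit_batches`, `makespan`). It is not the heuristic proposed in the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Job:
    size: int     # capacity units the job occupies
    proc: float   # processing time
    ready: float  # ready (release) time


def first_fit_batches(jobs: List[Job], capacity: int) -> List[List[Job]]:
    """Place each job, earliest ready time first, in the first batch with room."""
    batches: List[List[Job]] = []
    for job in sorted(jobs, key=lambda j: j.ready):
        if job.size > capacity:
            raise ValueError("job does not fit on the machine")
        for batch in batches:
            if sum(j.size for j in batch) + job.size <= capacity:
                batch.append(job)
                break
        else:
            batches.append([job])  # no existing batch had room
    return batches


def makespan(batches: List[List[Job]]) -> float:
    """Completion time of the last batch on a single machine."""
    t = 0.0
    for batch in batches:
        # A batch starts once the machine is free and all its jobs are ready.
        start = max(t, max(j.ready for j in batch))
        # It runs for the largest processing time among its jobs.
        t = start + max(j.proc for j in batch)
    return t


jobs = [Job(4, 3, 0), Job(5, 2, 1), Job(6, 4, 2), Job(3, 1, 5)]
batches = first_fit_batches(jobs, capacity=10)
span = makespan(batches)
```

With capacity 10 the four jobs form two batches; the second batch cannot start before time 5 because one of its jobs is not ready earlier.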

15.
We study a supply chain scheduling problem in which n jobs have to be scheduled on a single machine and delivered to m customers in batches. Each job has a due date, a processing time and a lateness penalty (weight). To save batch-delivery costs, several jobs for the same customer can be delivered together in a batch, including late jobs. The completion time of each job in the same batch coincides with the batch completion time. A batch setup time has to be added before processing the first job in each batch. The objective is to find a schedule which minimizes the sum of the weighted number of late jobs and the delivery costs. We present a pseudo-polynomial algorithm for a restricted case, where late jobs are delivered separately, and show that it becomes polynomial for the special cases when jobs have equal weights and equal delivery costs or equal processing times and equal setup times. We convert the algorithm into an FPTAS and prove that the solution produced by it is near-optimal for the original general problem by performing a parametric analysis of its performance ratio.

16.
This paper presents several search heuristics and their performance in batch scheduling of parallel, unrelated machines. Identical or similar jobs are typically processed in batches in order to decrease setup times and/or processing times. The problem accounts for allotting batched work parts to unrelated parallel machines, where each batch consists of a fixed number of jobs. Some batches may contain different jobs but all jobs within each batch should have an identical processing time and a common due date. Processing time of each job of a batch is determined according to the machine group as well as the batch group to which the job belongs. Major or minor setup times are required between two subsequent batches depending on batch sequence but are independent of machines. The objective of our study is to minimize the total weighted tardiness for the unrelated parallel machine scheduling. Four search heuristics are proposed to address the problem, namely (1) the earliest weighted due date, (2) the shortest weighted processing time, (3) the two-level batch scheduling heuristic, and (4) the simulated annealing method. These proposed local search heuristics are tested through computational experiments with data from dicing operations of a compound semiconductor manufacturing facility.

17.
This research examines the production control problems in two-station serial production systems under process queue time (PQT) constraints. In these serial production systems, all jobs must be processed in a fixed order in the upstream and downstream stations. There are multiple machines in both stations, and all machines are subject to random machine failures. In the downstream queue, the sum of waiting and processing time for each job is limited by an upper bound. This upper bound is called the PQT constraint. Violation of the PQT constraint causes high rework or scrap costs.

18.
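A rate-based enhanced adaptive virtual queue management algorithm (EAVQ) is proposed. The algorithm introduces the concepts of primary and secondary congestion measures and of a desired link utilization ratio. The input rate serves as the primary congestion measure, preserving the advantages of the original adaptive virtual queue (AVQ) algorithm: fast response, short queueing delay, and high link utilization. The desired link utilization ratio serves as the secondary congestion criterion, and a rate-based adaptive mechanism for it is designed, which addresses AVQ's shortcomings: difficult parameter setting, weak disturbance rejection of the queue, and link capacity loss. This improves the system's dynamic performance while ensuring full use of link capacity. Based on linearization, local stability conditions are given for the TCP/EAVQ system under a general network topology. Simulations verify the effectiveness of EAVQ.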

19.
We propose an approximate approach for estimating the performance measures of the re-entrant line with single-job machines and batch machines based on the mean value analysis (MVA) technique. Multi-class jobs are assumed to be processed in predetermined routings, in which some processes may utilize the same machines in the re-entrant fashion. The performance measures of interest are the steady-state averages of the cycle time of each job class, the queue length of each buffer, and the throughput of the system. The system may not be modeled by a product form queueing network due to the inclusion of the batch machines and the multi-class jobs with different processing times. Thus, we present a methodology for approximately analyzing such a re-entrant line using the iterative procedures based upon the MVA and some heuristic adjustments. Numerical experiments show that the relative errors of the proposed method are within 5% as compared against the simulation results. Scope and purpose: We consider a re-entrant shop with multi-class jobs, in which jobs may visit some machines more than once at different stages of processing, as observed in the wafer fabrication process of semiconductor manufacturing. The re-entrant line also consists of both the single-job machine and the batch machine. The former refers to the ordinary machine processing one job at a time, and the latter means the machine processing several jobs together as a batch at a time. In this paper, we propose an approximation method based on the mean value analysis for estimating the mean cycle time of each class of jobs, the mean queue length of each buffer, and the throughput of the system.

20.
This application is motivated by a complex real-world scheduling problem found in the bottleneck workstation of the production line of an automotive safety glass manufacturing facility. The scheduling problem consists of scheduling jobs (glass parts) on a number of parallel batch processing machines (furnaces), assigning each job to a batch, and sequencing the batches on each machine. The two main objectives are to maximize the utilization of the parallel machines and to minimize the delay in the completion date of each job in relation to a required due date (specific for each job). Aside from the main objectives, the output batches should also produce a balanced workload on the parallel machines, balanced job due dates within each batch, and minimal capacity loss in the batches. The scheduling problem also considers a batch capacity constraint, sequence-dependent processing times, incompatible product families, additional resources, and machine capability. We propose a two-phase heuristic approach that combines exact methods with search heuristics. The first phase comprises a four-stage mixed-integer linear program for building the batches; the second phase is based on a Greedy Randomized Adaptive Search Procedure for sequencing the batches assigned to each machine. We conducted experiments on instances with up to 100 jobs built with real data from the manufacturing facility. The results are encouraging both in terms of computing time (5 min on average) and quality of the solutions (less than 10 % relative gap from the optimal solution in the first phase and less than 5 % in the second phase). Additional experiments were conducted on randomly generated instances of small, medium, and large size.
