Similar Documents
20 similar documents found.
1.
Scheduling Independent Multiprocessor Tasks   (cited by 1: 0 self-citations, 1 by others)
Amoura, Bampis, Kenyon, Manoussakis. Algorithmica, 2002, 32(2): 247-261
We study the problem of scheduling a set of n independent multiprocessor tasks with prespecified processor allocations on a fixed number of processors. We propose a linear time algorithm that finds a schedule of minimum makespan in the preemptive model, and a linear time approximation algorithm that finds a schedule of makespan within a factor of (1+ε) of optimal in the non-preemptive model. We extend our results by obtaining a polynomial time approximation scheme for the parallel processors variant of the multiprocessor task model.

2.
We study the problem of simultaneously minimizing the makespan and the average weighted completion time for the precedence multiprocessor constrained scheduling problem with unit execution times and unit communication delays, known as the UET–UCT problem (Munier and König, Operations Research, 45(1), 145–148 (1997)). We propose a simple (16/9, 16/9)-approximation algorithm for the problem with an unrestricted number of machines. We improve our algorithm by adapting a technique first introduced by Aslam et al. (Proceedings of ACM-SODA, pp. 846–847, 1999) and provide a (1.745, 1.745)-approximate solution. For the considered scheduling problem, we prove the existence of a (1.445, 1.445)-approximate solution, improving the generic existence result of Aslam et al. (Proceedings of ACM-SODA, pp. 846–847, 1999). Also notice that our results for the case of an unrestricted number of processors hold for the more general scheduling problem with small communication delays (SCT problem), and for two other classical optimality criteria: maximum lateness and weighted lateness. Finally, we propose an approximation algorithm for the UET–UCT problem with a restricted number of processors. Research partially supported by the thematic network APPOL II (IST 2001-32007) of the European Union, the ACI-GRID2 project of the French government, and the MULT-APPROX project of the France-Berkeley Fund.

3.
Jansen, Porkolab. Algorithmica, 2008, 32(3): 507-520
Abstract. A malleable parallel task is one whose execution time is a function of the number of (identical) processors allotted to it. We study the problem of scheduling a set of n independent malleable tasks on a fixed number of parallel processors, and propose an approximation scheme that, for any fixed ε > 0, computes in O(n) time a non-preemptive schedule of length at most (1+ε) times the optimum.

4.
This paper addresses scheduling problems for tasks with release and execution times. We present a number of efficient and easy-to-implement algorithms for constructing schedules of minimum makespan when the number of distinct task execution times is fixed. For a set of independent tasks, our algorithm in the single processor case runs in time linear in the number of tasks; with precedence constraints, our algorithm runs in time linear in the sum of the number of tasks and the size of the precedence constraints. In the multi-processor case, our algorithm constructs minimum makespan schedules for independent tasks with uniform execution times. The algorithm runs in O(n log m) time, where n is the number of tasks and m is the number of processors. Received September 25, 1997; revised June 11, 1998.
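A minimal illustration of the single-processor case: for independent jobs with release times, scheduling in earliest-release-date order without unnecessary idling already minimizes the makespan. The Python sketch below shows that rule; the function name is hypothetical, and this O(n log n) version is not the paper's linear-time algorithm, which additionally exploits the fixed number of distinct execution times.

```python
def earliest_release_schedule(jobs):
    """Schedule independent jobs (release_time, exec_time) on one processor.

    Earliest-release-date order with no unnecessary idling minimizes the
    makespan.  O(n log n) because of the sort; illustrative names only.
    """
    t = 0
    schedule = []                                # (start_time, job) pairs
    for release, exec_time in sorted(jobs):      # earliest release first
        start = max(t, release)                  # never idle while a job is ready
        schedule.append((start, (release, exec_time)))
        t = start + exec_time
    return schedule, t                           # t is the makespan


jobs = [(0, 3), (2, 2), (10, 1), (4, 5)]
print(earliest_release_schedule(jobs)[1])        # makespan: 11
```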

5.
Ideal preemptive schedules on two processors   (cited by 2: 0 self-citations, 2 by others)
An ideal schedule minimizes both makespan and total flow time. It is known that the Coffman-Graham algorithm [Acta Informatica 1, 200-213, 1972] solves in polynomial time the problem of finding an ideal nonpreemptive schedule of unit-execution-time jobs with equal release dates and arbitrary precedence constraints on two identical parallel processors. This paper presents an extension of the algorithm that solves in polynomial time the preemptive counterpart of this problem. The complexity status of the preemptive problem of minimizing just the total flow time remains open. Received: 2 May 2003. J. Sethuraman's research was supported by an NSF CAREER Award DMI-0093981 and an IBM Faculty Partnership Award.
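For reference, here is a small Python sketch of the nonpreemptive Coffman-Graham procedure that the abstract refers to: lexicographic labeling followed by highest-label-first list scheduling of unit-time jobs on two processors. The dictionary encoding of the precedence graph and the function names are illustrative assumptions, and the paper's preemptive extension is not reproduced here.

```python
def coffman_graham_labels(succ):
    """Coffman-Graham labeling.  succ maps each task to its immediate successors."""
    tasks = list(succ)
    labels = {}
    for next_label in range(1, len(tasks) + 1):
        # a task becomes eligible once all of its successors are labeled
        eligible = [x for x in tasks if x not in labels
                    and all(s in labels for s in succ[x])]
        # choose the task whose successors' labels, in decreasing order,
        # form the lexicographically smallest sequence
        chosen = min(eligible,
                     key=lambda x: sorted((labels[s] for s in succ[x]), reverse=True))
        labels[chosen] = next_label
    return labels

def two_processor_list_schedule(succ, labels):
    """Highest-label-first list schedule of unit-time tasks on two processors."""
    pred = {x: set() for x in succ}
    for x, ss in succ.items():
        for s in ss:
            pred[s].add(x)
    done, t, schedule = set(), 0, []
    while len(done) < len(succ):
        ready = sorted((x for x in succ if x not in done and pred[x] <= done),
                       key=lambda x: -labels[x])
        step = ready[:2]                     # at most two unit tasks per time slot
        schedule.append((t, step))
        done |= set(step)
        t += 1
    return schedule

# toy dag: a -> c, b -> c, c -> d  (edges point to immediate successors)
succ = {"a": {"c"}, "b": {"c"}, "c": {"d"}, "d": set()}
labels = coffman_graham_labels(succ)
print(two_processor_list_schedule(succ, labels))
# e.g. [(0, ['a', 'b']), (1, ['c']), (2, ['d'])]
```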

6.
K. Kalpakis, Y. Yesha. Algorithmica, 1999, 23(2): 159-179
We find, in polynomial time, a schedule for a complete binary tree directed acyclic graph (dag) with n unit execution time tasks on a linear array whose makespan is optimal within a factor of 1+o(1). Further, given a binary tree dag T with n tasks and height h, we find, in polynomial time, a schedule for T on a linear array whose makespan is optimal within a factor of 5 + o(1). On the other hand, we prove that explicit lower and upper bounds on the makespan of optimal schedules of binary tree dags on linear arrays differ at least by a factor of 1+ . We also find, in polynomial time, schedules for bounded tree dags with n unit execution time tasks, degree d, and height on a linear array which are optimal within a factor of 1+o(1), this time under the assumption of links with unlimited bandwidth. Finally, we compute an improved upper bound on the makespan of an optimal schedule for a tree dag on the architecture independent model of Papadimitriou and Yannakakis [14], provided that its height is not too large. Received January 21, 1997; revised June 5, 1997.

7.
We consider the preemptive job shop scheduling problem with two machines, with the objective to minimize the makespan. We present an algorithm that finds a schedule of length at most P_max/2 greater than the optimal schedule length, where P_max is the length of the longest job. Received June 13, 2000.

8.
Applications implemented on critical systems are subject to both safety-critical and real-time constraints. Classically, applications are specified as precedence task graphs that must be scheduled onto a given target multiprocessor heterogeneous architecture. We propose a new method for simultaneously optimizing two objectives: the execution time and the reliability of the schedule. The problem is decomposed into two successive steps: a spatial allocation during which the reliability is maximized (randomized algorithm), and a scheduling step during which the makespan is minimized (list scheduling algorithm). This allows us to produce several trade-off solutions, among which the user can choose the solution that best fits the application's requirements. Reliability is increased by replicating adequate tasks onto well-chosen processors. Our fault model assumes that processors are fail-silent, that they are subject to transient failures, and that the occurrences of failures follow a constant-parameter Poisson law. We assess and validate our method by running extensive simulations on both random graphs and actual application graphs. They show that it is competitive, in terms of makespan, with existing reference scheduling methods for heterogeneous processors (HEFT), while providing better reliability.
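As a rough illustration of the fault model: under a constant-rate Poisson failure process, a replica of duration d on a processor with failure rate λ succeeds with probability exp(−λd), and a replicated task fails only if all of its replicas fail. The sketch below computes a schedule's reliability under the simplifying assumption that task outcomes are independent; the paper's exact reliability computation may differ, and all names are illustrative.

```python
import math

def replica_reliability(duration, failure_rates):
    """Probability that at least one replica of a task completes correctly.

    Fail-silent processors, transient failures arriving as a Poisson process
    of constant rate lam: a replica of length `duration` on such a processor
    succeeds with probability exp(-lam * duration)."""
    p_all_fail = 1.0
    for lam in failure_rates:            # one rate per processor hosting a replica
        p_all_fail *= 1.0 - math.exp(-lam * duration)
    return 1.0 - p_all_fail

def schedule_reliability(tasks):
    """tasks: list of (duration, [failure rates of the replicas' processors]).
    Treats task outcomes as independent, which is a simplification."""
    r = 1.0
    for duration, rates in tasks:
        r *= replica_reliability(duration, rates)
    return r

# one task replicated on two processors, one task on a single processor
print(schedule_reliability([(10.0, [1e-4, 5e-4]), (4.0, [1e-4])]))
```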

9.
Motivated by applications in grid computing and project management, we study multiprocessor scheduling in scenarios where there is uncertainty in the successful execution of jobs when assigned to processors. We consider the problem of multiprocessor scheduling under uncertainty, in which we are given n unit-time jobs and m machines, a directed acyclic graph C giving the dependencies among the jobs, and, for every job j and machine i, the probability p_ij of the successful completion of job j when scheduled on machine i in any given step. The goal is to find a schedule that minimizes the expected makespan, that is, the expected time at which all of the jobs are completed.
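One way to make the objective concrete: given an assignment policy and the success probabilities p_ij, the expected makespan can be estimated by Monte Carlo simulation. The sketch below uses a naive round-robin policy purely for illustration; it is not the paper's algorithm, and the function and parameter names are hypothetical.

```python
import random

def estimate_expected_makespan(n_jobs, deps, p, trials=2000, seed=0):
    """Monte Carlo estimate of the expected makespan of a naive greedy policy.

    deps: dict job -> set of jobs that must finish before it can run.
    p[i][j]: probability that machine i completes job j in one step.
    Policy (purely illustrative): every step, machine i works on the
    (i mod #ready)-th ready job."""
    rng = random.Random(seed)
    m, total = len(p), 0
    for _ in range(trials):
        done, t = set(), 0
        while len(done) < n_jobs:
            ready = [j for j in range(n_jobs)
                     if j not in done and deps.get(j, set()) <= done]
            finished = set()
            for i in range(m):
                j = ready[i % len(ready)]
                if rng.random() < p[i][j]:       # success with probability p[i][j]
                    finished.add(j)
            done |= finished
            t += 1
        total += t
    return total / trials

# two machines, three unit jobs, job 2 depends on jobs 0 and 1
p = [[0.9, 0.5, 0.7], [0.4, 0.8, 0.6]]
print(estimate_expected_makespan(3, {2: {0, 1}}, p))
```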

10.
We study a scheduling problem with rejection on a set of two machines in a flow-shop scheduling system. We evaluate the quality of a solution by two criteria: the first is the makespan and the second is the total rejection cost. We show that the problem of minimizing the makespan plus total rejection cost is NP-hard, and for its solution we provide two different algorithms: a pseudo-polynomial time optimization algorithm and a fully polynomial time approximation scheme (FPTAS). We also study the problem of finding the entire set of Pareto-optimal points (this problem is NP-hard due to the NP-hardness of the same problem variation on a single machine [20]). We show that this problem can be solved in pseudo-polynomial time. Moreover, we show how to provide an FPTAS that, given that there exists a Pareto-optimal schedule with a total rejection cost of at most R and a makespan of at most K, finds a solution with a total rejection cost of at most (1+ε)R and a makespan of at most (1+ε)K. This is done by defining a set of auxiliary problems and providing an FPTAS for each of them.
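To make the bicriteria objective concrete, the toy sketch below minimizes makespan plus total rejection cost by brute force over rejected subsets, using Johnson's rule for the two-machine flow-shop makespan of the accepted jobs. This exponential enumeration is only illustrative for tiny instances and is not the paper's pseudo-polynomial algorithm or FPTAS; the names are assumptions.

```python
from itertools import combinations

def johnson_makespan(accepted):
    """Two-machine flow-shop makespan of the accepted jobs via Johnson's rule.
    accepted: list of (p1, p2) processing times on machines 1 and 2."""
    first = sorted((j for j in accepted if j[0] <= j[1]), key=lambda j: j[0])
    last = sorted((j for j in accepted if j[0] > j[1]), key=lambda j: -j[1])
    c1 = c2 = 0
    for p1, p2 in first + last:
        c1 += p1                        # completion on machine 1
        c2 = max(c2, c1) + p2           # machine 2 waits for machine 1
    return c2

def min_makespan_plus_rejection(jobs, reject_cost):
    """Brute force over rejected subsets (tiny instances only): minimize the
    makespan of the accepted jobs plus the total rejection cost."""
    n, best = len(jobs), None
    for r in range(n + 1):
        for rejected in combinations(range(n), r):
            accepted = [jobs[i] for i in range(n) if i not in rejected]
            value = johnson_makespan(accepted) + sum(reject_cost[i] for i in rejected)
            if best is None or value < best[0]:
                best = (value, set(rejected))
    return best

jobs = [(3, 2), (1, 4), (6, 1)]          # (machine-1 time, machine-2 time)
reject_cost = [9, 9, 2]
print(min_makespan_plus_rejection(jobs, reject_cost))   # (9, {2}): reject the long job
```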

11.
The butterfly barrier   (cited by 3: 0 self-citations, 3 by others)
We describe an algorithm for barrier synchronization that requires only reads and writes to shared store. The algorithm is faster than the traditional locked-counter approach for two processors and has an attractive log2 N time scaling for larger N. The algorithm is free of hot spots and critical regions and requires a shared memory bandwidth which grows linearly with N, the number of participating processors. We verify the technique using both a real shared memory multiprocessor, for numbers of processors up to 30, and a shared memory multiprocessor simulator, for numbers of processors up to 256. Work performed under the auspices of the U.S. Department of Energy by the Lawrence Livermore National Laboratory under contract No. W-7405-ENG-48.
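A minimal sketch of the butterfly pattern, assuming N is a power of two: in round r, processor i signals and waits for partner i XOR 2^r, so after log2 N rounds every processor has (transitively) heard from every other. The Python version below uses threading.Event objects as single-use flags; a real shared-memory implementation would spin on ordinary memory words, as in the paper. Names are illustrative.

```python
import threading

def make_butterfly_barrier(n):
    """Single-use butterfly barrier for n = 2**k threads, built only from
    per-(round, thread) flags, each written by one thread and read by its
    partner -- mirroring the paper's reads and writes to shared store."""
    rounds = n.bit_length() - 1                  # log2(n); n must be a power of two
    flags = [[threading.Event() for _ in range(n)] for _ in range(rounds)]

    def barrier(tid):
        for r in range(rounds):
            partner = tid ^ (1 << r)             # butterfly pairing in round r
            flags[r][tid].set()                  # announce arrival to the partner
            flags[r][partner].wait()             # wait for the partner's arrival
    return barrier

def worker(tid, barrier, log):
    log.append(("before", tid))
    barrier(tid)
    log.append(("after", tid))

n, log = 4, []
barrier = make_butterfly_barrier(n)
threads = [threading.Thread(target=worker, args=(i, barrier, log)) for i in range(n)]
for th in threads: th.start()
for th in threads: th.join()
# the barrier guarantees every "before" entry precedes every "after" entry
assert all(log.index(("before", i)) < log.index(("after", j))
           for i in range(n) for j in range(n))
print("barrier reached by all", n, "threads")
```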

12.
An algorithm is proposed for scheduling dependent tasks in time-varying heterogeneous multiprocessor systems, in which computational power and links between processors are allowed to change over time. Link contention is considered in the multiprocessor scheduling problem. A linear switching-state space-modeling paradigm is introduced to enable theoretical analysis from a system engineering perspective. Theoretical analysis of this model shows its robustness against changes in processing power and link failure. The proposed algorithm uses a fuzzy decision-making procedure to handle changes in the multiprocessor system. The efficiency of the proposed algorithm is illustrated by several random experiments and comparison against a recent benchmark approach. The results show up to 18% average improvement in makespan, especially for larger-scale systems.

13.
A network of workstations (NOW) is a cost-effective alternative to massively parallel supercomputers. As commercially available off-the-shelf processors become cheaper and faster, it is now possible to build a cluster that provides high computing power within a limited budget. However, a cluster may consist of different types of processors, and this heterogeneity complicates the design of efficient collective communication protocols. For example, it is a very hard combinatorial problem to find an optimal reduction schedule for such heterogeneous clusters. Nevertheless, we show that a simple technique called slowest-node-first (SNF) is very effective in designing efficient reduction protocols for heterogeneous clusters. First, we show that SNF is actually a 2-approximation algorithm, which means that an SNF schedule length is always within twice the optimal schedule length, no matter what kind of cluster is given. In addition, we show that SNF gives the optimal reduction time when the cluster consists of two types of processors and the ratio of communication speeds between them is at least two. When the communication speed ratio is less than two, we develop a dynamic programming technique to find the optimal schedule. Our dynamic programming utilizes the monotone property of the objective function and can significantly reduce the amount of computation time. Finally, combined with an approximation algorithm for broadcast [2004], we propose an all-reduction algorithm which sends the reduction answer to all processors, with approximation ratio 3.5. We conduct three groups of experiments. First, we show that SNF performs better than the built-in MPI_Reduce in a test cluster. Second, we observe a 93-fold saving in computation time to find the optimal schedule, compared with a naive dynamic programming implementation. Third, we apply the theoretical results to a branch-and-bound search and show that they can reduce the search time for the optimal reduction schedule by a factor of 500 when the cluster has three kinds of processors.
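The slowest-node-first idea can be sketched as a simple greedy: while more than one node holds a partial result, the slowest remaining holder sends its value to the earliest-available other holder. The cost model below (a transfer occupies both endpoints for the maximum of their per-message communication times) and the tie-breaking are assumptions, not the paper's exact formulation.

```python
def snf_reduction_schedule(comm_time):
    """Greedy slowest-node-first sketch of a reduction on a heterogeneous cluster.

    comm_time[i] is node i's per-message communication time.  Assumed cost
    model (may differ from the paper's): a transfer occupies both endpoints
    for the maximum of their communication times.  The slowest remaining
    holder sends first; the earliest-available other holder receives and
    keeps the merged partial result.  Returns (transfers, makespan)."""
    n = len(comm_time)
    avail = [0.0] * n                       # earliest time each node is free
    holders = set(range(n))                 # nodes still holding a partial result
    transfers = []
    while len(holders) > 1:
        sender = max(holders, key=lambda i: comm_time[i])     # slowest node first
        receiver = min((i for i in holders if i != sender),
                       key=lambda i: (avail[i], comm_time[i]))
        start = max(avail[sender], avail[receiver])
        finish = start + max(comm_time[sender], comm_time[receiver])
        transfers.append((start, finish, sender, receiver))
        avail[receiver] = finish
        holders.remove(sender)              # the sender's value has been merged
    makespan = max((f for _, f, _, _ in transfers), default=0.0)
    return transfers, makespan

# four nodes: two slow (time 4) and two fast (time 1); the slow sends overlap
print(snf_reduction_schedule([4.0, 4.0, 1.0, 1.0]))
```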

14.
We consider multimessage multicasting over the n-processor complete (or fully connected) static network (MMC). First we present a linear time algorithm that constructs, for every degree-d problem instance, a communication schedule with total communication time at most d², where d is the maximum number of messages that each processor may send or receive. Then we present degree-d problem instances such that all their communication schedules have total communication time at least d². We observe that our lower bound applies when the fan-out (maximum number of processors receiving any given message) is huge, and thus the number of processors is also huge. Since this environment is not likely to arise in the near future, we turn our attention to the study of important subproblems that are likely to arise in practice. We show that when each message has fan-out k = 1 the MMC problem corresponds to the makespan open shop preemptive scheduling problem, which can be solved in polynomial time, and show that for k ≥ 2 our problem is NP-complete and remains NP-complete even when forwarding is allowed. We present an algorithm to generate a communication schedule with total communication time 2d−1 for any degree-d problem instance with fan-out k = 2. Our main result is an O(q·d·e) time algorithm, where e ≤ nd (the input length), with an approximation bound of qd + k^(1/q)(d−1), for any integer q such that k > q ≥ 2. Our algorithms are centralized and require all the communication information ahead of time. Applications where all of this information is readily available include iterative algorithms for solving linear equations, and most dynamic programming procedures. The Meiko CS-2 machine and computer systems with processors communicating via dynamic permutation networks whose basic switches can act as data replicators (e.g., n-by-n Benes networks with 2-by-2 switches that can also act as data replicators) will also benefit from our results at the expense of doubling the number of communication phases.

15.
Abstract. Minimizing the amount of time and number of processors needed to perform an application reduces the application's fabrication cost and operation costs. A directed acyclic graph (dag) model of algorithms is used to define a time-minimal schedule and a processor-time-minimal schedule. We present a technique for finding a lower bound on the number of processors needed to achieve a given schedule of an algorithm. The application of this technique is illustrated with a tensor product computation. We then apply the technique to the free schedule of algorithms for matrix product, Gaussian elimination, and transitive closure. For each, we provide a time-minimal processor schedule that meets these processor lower bounds, including the one for tensor product.

16.
In this paper, we tackle the well-known problem of scheduling a collection of parallel jobs on a set of processors, either in a cluster or in a multiprocessor computer. For the makespan objective, that is, the completion time of the last job, this problem has been shown to be NP-hard, and several heuristics have already been proposed to minimize the execution time. In this paper, we consider both rigid and moldable jobs. Our main contribution is the introduction of a new approach to the scheduling problem, based on recent discoveries in the field of compressed sensing. In the proposed approach, all possible positions and shapes of the jobs are encoded into a matrix, and the scheduling is performed by selecting the best columns under natural constraints. Thus, the solution to the new scheduling formulation is naturally sparse, and we may use appropriate relaxations to achieve the optimization task in the quickest possible way. Among many possible relaxation strategies, we choose to minimize the p-quasi-norm for p∈(0,1). Minimization of the p-quasi-norm is implemented via a successive linear programming approximation heuristic. We propose several new algorithms based on this approach, and we assess their efficiency through simulations. The experiments show that the scheme outperforms the classic Largest Task First list-based algorithm for scheduling small to medium instances but needs improvements to compete on larger numbers of jobs. Copyright © 2014 John Wiley & Sons, Ltd.
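For context, the Largest Task First baseline mentioned above can be sketched as a list scheduler for rigid jobs: sort jobs by decreasing size and start each one at the earliest time at which enough processors are simultaneously free for its whole duration. The ordering key (area = processors × duration) and the function names below are assumptions; LTF variants differ in such details.

```python
def ltf_rigid_schedule(jobs, m):
    """Largest-Task-First list scheduling of rigid jobs (procs, duration) on m
    processors.  Each job starts at the earliest time at which enough
    processors are free for its whole duration (assumes procs <= m)."""
    order = sorted(jobs, key=lambda j: j[0] * j[1], reverse=True)
    placed = []                                        # (start, end, procs)

    def used(t):                                       # processors busy at time t
        return sum(q for s, e, q in placed if s <= t < e)

    for q, d in order:
        for t in sorted({0.0} | {e for _, e, _ in placed}):
            # usage only changes at job starts, so checking those points suffices
            points = [t] + [s for s, _, _ in placed if t < s < t + d]
            if all(used(p) + q <= m for p in points):
                placed.append((t, t + d, q))
                break
    makespan = max((e for _, e, _ in placed), default=0.0)
    return placed, makespan

# five rigid jobs (processors, duration) on m = 4 processors
jobs = [(2, 3.0), (3, 2.0), (1, 4.0), (2, 2.0), (4, 1.0)]
print(ltf_rigid_schedule(jobs, 4))
```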

17.
Recently there has been an increasing interest in models of parallel computation that account for the bandwidth limitations in communication networks. Some models (e.g., BSP, LogP, and QSM) account for bandwidth limitations using a per-processor parameter g > 1, such that each processor can send/receive at most h messages in g·h time. Other models (e.g., PRAM(m)) account for bandwidth limitations as an aggregate parameter m < p, such that the p processors can send at most m messages in total at each step. This paper provides the first detailed study of the algorithmic implications of modeling parallel bandwidth as a per-processor (local) limitation versus an aggregate (global) limitation. We consider a number of basic problems such as broadcasting, parity, summation, and sorting, and give several new upper and lower time bounds that demonstrate the advantage of globally limited models over locally limited models given the same aggregate bandwidth (i.e., p·1/g = m). In general, globally limited models have a possible advantage whenever there is an imbalance in the number of messages sent/received by the processors. To exploit this advantage, the processors must schedule the sending of messages so as to respect the aggregate bandwidth limit. We present a new parallel scheduling algorithm for globally limited models that enables an unknown, arbitrarily unbalanced set of messages to be sent through the limited bandwidth within a (1 + ε) factor of the optimal off-line schedule with high probability, even if the penalty for overloading the network is an exponential function of the overload. We also present a near-optimal algorithm for the case where long messages must be sent as flits in consecutive time steps, as well as for the case where new messages to be sent arrive dynamically over an infinite time line. These results consider both message passing (distributed memory) and shared memory scenarios, and improve upon the best results for the locally limited model by a factor of Θ(g). Finally, we present results quantifying the power of concurrent reads in a globally limited bandwidth setting, including showing an Ω(p lg m/m lg p) time separation between the exclusive-read and the concurrent-read PRAM(m) models, which, when m << p, greatly improves upon the separation known previously. Received June 1, 1997; revised March 10, 1998.

18.
Phillips, Stein, Torng, Wein. Algorithmica, 2008, 32(2): 163-200
Abstract. We consider two fundamental problems in dynamic scheduling: scheduling to meet deadlines in a preemptive multiprocessor setting, and scheduling to provide good response time in a number of scheduling environments. When viewed from the perspective of traditional worst-case analysis, no good on-line algorithms exist for these problems, and for some variants no good off-line algorithms exist unless P = NP. We study these problems using a relaxed notion of competitive analysis, introduced by Kalyanasundaram and Pruhs, in which the on-line algorithm is allowed more resources than the optimal off-line algorithm to which it is compared. Using this approach, we establish that several well-known on-line algorithms, which have poor performance from an absolute worst-case perspective, are optimal for the problems in question when allowed moderately more resources. For optimization of average flow time, these are the first results of any sort, for any NP-hard version of the problem, that indicate that it might be possible to design good approximation algorithms.

19.
Andrews, Bender, Zhang. Algorithmica, 2008, 32(2): 277-301
Abstract. Processor speed and memory capacity are increasing several times faster than disk speed. This disparity suggests that disk I/O performance could become an important bottleneck. Methods are needed for using disks more efficiently. Past analysis of disk scheduling algorithms has largely been experimental, and little attempt has been made to develop algorithms with provable performance guarantees. We consider the following disk scheduling problem. Given a set of requests on a computer disk and a convex reachability function that determines how fast the disk head travels between tracks, our goal is to schedule the disk head so that it services all the requests in the shortest time possible. We present a 3/2-approximation algorithm (with a constant additive term). For the special case in which the reachability function is linear we present an optimal polynomial-time solution. The disk scheduling problem is related to the special case of the Asymmetric Traveling Salesman Problem with the triangle inequality (ATSP-Δ) in which all distances are either 0 or some constant α. We show how to find the optimal tour in polynomial time and describe how this gives another approximation algorithm for the disk scheduling problem. Finally we consider the on-line version of the problem in which uniformly distributed requests arrive over time. We present an algorithm related to the above ATSP-Δ.

20.
A Pipeline Technique for Dynamic Data Transfer on a Multiprocessor Grid   (cited by 1: 0 self-citations, 1 by others)
This paper describes a pipeline technique which is used to redistribute data on a multiprocessor grid during runtime. The main purposes of the algorithm are to minimize the data transfer time, prevent congestion on the ports of the receiving processors, and minimize the number of idle processors. One of the key ideas of this algorithm is the creation of processor classes, first introduced by Desprez et al. [IEEE Transactions on Parallel and Distributed Systems, 9(2):102, 1998]. Based on the idea of classes, we create the pipeline tasks used to organize the redistribution of data. Our experimental results show that this pipeline technique can significantly reduce the amount of time required to complete a dynamic data transfer task.
