共查询到20条相似文献,搜索用时 0 毫秒
1.
Vince Grolmusz 《Algorithmica》1991,6(1):479-489
We consider concurrent-write PRAMs with a large number of processors of unlimited computational power and an infinite shared memory. Our adversary chooses a small number of our processors and gives them a 0–1 input sequence (each chosen processor gets a bit, and each bit is given to one processor). The chosen processors are required to compute thePARITY of their input, while the others do not take part in the computation. Ifat most q processors are chosen andq 1/2 log logn, then we show that computing PARITY needsq steps in the worst case. On the other hand, there exists an algorithm which computes PARITY inq steps (for anyq <n) in this model, thus our result is sharp. Surprisingly, if our adversary choosesexactly q of our processors, then they can compute PARITY in [q/2] + 2 steps, and in this case we show that it needs at least [q2] steps. Our result implies that large parallel machines which are efficient when only a small number of their processors are active cannot be constructed. On the other hand, a result of Ajtai and Ben-Or [1] shows that if we haven input bits which contain at most polylogn 1's and polynomially many processors which are all allowed to work, then thePARITY can be solved inconstant time. 相似文献
2.
In this paper we describe a technique for finding efficient parallel algorithms for problems on directed graphs that involve checking the existence of certain kinds of paths in the graph. This technique provides efficient algorithms for finding dominators in flow graphs, performing interval and loop analysis on reducible flow graphs, and finding the feedback vertices of a digraph. Each of these algorithms takesO(log2
n) time using the same number of processors needed for fast matrix multiplication. All of these bounds are for an EREW PRAM. 相似文献
3.
COUPL+ is a programming environment for applications using unstructured and hybrid grids for numerical simulations. It automates parallelization by handling the partitioning of data and dependent data and maintaining halo interfaces and copy coherency. We explore some algorithms behind this package. A multi-level partitioning method is described which is effective in the presence of skewed data, solving the multi-set median-finding problem. Partitioning elements over a set of pre-partitioned nodes is explored and a novel method is suggested for reducing communication in the resulting distribution. 相似文献
4.
Efficient parallel algorithms for graph problems 总被引:1,自引:0,他引:1
We present an efficient technique for parallel manipulation of data structures that avoids memory access conflicts. That is, this technique works on the Exclusive Read/Exclusive Write (EREW) model of computation, which is the weakest shared memory, MIMD machine model. It is used in a new parallel radix sort algorithm that is optimal for keys whose values are over a small range. Using the radix sort and known results for parallel prefix on linked lists, we develop parallel algorithms that efficiently solve various computations on trees and unicycular graphs. Finally, we develop parallel algorithms for connected components, spanning trees, minimum spanning trees, and other graph problems. All of the graph algorithms achieve linear speedup for all but the sparsest graphs.Part of this work was done while the first author was at the University of Illinois, Urbana-Champaign, the second author was at Carnegie-Mellon University, and the third author was at the Hebrew University and the Courant Institute of Mathematical Sciences, New York University. A preliminary version of this work was presented at the 1986 International Conference on Parallel Processing. 相似文献
5.
We present parallel algorithms for computing all pair shortest paths in directed graphs. Our algorithm has time complexityO(f(n)/p+I(n)logn) on the PRAM usingp processors, whereI(n) is logn on the EREW PRAM, log logn on the CCRW PRAM,f(n) iso(n
3). On the randomized CRCW PRAM we are able to achieve time complexityO(n
3/p+logn) usingp processors.
A preliminary version of this paper was presented at the 4th Annual ACM Symposium on Parallel Algorithms and Architectures,
June 1992.
Support by NSF Grant CCR 90-20690 and PSC CUNY Awards #661340 and #662478. 相似文献
6.
Robust optimization is a popular method to tackle uncertain optimization problems. However, traditional robust optimization can only find a single solution in one run which is not flexible enough for decision-makers to select a satisfying solution according to their preferences. Besides, traditional robust optimization often takes a large number of Monte Carlo simulations to get a numeric solution, which is quite time-consuming. To address these problems, this paper proposes a parallel double-level multiobjective evolutionary algorithm (PDL-MOEA). In PDL-MOEA, a single-objective uncertain optimization problem is translated into a bi-objective one by conserving the expectation and the variance as two objectives, so that the algorithm can provide decision-makers with a group of solutions with different stabilities. Further, a parallel evolutionary mechanism based on message passing interface (MPI) is proposed to parallel the algorithm. The parallel mechanism adopts a double-level design, i.e., global level and sub-problem level. The global level acts as a master, which maintains the global population information. At the sub-problem level, the optimization problem is decomposed into a set of sub-problems which can be solved in parallel, thus reducing the computation time. Experimental results show that PDL-MOEA generally outperforms several state-of-the-art serial/parallel MOEAs in terms of accuracy, efficiency, and scalability. 相似文献
7.
The Nested Interactive Array Language, Nial has several constructs and primitives which express independent computations that can be executed in parallel. This paper describes these constructs and indicates how they might be implemented on a variety of architectures. A number of well-known parallel algorithms are presented in Nial and their effectiveness discussed. 相似文献
8.
Ivan D. BaevWaleed M. Meleis Alexandre Eichenberger 《Information Processing Letters》2002,83(1):27-32
We consider two general precedence-constrained scheduling problems that have wide applicability in the areas of parallel processing, high performance compiling, and digital system synthesis. These problems are intractable so it is important to be able to compute tight bounds on their solutions. A tight lower bound on makespan scheduling can be obtained by replacing precedence constraints with release and due dates, giving a problem that can be efficiently solved. We demonstrate that recursively applying this approach yields a bound that is provably tighter than other known bounds, and experimentally shown to achieve the optimal value at least 90.3% of the time over a synthetic benchmark.We compute the best known lower bound on weighted completion time scheduling by applying the recent discovery of a new algorithm for solving a related scheduling problem. Experiments show that this bound significantly outperforms the linear programming-based bound. We have therefore demonstrated that combinatorial algorithms can be a valuable alternative to linear programming for computing tight bounds on large scheduling problems. 相似文献
9.
V. Singh V. Kumar G. Agha C. Tomlinson 《International journal of parallel programming》1991,20(2):95-131
We present two new parallel algorithms QSP1 and QSP2 based on sequential quicksort for sorting data on a mesh multicomputer, and analyze their scalability using the isoefficiency metric. We show that QSP2 matches the lower bound on the isoefficiency function for mesh multicomputers, while QSP1 is fairly close to optimal. Langet al.
(1) and Schnorret al.
(2) have developed parallel sorting algorithms for the mesh architecture that have either optimal (Schnorr) or close to optimal (Lang) run-time complexity for the one-element-perprocessor case. Both QSP1 and QSP2 have better scalability than the scaled-down variants of these algorithms (for the case in which there are more elements than processors). We also analyze a different variant of Lang's sort which is as scalable as QSP2. We briefly discuss another metric called resource consumption. According to this metric, both QSP1 and QSP2 are superior to variants of Lang's sort. 相似文献
10.
An optimalO(log logn)-time CRCW-PRAM algorithm for computing all period lengths of a string is presented. Previous parallel algorithms compute the period only if it is shorter than half of the length of the string. The algorithm can be used to find all initial palindromes of a string in the same time and processor bounds. Both algorithms are the fastest possible over a general alphabet. We derive a lower bound for finding initial palindromes by modifying a known lower bound for finding the period length of a string [9]. Whenp processors are available the bounds become (n/p+log1+p/n2p).This work was partially supported by NSF Grant CCR-90-14605. D. Breslauer was partially supported by an IBM Graduate Fellowship while studying at Columbia University and by a European Research Consortium for Informatics and Mathematics postdoctoral fellowship. 相似文献
11.
Modular decomposition of graphs is a powerful tool for designing efficient algorithms for problems on graphs such as Maximum Weight Stable Set (MWS) and Maximum Weight Clique. Using this tool we obtain O(n·m) time algorithms for MWS on chair- and xbull-free graphs which considerably extend an earlier result on bull- and chair-free graphs by De Simone and Sassano (the chair is the graph with vertices a,b,c,d,e and edges ab,bc,cd,be, and the xbull is the graph with vertices a,b,c,d,e,f and edges ab,bc,cd,de,bf,cf). Moreover, our algorithm is robust in the sense that we do not have to check in advance whether the input graphs are indeed chair- and xbull-free. 相似文献
12.
13.
In this paper, we study the algorithm design aspects of three newly developed spin-wave architectures. The architectures are
capable of simultaneously transmitting multiple signals using different frequencies, and allow for concurrent read/write operations.
Using such features, we show a number of parallel and fault-tolerant routing schemes and introduce a set of generic parallel
processing techniques that can be used for design of fast algorithms on these spin-wave architectures. We also present a set
of application examples to illustrate the operation of the proposed generic parallel techniques.
相似文献
Mary M. Eshaghian-WilnerEmail: |
14.
Self-stabilizing algorithms for optimization problems can often be solved more easily using the distance-two model in which each vertex can instantly see the state information of all vertices up to distance two. This paper presents a new technique to emulate algorithms for the distance-two model on the distance-one model using the distributed scheduler with a slowdown factor of O(m) moves. Up until now the best transformer had a slowdown factor of O(n2m) moves. The technique is used to derive improved self-stabilizing algorithms for several graph domination problems. The paper also introduces a generalization of the distance-two model allowing a more space efficient transformer. 相似文献
15.
Evgenij E. Tyrtyshnikov 《Parallel Computing》1990,15(1-3):261-265
A method is proposed for converting an algorithm admitting no parallel treatment into a new algorithm, in essence, with much better parallel properties. The method is intended for tackling the so called T-algorithms, the term ensuing from first examples of such algorithms concerned in the context of Toeplitz-like matrices. Generalized T-algorithms are also considered. 相似文献
16.
In this paper we present optimal processor x time parallel algorithms for term matching and anti-unification of terms represented as trees. Term matching is the special case of unification in which one of the terms is restricted to contain no variables. It has wide applicability to logic programming, term rewriting systems and symbolic pattern matching. Anti-unification is the dual problem of unification in which one computes the most specific generalization of two terms. It has application to inductive inference and theorem proving. Our algorithms run in O(log2
N) time using N/log2
N processors on a shared-memory model of computation that prohibits simultaneous reads or writes (EREW PRAM). These algorithms are the first polylogarithmic-time EREW algorithms with a processor x time product of the same order as that of their sequential counterparts, thereby permitting optimal speed-ups using any number of processors up to N/log2
N. We also use the techniques developed in the paper to provide an N/log N-processor, O(log N)-time algorithm for a shared-memory model that allows both simultaneous reads and simultaneous writes (CRCW PRAM).Supported by NSF Grant IRI-88-09324 and NSF/DARPA Grant CCR-8908092. 相似文献
17.
A general parallel task scheduling problem is considered. A task can be processed in parallel on one of several alternative subsets of processors. The processing time of the task depends on the subset of processors assigned to the task. We first show the hardness of approximating the problem for both preemptive and nonpreemptive cases in the general setting. Next we focus on linear array network of m processors. We give an approximation algorithm of ratio O(logm) for nonpreemptive scheduling, and another algorithm of ratio 2 for preemptive scheduling. Finally, we give a nonpreemptive scheduling algorithm of ratio O(log2m) for m×m two-dimensional meshes. 相似文献
18.
19.
In the data-accumulating paradigm, inputs arrive continuously in real time, and the computation terminates when all the already received data are processed before another datum arrives. Previous research states that a constant upper bound on the running time of a successful algorithm within this paradigm exists only for particular forms of the data arrival law. This contradicts our recent conjecture that those problems that are solvable in real time are included in the class of logarithmic space-bounded computations. However, we prove that such an upper bound does exist in fact in both the parallel and sequential cases and for any polynomial arrival law, thus strengthening the mentioned conjecture. Then, we analyze an example of a non-continuous data arrival law. We find similar properties for the sorting algorithm under such a law, namely the existence of an upper bound on the running time, suggesting that such properties do not depend on the form of the arrival law. 相似文献
20.
E. Bertsch 《Information Processing Letters》2004,92(5):225-229
It is shown that suffix recognition for deterministic context-free languages can be done on a PRAM multi-processor within the upper complexity bounds of the graph reachability problem. 相似文献