共查询到20条相似文献,搜索用时 15 毫秒
1.
Hazem M. Bahig Sameh S. Daoud Mahmoud K. A. Khairat 《The Journal of supercomputing》2002,22(3):269-275
We consider the problem of sorting n integers when the elements are drawn from the restricted domain [1...n]. A new deterministic parallel algorithm for sorting n integers is obtained. Its running time is O(lognlog(n/logn)) using n/logn processors on EREW (exclusive read exclusive write) PRAM (parallel random access machine). Also, our algorithm was modified to become optimal when we use
processors. This algorithm belongs to class EP (Efficient, Polynomial fast). 相似文献
2.
Peter J. Varman Balakrishna R. Iyer Donald J. Haderle Stephen M. Dunn 《Parallel Computing》1990,15(1-3):165-177
An efficient parallel algorithm for merging two sorted lists is presented. The algorithm is based on a novel partitioning algorithm that splits the two lists among the processors, in a way that ensures load balance during the merge. The partitioning algorithm can itself be efficiently parallelized, allowing the solution to scale with increased numbers of processors. A shared memory multiprocessor is assumed. The time complexity for partitioning and merging is O(N/p + log N), where p is the number of processors and N is the total number of elements in the two lists. Implementation results on a twenty node Sequent Symmetry multiprocessor are also presented. 相似文献
3.
Given a set of n intervals representing an interval graph, the problem of finding a maximum matching between pairs of disjoint (nonintersecting)
intervals has been considered in the sequential model. In this paper we present parallel algorithms for computing maximum
cardinality matchings among pairs of disjoint intervals in interval graphs in the EREW PRAM and hypercube models. For the
general case of the problem, our algorithms compute a maximum matching in O( log
3
n) time using O(n/ log
2
n) processors on the EREW PRAM and using n processors on the hypercubes. For the case of proper interval graphs, our algorithm runs in O( log n ) time using O(n) processors if the input intervals are not given already sorted and using O(n/ log n ) processors otherwise, on the EREW PRAM. On n -processor hypercubes, our algorithm for the proper interval case takes O( log n log log n ) time for unsorted input and O( log n ) time for sorted input. Our parallel results also lead to optimal sequential algorithms for computing maximum matchings
among disjoint intervals. In addition, we present an improved parallel algorithm for maximum matching between overlapping
intervals in proper interval graphs.
Received November 20, 1995; revised September 3, 1998. 相似文献
4.
We investigate the complexity of merging sequences of small integers on the EREW PRAM. Our most surprising result is that two sorted sequences ofn bits each can be merged inO(log logn) time. More generally, we describe an algorithm to merge two sorted sequences ofn integers drawn from the set {0, ...,m?1} inO(log logn+log min{n, m}) time with an optimal time-processor product. No sublogarithmic-time merging algorithm for this model of computation was previously known. On the other hand, we show a lower bound of Ω(log min{n, m}) on the time needed to merge two sorted sequences of lengthn each with elements drawn from the set {0, ...,m?1}, implying that our merging algorithm is as fast as possible form=(logn)Ω(1). If we impose an additional stability condition requiring the elements of each input sequence to appear in the same order in the output sequence, the time complexity of the problem becomes Θ(logn), even form=2. Stable merging is thus harder than nonstable merging. 相似文献
5.
We present parallel algorithms to construct binary trees with almost optimal weighted path length. Specifically, assuming that weights are normalized (to sum up to one) and error refers to the (absolute) difference between the weighted path length of a given tree and the optimal tree with the same weights, we present anO (logn)-time andn(log lognl logn)-EREW-processor algorithm which constructs a tree with error less than 0.18, andO (k logn log*
n)-time andn-CREW-processor algorithm which produces a tree with error at most l/n
k
, and anO (k
2 logn)-time andn
2-CREW-processor algorithm which produces a tree with error at most l/n
k
. As well, we describe two sequential algorithms, anO(kn)-time algorithm which produces a tree with error at most l/n
k
, and anO(kn)-time algorithm which produces a tree with error at most
. The last two algorithms use different computation models.The first author's research was supported in part by NSERC Research Grant 3053. A part of this work was done while the second author was at the University of British Columbia. 相似文献
6.
We present a randomized EREW PRAM algorithm to find a minimum spanning forest in a weighted undirected graph. On an n -vertex graph the algorithm runs in o(( log n)
1+
ɛ
) expected time for any ɛ >0 and performs linear expected work. This is the first linear-work, polylog-time algorithm on the EREW PRAM for this problem.
This also gives parallel algorithms that perform expected linear work on two general-purpose models of parallel computation—the
QSM and the BSP. 相似文献
7.
Abstract. Planar st -graphs find applications in a number of areas. In this paper we present efficient parallel algorithms for solving
several fundamental problems on planar st -graphs. The problems we consider include all-pairs shortest paths in weighted
planar st -graphs, single-source shortest paths in weighted planar layered digraphs (which can be reduced to single-source
shortest paths in certain special planar st -graphs), and depth-first search in planar st -graphs. Our parallel shortest
path techniques exploit the specific geometric and graphic structures of planar st -graphs, and involve schemes for partitioning
planar st -graphs into subgraphs in a way that ensures that the resulting path length matrices have a monotonicity property
[1], [2]. The parallel algorithms we obtain are a considerable improvement over the previously best known solutions (when
they are applied to these st -graph problems), and are in fact relatively simple. The parallel computational models we use are the CREW PRAM and EREW
PRAM. 相似文献
8.
《International Journal of Parallel, Emergent and Distributed Systems》2012,27(1-2):57-75
Abstract Parallel Givens sequences for solving the General Linear Model (GLM) are developed and analyzed. The block updating GLM estimation problem is also considered. The solution of the GLM employs as a main computational device the Generalized QR Decomposition, where one of the two matrices is initially upper triangular. The proposed Givens sequences efficiently exploit the initial triangular structure of the matrix and special properties of the solution method. The complexity analysis of the sequences is based on a Exclusive Read-Exclusive Write (EREW) Parallel Random Access Machine (PRAM) model with limited parallelism. Furthermore, the number of operations performed by a Givens rotation is determined by the size of the vectors used in the rotation. With these assumptions one conclusion drawn is that a sequence which applies the smallest number of compound disjoint Givens rotations to solve the GLM estimation problem does not necessarily have the lowest computational complexity. The various Givens sequences and their computational complexity analyses will be useful when addressing the solution of other similar factorization problems. 相似文献
9.
We consider the following partition problem: Given a set S of n elements that is organized as k sorted subsets of size n/k each and given a parameter h with 1/k ≤ h ≤ n/k , partition S into g = O(n/(hk)) subsets D
1
, D
2
, . . . , D
g
of size Θ(hk) each, such that, for any two indices i and j with 1 ≤ i < j ≤ g , no element in D
i
is bigger than any element in D
j
. Note that with various combinations of the values of parameters h and k , several fundamental problems, such as merging, sorting, and finding an approximate median, can be formulated as or be reduced
to this partition problem. The partition problem also finds many applications in solving problems of parallel computing and
computational geometry. In this paper we present efficient parallel algorithms for solving the partition problem and a number
of its applications. Our parallel partition algorithm runs in O( log n) time using processors in the EREW PRAM model. The complexity bounds of our parallel partition algorithm on the respective special cases
match those of the optimal EREW PRAM algorithms for merging, sorting, and finding an approximate median. Using our parallel
partition algorithm, we are also able to obtain better complexity bounds (even possibly on a weaker parallel model) than the
previously best known parallel algorithms for several important problems, including parallel multiselection, parallel multiranking,
and parallel sorting of k sorted subsets.
Received May 5, 1996; revised July 30, 1998. 相似文献
10.
A planar monotone circuit (PMC) is a Boolean circuit that can be embedded in the plane and that contains only AND and OR
gates. A layered PMC is a PMC in which all input nodes are in the external face, and the gates can be assigned to layers in
such a way that every wire goes between gates in successive layers. Goldschlager, Cook and Dymond, and others have developed
NC
2
algorithms to evaluate a layered PMC when the output node is in the same face as the input nodes. These algorithms require
a large number of processors (Ω(n
6
), where n is the size of the input circuit).
In this paper we give an efficient parallel algorithm that evaluates a layered PMC of size n in time using only a linear number of processors on an EREW PRAM. Our parallel algorithm is the best possible to within a polylog
factor, and is a substantial improvement over the earlier algorithms for the problem.
Received April 18, 1994; revised April 7, 1995. 相似文献
11.
The vertex updating problem for a minimum spanning tree (MST) is defined as follows: Given a graphG=(V, E
G) and an MSTT forG, find a new MST forG to which a new vertexz has been added along with weighted edges that connectz with the vertices ofG. We present a set of rules that produce simple optimal parallel algorithms that run inO(lgn) time usingn/lgn EREW PRAM processors, wheren=¦V¦. These algorithms employ any valid tree-contraction schedule that can be produced within the stated resource bounds. These rules can also be used to derive simple linear-time sequential algorithms for the same problem. The previously best-known parallel result was a rather complicated algorithm that usedn processors in the more powerful CREW PRAM model. Furthermore, we show how our solution can be used to solve the multiple vertex updating problem: Update a given MST whenk new vertices are introduced simultaneously. This problem is solved inO(lgk·lgn) parallel time using (k·n)/(lgk·lgn) EREW PRAM processors. This is optimal for graphs having (kn) edges.Part of this work was done while P. Metaxas was with the Department of Mathematics and Computer Science, Dartmouth College. 相似文献
12.
In this paper we give efficient parallel algorithms for solving a number of visibility and shortest-path problems for simple polygons. Our algorithms all run inO(logn) time and are based on the use of a new data structure for implicitly representing all shortest paths in a simple polygonP, which we call thestratified decomposition tree. We use this approach to derive efficient parallel methods for computing the visibility ofP from an edge, constructing the visibility graph of the vertices ofP (using an output-sensitive number of processors), constructing the shortest-path tree from a vertex ofP, and determining all-farthest neighbors for the vertices inP. The computational model we use is the CREW PRAM.This research was announced in preliminary form in theProceedings of the 6th ACM Symposium on Computational Geometry, 1990, pp. 73–82. The research of Michael T. Goodrich was supported by the National Science Foundation under Grants CCR-8810568 and CCR-9003299, and by the NSF and DARPA under Grant CCR-8908092. 相似文献
13.
14.
Binary images can be compressed efficiently using context‐based statistical modeling and arithmetic coding. However, this approach is fully sequential and therefore additional computing power from parallel computers cannot be utilized. We attack this problem and show how to implement the context‐based compression in parallel. Our approach is to segment the image into non‐overlapping blocks, which are compressed independently by the processors. We give two alternative solutions about how to construct, distribute and utilize the model in parallel, and study the effect on the compression performance and execution time. We show by experiments that the proposed approach achieves speedup that is proportional to the number of processors. The work efficiency exceeds 50% with any reasonable number of processors. Copyright © 2002 John Wiley & Sons, Ltd. 相似文献
15.
We consider the problem of generating random permutations with uniform distribution. That is, we require that for an arbitrary
permutation π of n elements, with probability 1/n! the machine halts with the i th output cell containing π(i) , for 1 ≤ i ≤ n . We study this problem on two models of parallel computations: the CREW PRAM and the EREW PRAM.
The main result of the paper is an algorithm for generating random permutations that runs in O(log log n) time and uses O(n
1+o(1)
) processors on the CREW PRAM. This is the first o(log n) -time CREW PRAM algorithm for this problem.
On the EREW PRAM we present a simple algorithm that generates a random permutation in time O(log n) using n processors and O(n) space. This algorithm outperforms each of the previously known algorithms for the exclusive write PRAMs.
The common and novel feature of both our algorithms is first to design a suitable random switching network generating a permutation
and then to simulate this network on the PRAM model in a fast way.
Received November 1996; revised March 1997. 相似文献
16.
Jing-Chao Chen 《Information Processing Letters》2006,98(1):34-40
The early algorithms for in-place merging were mainly focused on the time complexity, whereas their structures themselves were ignored. Most of them therefore are elusive and of only theoretical significance. For this reason, the paper simplifies the unstable in-place merge by Geffert et al. [V. Geffert, J. Katajainen, T. Pasanen, Asymptotically efficient in-place merging, Theoret. Comput. Sci. 237 (2000) 159-181]. The simplified algorithm is simple yet practical, and has a small time complexity. 相似文献
17.
18.
The problem of finding a sublogarithmic time optimal parallel algorithm for 3 -colouring rooted forests has been open for long. We settle this problem by obtaining an O(( log log n) log
*
( log
*
n)) time optimal parallel algorithm on a TOLERANT Concurrent Read Concurrent Write (CRCW) Parallel Random Access Machine (PRAM).
Furthermore, we show that if f(n) is the running time of the best known algorithm for 3 -colouring a rooted forest on a COMMON or TOLERANT CRCW PRAM, a fractional independent set of the rooted forest can be found
in O(f(n)) time with the same number of processors, on the same model.
Using these results, it is shown that decomposable top-down algebraic computation and, hence, depth computation (ranking),
2 -colouring and prefix summation on rooted forests can be done in O( log n) optimal time on a TOLERANT CRCW PRAM.
These algorithms have been obtained by proving a result of independent interest, one concerning the self-simulation property
of TOLERANT: an N -processor TOLERANT CRCW PRAM that uses an address space of size O(N) only, can be simulated on an n -processor TOLERANT PRAM in O(N/n) time, with no asymptotic increase in space or cost, when n=O(N/ log log N) .
Received May 20, 1997; revised June 15, 1998. 相似文献
19.
20.
Parallel machine flexible resource scheduling (PMFRS) problems consider an additional flexible resource (e.g. operators), which can be freely allocated to any jobs and/or any machines and may speed-up the process in proportion to its amount. If job–machine assignment is unspecified, the problem is referred to as unspecified PMFRS (UPMFRS). This paper reviews the mathematical models of both PMFRS and UPMFRS problems in the literature and not only gives some extensions to the model of dynamic PMFRS problem but also presents integer programming (IP) models for static and dynamic UPMFRS problems with the objective of minimizing makespan. To solve large-sized dynamic PMFRS and UPMFRS problems, a relaxed IP based constraint programming (CP) approach is also proposed. All IP models and the proposed IP/CP approach are tested with an extensive computational study. The results of the computational experiments are discussed with respect to the major parameters of the problem and conclusions are drawn. 相似文献