
1.

Parallel integer sorting and simulation amongst CRCW models (cited by: 1; self-citations: 0; citations by others: 1)

Sanjeev Saxena 《Acta Informatica》1996,33(7):607-619

In this paper a general technique for reducing processors in simulation without any increase in time is described. This results in an O(√log n) time algorithm for simulating one step of PRIORITY on TOLERANT with a processor-time product of O(n log log n), the same as that for simulating PRIORITY on ARBITRARY. This is used to obtain an O(log n/log log n + √log n (log log m − log log n)) time algorithm for sorting n integers from the set {0,…, m−1}, m ≥ n, with a processor-time product of O(n log log m log log n) on a TOLERANT CRCW PRAM. New upper and lower bounds for the ordered chaining problem on an allocated COMMON CRCW model are also obtained. The algorithm for ordered chaining takes O(log n/log log n) time on an allocated PRAM of size n. It is shown that this result is best possible (up to a constant multiplicative factor) by obtaining a lower bound of Ω(r log n/(log r + log log n)) for finding the first (leftmost) live processor on an allocated-COMMON PRAM of size n of r-slow virtual processors (one processor simulates r processors of the allocated PRAM). As a result, for the ordered chaining problem, the "processor-time product" has to be at least Ω(n log n/log log n) for any poly-logarithmic time algorithm. The algorithm for the ordered-chaining problem results in an O(log N/log log N) time algorithm for (stable) sorting of n integers from the set {0,…, m−1} with n processors on a COMMON CRCW PRAM; here N = max(n, m). In particular, if m = n^O(1), then sorting takes Θ(log n/log log n) time on both TOLERANT and COMMON CRCW PRAMs. The processor-time product for TOLERANT is O(n(log log n)²). The algorithm for COMMON uses n processors. Received August 13, 1992/June 30, 1995

2.

Expected parallel time and sequential space complexity of graph and digraph problems

John Reif, Paul Spirakis 《Algorithmica》1992,7(1-6):597-630

This paper determines upper bounds on the expected time complexity for a variety of parallel algorithms for undirected and directed random graph problems. For connectivity, biconnectivity, transitive closure, minimum spanning trees, and all pairs minimum cost paths, we prove the expected time to be O(log log n) for the CRCW PRAM (this parallel RAM machine allows resolution of write conflicts) and O(log n · log log n) for the CREW PRAM (which allows simultaneous reads but not simultaneous writes). We also show that the problem of graph isomorphism has expected parallel time O(log log n) for the CRCW PRAM and O(log n) for the CREW PRAM. Most of these results follow because of upper bounds on the mean depth of a graph, derived in this paper, for more general graphs than was known before. For undirected connectivity especially, we present a new probabilistic algorithm which runs on a randomized input and has an expected running time of O(log log n) on the CRCW PRAM, with O(n) expected number of processors only. Our results also improve known upper bounds on the expected space required for sequential graph algorithms. For example, we show that the problems of finding connected components, transitive closure, minimum spanning trees, and minimum cost paths have expected sequential space O(log n · log log n) on a deterministic Turing Machine. We use a simulation of the CRCW PRAM to get these expected sequential space bounds.

3.

Simulations among concurrent-write PRAMs (cited by: 1; self-citations: 0; citations by others: 1)

Faith E. Fich, Prabhakar Ragde, Avi Wigderson 《Algorithmica》1988,3(1):43-51

This paper is concerned with the relative power of the two most popular concurrent-write models of parallel computation, the PRIORITY PRAM [G], and the COMMON PRAM [K]. Improving the trivial and seemingly optimal O(log n) simulation, we show that one step of a PRIORITY machine can be simulated by O(log n/(log log n)) steps of a COMMON machine with the same number of processors (and more memory). We further prove that this is optimal, if processor communication is restricted in a natural way. Support for this research was provided by NSF Grants MCS-8402676 and MCS-8120790, DARPA Contract No. N00039-84-C-0089, an IBM Faculty Development Award, and an NSERC postgraduate scholarship.

4.

Optimal Sublogarithmic Time Parallel Algorithms on Rooted Forests

G. Sajith, S. Saxena 《Algorithmica》2000,27(2):187-197

The problem of finding a sublogarithmic time optimal parallel algorithm for 3-colouring rooted forests has long been open. We settle this problem by obtaining an O((log log n) log*(log* n)) time optimal parallel algorithm on a TOLERANT Concurrent Read Concurrent Write (CRCW) Parallel Random Access Machine (PRAM). Furthermore, we show that if f(n) is the running time of the best known algorithm for 3-colouring a rooted forest on a COMMON or TOLERANT CRCW PRAM, a fractional independent set of the rooted forest can be found in O(f(n)) time with the same number of processors, on the same model. Using these results, it is shown that decomposable top-down algebraic computation and, hence, depth computation (ranking), 2-colouring and prefix summation on rooted forests can be done in O(log n) optimal time on a TOLERANT CRCW PRAM. These algorithms have been obtained by proving a result of independent interest, one concerning the self-simulation property of TOLERANT: an N-processor TOLERANT CRCW PRAM that uses an address space of size O(N) only can be simulated on an n-processor TOLERANT PRAM in O(N/n) time, with no asymptotic increase in space or cost, when n = O(N/log log N). Received May 20, 1997; revised June 15, 1998.

5.

Simulations among concurrent-write PRAMs

Faith E. Fich, Prabhakar Ragde, Avi Wigderson 《Algorithmica》1988,3(1-4):43-51

This paper is concerned with the relative power of the two most popular concurrent-write models of parallel computation, the PRIORITY PRAM [G], and the COMMON PRAM [K]. Improving the trivial and seemingly optimal O(log n) simulation, we show that one step of a PRIORITY machine can be simulated by O(log n/(log log n)) steps of a COMMON machine with the same number of processors (and more memory). We further prove that this is optimal, if processor communication is restricted in a natural way.

6.

On Parallel Selection and Searching in Partial Orders: Sorted Matrices

R. Sarnath, Xin He 《Journal of Parallel and Distributed Computing》1997,40(2):1051

Parallel algorithms for the problems of selection and searching on sorted matrices are formulated. The selection algorithm takes O(log n log log n log* n) time with O(n/log n log* n) processors on an EREW PRAM. This algorithm can be generalized to solve the selection problem on a set of sorted matrices. The searching algorithm takes O(log log n) time with O(n/log log n) processors on a Common CRCW PRAM, which is optimal. We show that no algorithm using at most n log^c n processors, c ≥ 1, can solve the matrix search problem in time faster than Ω(log log n) and that Ω(log n) steps are needed to solve this problem on any model that does not allow concurrent writes.

7.

Expected parallel time and sequential space complexity of graph and digraph problems

John Reif, Paul Spirakis 《Algorithmica》1992,7(1):597-630

This paper determines upper bounds on the expected time complexity for a variety of parallel algorithms for undirected and directed random graph problems. For connectivity, biconnectivity, transitive closure, minimum spanning trees, and all pairs minimum cost paths, we prove the expected time to be O(log log n) for the CRCW PRAM (this parallel RAM machine allows resolution of write conflicts) and O(log n · log log n) for the CREW PRAM (which allows simultaneous reads but not simultaneous writes). We also show that the problem of graph isomorphism has expected parallel time O(log log n) for the CRCW PRAM and O(log n) for the CREW PRAM. Most of these results follow because of upper bounds on the mean depth of a graph, derived in this paper, for more general graphs than was known before. For undirected connectivity especially, we present a new probabilistic algorithm which runs on a randomized input and has an expected running time of O(log log n) on the CRCW PRAM, with O(n) expected number of processors only. Our results also improve known upper bounds on the expected space required for sequential graph algorithms. For example, we show that the problems of finding connected components, transitive closure, minimum spanning trees, and minimum cost paths have expected sequential space O(log n · log log n) on a deterministic Turing Machine. We use a simulation of the CRCW PRAM to get these expected sequential space bounds. This research was supported by National Science Foundation Grant DCR-85-03251 and Office of Naval Research Contract N00014-80-C-0647. This research was partially supported by the National Science Foundation Grants MCS-83-00630, DCR-8503497, by the Greek Ministry of Research and Technology, and by the ESPRIT Basic Research Actions Project ALCOM.

8.

Efficient parallel algorithms for r-dominating set and p-center problems on trees

Xin He, Yaacov Yesha 《Algorithmica》1990,5(1-4):129-145

We develop efficient parallel algorithms for the r-dominating set and the p-center problems on trees. On a concurrent-read exclusive-write PRAM, our algorithm for the r-dominating set problem runs in O(log n log log n) time with n processors. The algorithm for the p-center problem runs in O(log² n log log n) time with n processors.

9.

Finding least-weight subsequences with fewer processors

Tak Wah Lam, Kwong-fai Chan 《Algorithmica》1993,9(6):615-628

By restricting weight functions to satisfy the quadrangle inequality or the inverse quadrangle inequality, significant progress has been made in developing efficient sequential algorithms for the least-weight subsequence problem [10], [9], [12], [16]. However, not much is known about improving the naive parallel algorithm for the problem, which is fast but demands too many processors (i.e., it takes O(log² n) time on a CREW PRAM with n³/log n processors). In this paper we show that if the weight function satisfies the inverse quadrangle inequality, the problem can be solved on a CREW PRAM in O(log² n log log n) time with n/log log n processors, or in O(log² n) time with n log n processors. Notice that the processor-time complexity of our algorithm is much closer to the almost linear-time complexity of the best-known sequential algorithm [12].

10.

Efficient parallel algorithms for r-dominating set and p-center problems on trees

Xin He, Yaacov Yesha 《Algorithmica》1990,5(1):129-145

We develop efficient parallel algorithms for the r-dominating set and the p-center problems on trees. On a concurrent-read exclusive-write PRAM, our algorithm for the r-dominating set problem runs in O(log n log log n) time with n processors. The algorithm for the p-center problem runs in O(log² n log log n) time with n processors. Xin He was supported in part by an Ohio State University Presidential Fellowship, and by the Office of Research and Graduate Studies of Ohio State University. Yaacov Yesha was supported in part by the National Science Foundation under Grant No. DCR-8606366.

11.

Efficient Graph-Theoretic Algorithms on a Linear Array with a Reconfigurable Pipelined Bus System (cited by: 1; self-citations: 0; citations by others: 1)

Amitava Datta 《The Journal of supercomputing》2002,23(2):193-211

We present efficient algorithms for solving several fundamental graph-theoretic problems on a Linear Array with a Reconfigurable Pipelined Bus System (LARPBS), one of the recently proposed models of computation based on optical buses. Our algorithms include finding connected components, minimum spanning forest, biconnected components, bridges and articulation points for an undirected graph. We compute the connected components and minimum spanning forest of a graph in O(log n) time using O(m+n) processors, where m and n are the number of edges and vertices in the graph and m = O(n²) for a dense graph. Both the processor and time complexities of these two algorithms match the complexities of algorithms on the Arbitrary and Priority CRCW PRAM models, which are two of the strongest PRAM models. The algorithms for these two problems published by Li et al. [7] have been considered to be the most efficient on the LARPBS model till now. Their algorithms [7] for these two problems require O(log n) time and O(n³/log n) processors. Hence, our algorithms have the same time complexity but require fewer processors. Our algorithms for computing biconnected components, bridges and articulation points of a graph run in O(log n) time on an LARPBS with O(n²) processors. No previous algorithm was known for these latter problems on the LARPBS.

12.

Parallel Algorithms for the Edge-Coloring and Edge-Coloring Update Problems

《Journal of Parallel and Distributed Computing》1996,32(1):66-73

Let G(V, E) be a simple undirected graph with a maximum vertex degree Δ(G) (or Δ for short), |V| = n and |E| = m. An edge-coloring of G is an assignment of a color to each edge in G such that all edges sharing a common vertex have different colors. The minimum number of colors needed is denoted by χ′(G) (called the chromatic index). For a simple graph G, it is known that Δ ≤ χ′(G) ≤ Δ + 1. This paper studies two edge-coloring problems. The first problem is to perform edge-coloring for an existing edge-colored graph G with Δ + 1 colors stemming from the addition of a new vertex into G. The proposed parallel algorithm for this problem runs in O(Δ^(3/2) log³ Δ + Δ log n) time using O(max{nΔ, Δ³}) processors. The second problem is to color the edges of a given uncolored graph G with Δ + 1 colors. For this problem, our first parallel algorithm requires O(Δ^5.5 log³ Δ log n + Δ⁵ log⁴ n) time and O(max{n²Δ, nΔ³}) processors, which is a slight improvement on the algorithm by H. J. Karloff and D. B. Shmoys [J. Algorithms 8 (1987), 39–52]. Their algorithm costs O(Δ⁶ log⁴ n) time and O(n²Δ) processors if we use the fastest known algorithm for finding maximal independent sets by M. Goldberg and T. Spencer [SIAM J. Discrete Math. 2 (1989), 322–328]. Our second algorithm requires O(Δ^4.5 log³ Δ log n + Δ⁴ log⁴ n) time and O(max{n², nΔ³}) processors. Finally, we present our third algorithm by incorporating the second algorithm as a subroutine. This algorithm requires O(Δ^3.5 log³ Δ log n + Δ³ log⁴ n) time and O(max{n² log Δ, nΔ³}) processors, which improves, by an O(Δ^2.5) factor in time, on Karloff and Shmoys' algorithm. All of these algorithms run in the COMMON CRCW PRAM model.

13.

More Efficient Topological Sort Using Reconfigurable Optical Buses

Li Jie, Pan Yi, Shen Hong 《The Journal of supercomputing》2003,24(3):251-258

Topological sort of an acyclic graph has many applications such as job scheduling and network analysis. Due to its importance, it has been tackled on many models. Dekel et al. [3] proposed an algorithm for solving the problem in O(log² N) time on the hypercube or shuffle-exchange networks with O(N³) processors. Chaudhuri [2] gave an O(log N) algorithm using O(N³) processors on a CRCW PRAM model. On the LARPBS (Linear Arrays with a Reconfigurable Pipelined Bus System) model, Li et al. [5] showed that the problem for a weighted directed graph with N vertices can be solved in O(log N) time by using N³ processors. In this paper, a more efficient topological sort algorithm is proposed on the same LARPBS model. We show that the problem can be solved in O(log N) time by using N³/log N processors. We show that the algorithm has better time and processor complexities than the best algorithm on the hypercube, and has the same time complexity but better processor complexity than the best algorithm on the CRCW PRAM model.
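
For orientation, the problem itself can be stated with a short sequential sketch. This is Kahn's algorithm, a standard textbook method, and not the LARPBS algorithm of the paper; the function name `topological_sort` and the edge-list input format are assumptions made for this illustration.

```python
from collections import deque


def topological_sort(num_vertices, edges):
    """Sequential Kahn's algorithm (illustrative baseline, not the paper's method).

    Vertices are 0..num_vertices-1; edges is a list of (u, v) pairs meaning u -> v.
    Returns one valid topological order or raises ValueError on a cycle.
    """
    successors = [[] for _ in range(num_vertices)]
    in_degree = [0] * num_vertices
    for u, v in edges:
        successors[u].append(v)
        in_degree[v] += 1

    # Start from every vertex that has no incoming edge.
    ready = deque(v for v in range(num_vertices) if in_degree[v] == 0)
    order = []
    while ready:
        u = ready.popleft()
        order.append(u)
        for v in successors[u]:
            in_degree[v] -= 1
            if in_degree[v] == 0:
                ready.append(v)

    if len(order) != num_vertices:
        raise ValueError("graph contains a cycle")
    return order
```

This runs in O(N + E) sequential time for N vertices and E edges, which is the baseline against which the processor-time products quoted above can be read.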

14.

An improved parallel algorithm for integer GCD

Benny Chor, Oded Goldreich 《Algorithmica》1990,5(1-4):1-10

We present a simple parallel algorithm for computing the greatest common divisor (gcd) of two n-bit integers in the Common version of the CRCW model of computation. The run-time of the algorithm in terms of bit operations is O(n/log n), using n^(1+ε) processors, where ε is any positive constant. This improves on the algorithm of Kannan, Miller, and Rudolph, the only sublinear algorithm known previously, both in run time and in number of processors; they require O(n log log n/log n) time and n² log² n processors in the same CRCW model. We give an alternative implementation of our algorithm in the CREW model. Its run-time is O(n log log n/log n), using n^(1+ε) processors. Both implementations can be modified to yield the extended gcd, within the same complexity bounds.

15.

Optimal Computing the Chessboard Distance Transform on Parallel Processing Systems

Yu-Hua Lee, Shi-Jinn Horng 《Computer Vision and Image Understanding》1999,73(3):272

The distance transform (DT) is an image computation tool which can be used to extract the information about the shape and the position of the foreground pixels relative to each other. It converts a binary image into a grey-level image, where each pixel has a value corresponding to the distance to the nearest foreground pixel. The time complexity for computing the distance transform depends entirely on the distance metric used. In particular, the more exact the distance transform is, the worse the execution time will be. Nowadays, quite often thousands of images are processed in a limited time. It seems quite impossible for a sequential computer to do such a computation for the distance transform in real time. In order to provide efficient distance transform computation, it is highly desirable to develop a parallel algorithm for this operation. In this paper, based on the diagonal propagation approach, we first provide an O(N²) time sequential algorithm to compute the chessboard distance transform (CDT) of an N×N image, which is a DT using the chessboard distance metric. Based on the proposed sequential algorithm, the CDT of a 2D binary image array of size N×N can be computed in O(log N) time on the EREW PRAM model using O(N²/log N) processors, O(log log N) time on the CRCW PRAM model using O(N²/log log N) processors, and O(log N) time on the hypercube computer using O(N²/log N) processors. Following the mapping as proposed by Lee and Horng, the algorithm for the medial axis transform is also efficiently derived. The medial axis transform of a 2D binary image array of size N×N can be computed in O(log N) time on the EREW PRAM model using O(N²/log N) processors, O(log log N) time on the CRCW PRAM model using O(N²/log log N) processors, and O(log N) time on the hypercube computer using O(N²/log N) processors. The proposed parallel algorithms are composed of a set of prefix operations. In each prefix operation phase, only increase (add-one) operation and minimum operation are employed. So, the algorithms are especially efficient in practical applications.
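
To make the definition concrete, here is a minimal sequential sketch of a chessboard distance transform. It uses the classical two-pass raster scan rather than the diagonal propagation approach named in the abstract, so it should be read as an illustration of the quantity being computed, not of the paper's algorithm. The function name and the list-of-lists image format are assumptions. Like the operations mentioned above, it needs only add-one and minimum.

```python
def chessboard_distance_transform(image):
    """Two-pass raster scan (illustrative; not the paper's diagonal propagation).

    image[r][c] is truthy for foreground pixels. Each output cell holds the
    chessboard distance max(|dr|, |dc|) to the nearest foreground pixel.
    """
    rows, cols = len(image), len(image[0])
    far = rows + cols  # exceeds every chessboard distance inside the grid
    dist = [[0 if image[r][c] else far for c in range(cols)] for r in range(rows)]

    def relax(r, c, offsets):
        # Take the minimum over already-visited neighbours, each plus one.
        for dr, dc in offsets:
            pr, pc = r + dr, c + dc
            if 0 <= pr < rows and 0 <= pc < cols:
                dist[r][c] = min(dist[r][c], dist[pr][pc] + 1)

    # Forward pass: top-left to bottom-right, using upper and left neighbours.
    for r in range(rows):
        for c in range(cols):
            relax(r, c, ((-1, -1), (-1, 0), (-1, 1), (0, -1)))

    # Backward pass: bottom-right to top-left, using lower and right neighbours.
    for r in reversed(range(rows)):
        for c in reversed(range(cols)):
            relax(r, c, ((1, 1), (1, 0), (1, -1), (0, 1)))

    return dist
```

Each pixel is visited twice with a constant number of neighbour checks, giving O(N²) time on an N×N image. If the image has no foreground pixel, every cell keeps the sentinel value `rows + cols`.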

16.

Reconstructing a Binary Tree from Its Traversals in Doubly Logarithmic CREW Time

《Journal of Parallel and Distributed Computing》1995,27(1):100-105

We consider the following problem. For a binary tree T = (V, E) where V = {1, 2, ..., n}, given its inorder traversal and either its preorder or its postorder traversal, reconstruct the binary tree. We present a new parallel algorithm for this problem. Our algorithm requires O(n) space. The main idea of our algorithm is to reduce the reconstruction process to merging two sorted sequences. With the best parallel merging algorithms, our algorithm can be implemented in O(log log n) time using O(n/log log n) processors on the CREW PRAM (or in O(log n) time using O(n/log n) processors on the EREW PRAM). Our result provides one more example of a fundamental problem which can be solved by optimal parallel algorithms in O(log log n) time on the CREW PRAM.
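
The reconstruction problem can be illustrated with a simple sequential sketch for the inorder-plus-preorder case. This is the standard recursive method, not the merge-based parallel reduction described in the abstract; the function name `reconstruct` and the returned dictionary layout are assumptions chosen for the example.

```python
def reconstruct(preorder, inorder):
    """Rebuild a binary tree from its preorder and inorder traversals.

    Sequential illustration only (not the paper's merging-based algorithm).
    Returns (root, children), where children[v] = (left, right) and a
    missing child is None. Node labels must be distinct.
    """
    position = {label: i for i, label in enumerate(inorder)}
    children = {}
    next_root = 0  # index into preorder of the next subtree root

    def build(lo, hi):
        # Build the subtree whose nodes occupy inorder[lo..hi].
        nonlocal next_root
        if lo > hi:
            return None
        root = preorder[next_root]
        next_root += 1
        mid = position[root]
        left = build(lo, mid - 1)    # everything left of the root in inorder
        right = build(mid + 1, hi)   # everything right of the root in inorder
        children[root] = (left, right)
        return root

    return build(0, len(inorder) - 1), children
```

The sketch runs in O(n) time and O(n) space, but its recursion depth equals the tree height, so a very deep tree would need an iterative variant or a raised recursion limit.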

17.

Optimal parallel detection of squares in strings

Alberto Apostolico 《Algorithmica》1992,8(1-6):285-319

A string w is primitive if it is not a power of another string (i.e., writing w = v^k implies k = 1). Conversely, w is a square if w = vv, with v a primitive string. A string x is square-free if it has no nonempty substring of the form ww. It is shown that the square-freedom of a string of n symbols over an arbitrary alphabet can be tested by a CRCW PRAM with n processors in O(log n) time and linear auxiliary space. If the cardinality of the input alphabet is bounded by a constant independent of the input size, then the number of processors can be reduced to n/log n without affecting the time complexity of this strategy. The fastest sequential algorithms solve this problem in O(n log n) or O(n) time, depending on whether the cardinality of the input alphabet is unbounded or bounded, and either performance is known to be optimal within its class. More elaborate constructions lead to a CRCW PRAM algorithm for detecting, within the same n-processor bounds, all positioned squares in x in time O(log n) and using linear auxiliary space. The fastest sequential algorithms solve this problem in O(n log n) time, and such a performance is known to be optimal.
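
The two definitions can be checked directly with a small sketch. Both functions are naive illustrations of the definitions, with names chosen for this example; in particular `is_square_free` is a brute-force test and is far slower than the O(n log n) sequential algorithms mentioned in the abstract.

```python
def is_primitive(w):
    """True if the nonempty string w is not a power v^k with k >= 2.

    Uses the standard fact that w is a proper power exactly when w occurs
    in w + w at an offset strictly between 0 and len(w).
    """
    return len(w) > 0 and (w + w).find(w, 1) == len(w)


def is_square_free(x):
    """Brute-force test that x has no nonempty substring of the form ww."""
    n = len(x)
    for length in range(1, n // 2 + 1):
        for start in range(n - 2 * length + 1):
            first = x[start:start + length]
            second = x[start + length:start + 2 * length]
            if first == second:
                return False
    return True
```

For example, `is_primitive("abab")` is false because "abab" is "ab" squared, and `is_square_free("abcbc")` is false because it contains "bcbc".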

18.

An Optimal Shortest Path Parallel Algorithm for Permutation Graphs

《Journal of Parallel and Distributed Computing》1995,24(1):94-99

We present an optimal parallel algorithm for the single-source shortest path problem for permutation graphs. The algorithm runs in O(log n) time using O(n/log n) processors on an EREW PRAM. As an application, we show that a minimum connected dominating set in a permutation graph can be found in O(log n) time using O(n/log n) processors.

19.

Efficient parallel algorithms for computing all pair shortest paths in directed graphs

Yijie Han, V. Y. Pan, J. H. Reif 《Algorithmica》1997,17(4):399-415

We present parallel algorithms for computing all pair shortest paths in directed graphs. Our algorithm has time complexity O(f(n)/p + I(n) log n) on the PRAM using p processors, where I(n) is log n on the EREW PRAM and log log n on the CRCW PRAM, and f(n) is o(n³). On the randomized CRCW PRAM we are able to achieve time complexity O(n³/p + log n) using p processors. A preliminary version of this paper was presented at the 4th Annual ACM Symposium on Parallel Algorithms and Architectures, June 1992. Supported by NSF Grant CCR 90-20690 and PSC CUNY Awards #661340 and #662478.

20.

A Parallel Priority Queue with Constant Time Operations

《Journal of Parallel and Distributed Computing》1998,49(1):4-21

We present a parallel priority queue that supports the following operations in constant time: parallel insertion of a sequence of elements ordered according to key, parallel decrease key for a sequence of elements ordered according to key, deletion of the minimum key element, and deletion of an arbitrary element. Our data structure is the first to support multi-insertion and multi-decrease key in constant time. The priority queue can be implemented on the EREW PRAM and can perform any sequence of n operations in O(n) time and O(m log n) work, m being the total number of keys inserted and/or updated. A main application is a parallel implementation of Dijkstra's algorithm for the single-source shortest path problem, which runs in O(n) time and O(m log n) work on a CREW PRAM on graphs with n vertices and m edges. This is a logarithmic factor improvement in the running time compared with previous approaches.
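
For reference, the sequential form of Dijkstra's algorithm that the application parallelizes can be sketched as follows. This version uses Python's `heapq` binary heap with lazy deletion in place of a decrease-key operation; it is not the parallel priority queue of the paper, and the adjacency-dictionary input format is an assumption for this example.

```python
import heapq


def dijkstra(graph, source):
    """Sequential Dijkstra with a binary heap (illustrative baseline only).

    graph maps a vertex to a list of (neighbour, weight) pairs with
    non-negative weights. Returns shortest distances for reachable vertices.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale entry: u was already settled with a smaller key
        for v, weight in graph.get(u, ()):
            candidate = d + weight
            if candidate < dist.get(v, float("inf")):
                # Pushing a new entry stands in for decrease-key.
                dist[v] = candidate
                heapq.heappush(heap, (candidate, v))
    return dist
```

With a binary heap this takes O(m log n) sequential time on a graph with n vertices and m edges, matching the work bound quoted above.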