期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

<Emphasis Type="Italic">O</Emphasis>(log*<Emphasis Type="Italic">n</Emphasis>) algorithms on a Sum-CRCW PRAM

S. C. Eisenstat 《Computing》2007,79(1):93-97

We present work- and cost-optimal O(log*n) algorithms for prefix sums and linear integer sorting on a Sum-CRCW PRAM. 相似文献

2.

On Parallel Selection and Searching in Partial Orders: Sorted Matrices

R. Sarnath Xin He 《Journal of Parallel and Distributed Computing》1997,40(2):1051

Parallel algorithms for the problems of selection and searching on sorted matrices are formulated. The selection algorithm takesO(lognlog lognlog*n) time withO(n/lognlog*n) processors on an EREW PRAM. This algorithm can be generalized to solve the selection problem on a set of sorted matrices. The searching algorithm takesO(log logn) time withO(n/log logn) processors on a Common CRCW PRAM, which is optimal. We show that no algorithm using at mostnlog^cnprocessors,c≥ 1, can solve the matrix search problem in time faster than Ω(log logn) and that Ω(logn) steps are needed to solve this problem on any model that does not allow concurrent writes. 相似文献

3.

An optimal parallel algorithm for computing cut vertices and blocks on interval graphs

《国际计算机数学杂志》2012,89(1):59-70

In this paper, a parallel algorithm is presented to find all cut-vertices and blocks of an interval graph. If the list of sorted end points of the intervals of an interval graph is given then the proposed algorithm takes O(log n) time and O(n/log n) processors on an EREW PRAM, if the sorted list is not given then the time and processors complexities are respectively O(log n) and O(n). 相似文献

4.

Deterministic parallel list ranking

Richard J. Anderson Gary L. Miller 《Algorithmica》1991,6(1):859-868

In this paper we describe a simple parallel algorithm for list ranking. The algorithm is deterministic and runs inO(logn) time on an EREW PRAM withn/logn processors. The algorithm matches the performance of the Cole-Vishkin [CV3] algorithm but is simple and has reasonable constant factors.R. J. Anderson was supported by an NSF Presidential Young Investigator award and G. L. Miller was supported by NSF Grant DCR-85114961. 相似文献

5.

On Parallel Integer Merging

Berkman O. Vishkin U. 《Information and Computation》1993,106(2)

The problem of merging two sorted arrays A = (a₁, a₂, ..., a_n₁) and B = (b₁, b₂, ..., b_n₂) is considered. For input elements that are drawn from a domain of integers [1...s] we present an algorithm that runs in O(log log log s) time using n/log log log s CREW PRAM processors (optimal speed-up) and O(ns^ε) space, where n = n₁ + n₂. For input elements that are drawn from a domain of integers [1...n] we present a second algorithm that runs in O(α(n)) time (where α(n) is the inverse of Ackermann′s function) using n/α(n) CREW PRAM processors and linear space. This second algorithm is non-uniform; however, it can be made uniform at a price of a certain loss of speed, or by using a CRCW PRAM. 相似文献

6.

Expected parallel time and sequential space complexity of graph and digraph problems

John Reif Paul Spirakis 《Algorithmica》1992,7(1):597-630

This paper determines upper bounds on the expected time complexity for a variety of parallel algorithms for undirected and directed random graph problems. For connectivity, biconnectivity, transitive closure, minimum spanning trees, and all pairs minimum cost paths, we prove the expected time to beO(log logn) for the CRCW PRAM (this parallel RAM machine allows resolution of write conflicts) andO(logn · log logn) for the CREW PRAM (which allows simultaneous reads but not simultaneous writes). We also show that the problem of graph isomorphism has expected parallel timeO(log logn) for the CRCW PRAM andO(logn) for the CREW PRAM. Most of these results follow because of upper bounds on the mean depth of a graph, derived in this paper, for more general graphs than was known before.For undirected connectivity especially, we present a new probabilistic algorithm which runs on a randomized input and has an expected running time ofO(log logn) on the CRCW PRAM, withO(n) expected number of processors only.Our results also improve known upper bounds on the expected space required for sequential graph algorithms. For example, we show that the problems of finding connected components, transitive closure, minimum spanning trees, and minimum cost paths have expected sequential spaceO(logn · log logn) on a deterministic Turing Machine. We use a simulation of the CRCW PRAM to get these expected sequential space bounds.This research was supported by National Science Foundation Grant DCR-85-03251 and Office of Naval Research Contract N00014-80-C-0647.This research was partially supported by the National Science Foundation Grants MCS-83-00630, DCR-8503497, by the Greek Ministry of Research and Technology, and by the ESPRIT Basic Research Actions Project ALCOM. 相似文献

7.

Selection, Routing, and Sorting on the Star Graph

Sanguthevar Rajasekaran David S.L. Wei 《Journal of Parallel and Distributed Computing》1997,41(2):1771

We consider the problems of selection, routing, and sorting on ann-star graph (withn! nodes), an interconnection network which has been proven to possess many special properties. We identify a tree like subgraph (which we call a “(k, 1,k) chain network”) of the star graph which enables us to design efficient algorithms for the above mentioned problems. We present an algorithm that performs a sequence ofnprefix computations inO(n²) time. This algorithm is used as a subroutine in our other algorithms. We also show that sorting can be performed on then-star graph in timeO(n³) and that selection of a set of uniformly distributednkeys can be performed inO(n²) time with high probability. Finally, we also present a deterministic (nonoblivious) routing algorithm that realizes any permutation inO(n³) steps on then-star graph. There exists an algorithm in the literature that can perform a single prefix computation inO(nlgn) time. The best-known previous algorithm for sorting has a run time ofO(n³lgn) and is deterministic. To our knowledge, the problem of selection has not been considered before on the star graph. 相似文献

8.

Efficient parallel recognition of some circular arc graphs, II

L. Chen 《Algorithmica》1997,17(3):266-280

Based on Tucker's work, we present an accurate proof of the characterization of proper circular arc graphs and obtain the first efficient parallel algorithm which not only recognizes proper circular arc graphs but also constructs proper circular arc representations. The algorithm runs inO(log² n) time withO(n ³) processors on a Common CRCW PRAM. The sequential algorithm can be implemented to run inO(n ²) time and is optimal if the input graph is given as an adjacency matrix, so to speak. Portions of this paper appear in preliminary form in theProceedings of the 1989Workshop on Algorithms and Data Structures [2], and theProceedings of the 1994International Symposium on Algorithms and Computation [5]. 相似文献

9.

On the Construction of Classes of Suffix Trees for Square Matrices: Algorithms and Applications

Raffaele Giancarlo Roberto Grossi 《Information and Computation》1996,130(2):151

We provide a uniform framework for the study of index data structures for a two-dimensional matrixTEXT[1:n, 1:n] whose entries are drawn from an ordered alphabetΣ. An index forTEXTcan be informally seen as the two-dimensional analog of the suffix tree for a string. It allows on-line searches and statistics to be performed onTEXTby representing compactly theΘ(n³) square submatrices ofTEXTin optimalO(n²) space. We identify 4ⁿ⁻¹families of indices forTEXT, each containing ∏ⁿ_i=1 (2i−1)! isomorphic data structures. We also develop techniques leading to a single algorithm that efficiently builds any index in any family inO(n² log n) time andO(n²) space. Such an algorithm improves in various respects the algorithms for the construction of the PAT tree and the Lsuffix tree. The framework and the algorithm easily generalize tod>2 dimensions. Moreover, as part of our algorithm, we provide new algorithmic tools that yield a space-efficient implementation of the “naming scheme” of R. Karpet al.(in“Proceedings, Fourth Symposium on Theory of Computing,” pp. 125–136) for strings and matrices. 相似文献

10.

An optimal and processor efficient parallel sorting algorithm on a linear array with a reconfigurable pipelined bus system

Min He Xiaolong Wu Si Qing Zheng 《Computers & Electrical Engineering》2009,35(6):951-965

Optical interconnections attract many engineers and scientists’ attention due to their potential for gigahertz transfer rates and concurrent access to the bus in a pipelined fashion. These unique characteristics of optical interconnections give us the opportunity to reconsider traditional algorithms designed for ideal parallel computing models, such as PRAMs. Since the PRAM model is far from practice, not all algorithms designed on this model can be implemented on a realistic parallel computing system. From this point of view, we study Cole’s pipelined merge sort [Cole R. Parallel merge sort. SIAM J Comput 1988;14:770–85] on the CREW PRAM and extend it in an innovative way to an optical interconnection model, the LARPBS (Linear Array with Reconfigurable Pipelined Bus System) model [Pan Y, Li K. Linear array with a reconfigurable pipelined bus system—concepts and applications. J Inform Sci 1998;106;237–58]. Although Cole’s algorithm is optimal, communication details have not been provided due to the fact that it is designed for a PRAM. We close this gap in our sorting algorithm on the LARPBS model and obtain an O(log N)-time optimal sorting algorithm using O(N) processors. This is a substantial improvement over the previous best sorting algorithm on the LARPBS model that runs in O(log N log log N) worst-case time using N processors [Datta A, Soundaralakshmi S, Owens R. Fast sorting algorithms on a linear array with a reconfigurable pipelined bus system. IEEE Trans Parallel Distribut Syst 2002;13(3):212–22]. Our solution allows efficiently assign and reuse processors. We also discover two new properties of Cole’s sorting algorithm that are presented as lemmas in this paper. 相似文献

11.

Sensitive Functions and Approximate Problems

Shiva Chaudhuri 《Information and Computation》1996,126(2):161

Properties of functions that are good measures of the CRCW PRAM complexity of computing them are investigated. While theblock sensitivityis known to be a good measure of the CREW PRAM complexity, no such measure is known for CRCW PRAMs. It is shown that the complexity of computing a function is related to itseverywhere sensitivity, introduced by Vishkin and Wigderson. Specifically, the time required to compute a functionf: Dⁿ→Rof everywhere sensitivityes(f) withPprocessors and unbounded memory isΩ(log[log es(f)/(log(|D|+4P/es(f)))]). This improves results of Azar and of Vishkin and Wigderson. This lower bound is used to derive new lower bounds for someapproximate problems. These problems can often be solved faster than their exact counterparts and for many applications, it is sufficient to solve the approximate problem. It is shown thatapproximate selection,approximate counting,approximate compaction, andpadded sortingall require timeΩ(log log n) with a linear number of processors, if the level of accuracy desired is moderately high. For these levels of accuracy, no lower bounds were known for these problems on the PRAM model. The lower bounds for some of the problems are tight. 相似文献

12.

Optimal Sublogarithmic Time Parallel Algorithms on Rooted Forests

G. Sajith S. Saxena 《Algorithmica》2000,27(2):187-197

The problem of finding a sublogarithmic time optimal parallel algorithm for 3 -colouring rooted forests has been open for long. We settle this problem by obtaining an O(( log log n) log^* ( log^* n)) time optimal parallel algorithm on a TOLERANT Concurrent Read Concurrent Write (CRCW) Parallel Random Access Machine (PRAM). Furthermore, we show that if f(n) is the running time of the best known algorithm for 3 -colouring a rooted forest on a COMMON or TOLERANT CRCW PRAM, a fractional independent set of the rooted forest can be found in O(f(n)) time with the same number of processors, on the same model. Using these results, it is shown that decomposable top-down algebraic computation and, hence, depth computation (ranking), 2 -colouring and prefix summation on rooted forests can be done in O( log n) optimal time on a TOLERANT CRCW PRAM. These algorithms have been obtained by proving a result of independent interest, one concerning the self-simulation property of TOLERANT: an N -processor TOLERANT CRCW PRAM that uses an address space of size O(N) only, can be simulated on an n -processor TOLERANT PRAM in O(N/n) time, with no asymptotic increase in space or cost, when n=O(N/ log log N) . Received May 20, 1997; revised June 15, 1998. 相似文献

13.

A chained-matrices approach for parallel computation of continued fractions and its applications

Lin Shun-Shii 《Journal of scientific computing》1994,9(1):65-80

A chained-matrices approach for parallel computing thenth convergent of continued fractions is presented. The resulting algorithm computes the entire prefix values of any continued fraction inO(logn) time on the EREW PRAM model or a network withO(n/logn) processors connected by the cube-connectedcycles, binary tree, perfect shuffle, or hypercube. It can be applied to approximate the transcendental numbers, such as ande, inO(logm) time by usingO(m/logm) processors for a result withm-digit precision. We also use it to costoptimally solve the second-order linear recurrence, the polynomial evaluation, the recurrence of vector norm, the general class of recurrence equation defined by Kogge and Stone (1973), and the generalmth order linear recurrence. It is easy to implement because there are only some matrix multiplications and a division operation involved.This work was supported in part by National Science Council of the Republic of China under Contract NSC 77-0408-E002-09. 相似文献

14.

Efficient parallel algorithms forr-dominating set andp-center problems on trees

Xin He Yaacov Yesha 《Algorithmica》1990,5(1):129-145

We develop efficient parallel algorithms for ther-dominating set and thep-center problems on trees. On a concurrent-read exclusive-write PRAM, our algorithm for ther-dominating set problem runs inO(logn log logn) time withn processors. The algorithm for thep-center problem runs inO(log² n log logn) time withn processors.Xin He was supported in part by an Ohio State University Presidential Fellowship, and by the Office of Research and Graduate Studies of Ohio State University. Yaacov Yesha was supported in part by the National Science Foundation under Grant No. DCR-8606366. 相似文献

15.

Optimal Computing the Chessboard Distance Transform on Parallel Processing Systems

Yu-Hua Lee Shi-Jinn Horng 《Computer Vision and Image Understanding》1999,73(3):272

Thedistance transform(DT) is an image computation tool which can be used to extract the information about the shape and the position of the foreground pixels relative to each other. It converts a binary image into a grey-level image, where each pixel has a value corresponding to the distance to the nearest foreground pixel. The time complexity for computing the distance transform is fully dependent on the different distance metrics. Especially, the more exact the distance transform is, the worse execution time reached will be. Nowadays, quite often thousands of images are processed in a limited time. It seems quite impossible for a sequential computer to do such a computation for the distance transform in real time. In order to provide efficient distance transform computation, it is considerably desirable to develop a parallel algorithm for this operation. In this paper, based on the diagonal propagation approach, we first provide anO(N²) time sequential algorithm to compute thechessboard distance transform(CDT) of anN×Nimage, which is a DT using the chessboard distance metrics. Based on the proposed sequential algorithm, the CDT of a 2D binary image array of sizeN×Ncan be computed inO(logN) time on the EREW PRAM model usingO(N²/logN) processors,O(log logN) time on the CRCW PRAM model usingO(N²/log logN) processors, andO(logN) time on the hypercube computer usingO(N²/logN) processors. Following the mapping as proposed by Lee and Horng, the algorithm for the medial axis transform is also efficiently derived. The medial axis transform of a 2D binary image array of sizeN×Ncan be computed inO(logN) time on the EREW PRAM model usingO(N²/logN) processors,O(log logN) time on the CRCW PRAM model usingO(N²/log logN) processors, andO(logN) time on the hypercube computer usingO(N²/logN) processors. The proposed parallel algorithms are composed of a set of prefix operations. In each prefix operation phase, only increase (add-one) operation and minimum operation are employed. So, the algorithms are especially efficient in practical applications. 相似文献

16.

Simulating Shared Memory in Real Time: On the Computation Power of Reconfigurable Architectures

Artur Czumaj Friedhelm Meyer auf der Heide Volker Stemann 《Information and Computation》1997,137(2):103

We consider randomized simulations of shared memory on a distributed memory machine (DMM) where thenprocessors and thenmemory modules of the DMM are connected via a reconfigurable architecture. We first present a randomized simulation of a CRCW PRAM on a reconfigurable DMM having a complete reconfigurable interconnection. It guarantees delay (log *n), with high probability. Next we study a reconfigurable mesh DMM (RM-DMM). Here thenprocessors andnmodules are connected via ann×nreconfigurable mesh. It was already known that ann×mreconfigurable mesh can simulate in constant time ann-processor CRCW PRAM with shared memory of sizem. In this paper we present a randomized step by step simulation of a CRCW PRAM with arbitrarily large shared memory on an RM-DMM. It guarantees constant delay with high probability, i.e., it simulates in real time. Finally we prove a lower bound showing that sizeΩ(n²) for the reconfigurable mesh is necessary for real time simulations. 相似文献

17.

Efficient Graph-Theoretic Algorithms on a Linear Array with a Reconfigurable Pipelined Bus System 总被引：1，自引：0，他引：1

Amitava Datta 《The Journal of supercomputing》2002,23(2):193-211

We present efficient algorithms for solving several fundamental graph-theoretic problems on a Linear Array with a Reconfigurable Pipelined Bus System (LARPBS), one of the recently proposed models of computation based on optical buses. Our algorithms include finding connected components, minimum spanning forest, biconnected components, bridges and articulation points for an undirected graph. We compute the connected components and minimum spanning forest of a graph in O(log n) time using O(m+n) processors where m and n are the number of edges and vertices in the graph and m=O(n ²) for a dense graph. Both the processor and time complexities of these two algorithms match the complexities of algorithms on the Arbitrary and Priority CRCW PRAM models which are two of the strongest PRAM models. The algorithms for these two problems published by Li et al. [7] have been considered to be the most efficient on the LARPBS model till now. Their algorithm [7] for these two problems require O(log n) time and O(n ³/log n) processors. Hence, our algorithms have the same time complexity but require less processors. Our algorithms for computing biconnected components, bridges and articulation points of a graph run in O(log n) time on an LARPBS with O(n ²) processors. No previous algorithm was known for these latter problems on the LARPBS. 相似文献

18.

Parallel methods for visibility and shortest-path problems in simple polygons

Michael T. Goodrich Steven B. Shauck Sumanta Guha 《Algorithmica》1992,8(1):461-486

In this paper we give efficient parallel algorithms for solving a number of visibility and shortest-path problems for simple polygons. Our algorithms all run inO(logn) time and are based on the use of a new data structure for implicitly representing all shortest paths in a simple polygonP, which we call thestratified decomposition tree. We use this approach to derive efficient parallel methods for computing the visibility ofP from an edge, constructing the visibility graph of the vertices ofP (using an output-sensitive number of processors), constructing the shortest-path tree from a vertex ofP, and determining all-farthest neighbors for the vertices inP. The computational model we use is the CREW PRAM.This research was announced in preliminary form in theProceedings of the 6th ACM Symposium on Computational Geometry, 1990, pp. 73–82. The research of Michael T. Goodrich was supported by the National Science Foundation under Grants CCR-8810568 and CCR-9003299, and by the NSF and DARPA under Grant CCR-8908092. 相似文献

19.

Efficient Parallel Algorithms for Permutation Graphs

《Journal of Parallel and Distributed Computing》1995,26(1):116-124

In this paper, we present optimal O(log n) time, O(n/log n) processor EREW PRAM parallel algorithms for finding the connected components, cut vertices, and bridges of a permutation graph. We also present an O(log n) time, O(n) processor, CREW PRAM model parallel algorithm for finding a Breadth First Search (BFS) spanning tree of a permutation graph rooted at vertex 1 and use the same to derive an efficient parallel algorithm for the All Pairs Shortest Path problem on permutation graphs. 相似文献

20.

Deterministic parallel list ranking

Richard J. Anderson Gary L. Miller 《Algorithmica》1991,6(1-6):859-868

In this paper we describe a simple parallel algorithm for list ranking. The algorithm is deterministic and runs inO(logn) time on an EREW PRAM withn/logn processors. The algorithm matches the performance of the Cole-Vishkin [CV3] algorithm but is simple and has reasonable constant factors. 相似文献