期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Reconstructing a Binary Tree from Its Traversals in Doubly Logarithmic CREW Time

《Journal of Parallel and Distributed Computing》1995,27(1):100-105

We consider the following problem. For a binary tree T = (V, E) where V = {1, 2, ..., n}, given its inorder traversal and either its preorder or its postorder traversal, reconstruct the binary tree. We present a new parallel algorithm for this problem. Our algorithm requires O(n) space. The main idea of our algorithm is to reduce the reconstruction process to merging two sorted sequences. With the best parallel merging algorithms, our algorithm can be implemented in O(log log n) time using O(n/log log n) processors on the CREW PRAM (or in O(log n) time using O(n/log n) processors on the EREW PRAM). Our result provides one more example of a fundamental problem which can be solved by optimal parallel algorithms in O(log log n)time on the CREW PRAM. 相似文献

2.

More Efficient Topological Sort Using Reconfigurable Optical Buses

Li Jie Pan Yi Shen Hong 《The Journal of supercomputing》2003,24(3):251-258

Topological sort of an acyclic graph has many applications such as job scheduling and network analysis. Due to its importance, it has been tackled on many models. Dekel et al. [3], proposed an algorithm for solving the problem in O(log² N) time on the hypercube or shuffle-exchange networks with O(N ³) processors. Chaudhuri [2], gave an O(log N) algorithm using O(N ³) processors on a CRCW PRAM model. On the LARPBS (Linear Arrays with a Reconfigurable Pipelined Bus System) model, Li et al. [5] showed that the problem for a weighted directed graph with N vertices can be solved in O(log N) time by using N ³ processors. In this paper, a more efficient topological sort algorithm is proposed on the same LARPBS model. We show that the problem can be solved in O(log N) time by using N ³/log N processors. We show that the algorithm has better time and processor complexities than the best algorithm on the hypercube, and has the same time complexity but better processor complexity than the best algorithm on the CRCW PRAM model. 相似文献

3.

Parallel integer sorting and simulation amongst CRCW models

Sanjeev Saxena 《Acta Informatica》1996,33(5):607-619

In this paper a general technique for reducing processors in simulation without any increase in time is described. This results in an O(√logn) time algorithm for simulating one step of PRIORITY on TOLERANT with processor-time product of O(n log logn); the same as that for simulating PRIORITY on ARBITRARY. This is used to obtain anO(logn/log logn + √logn (log logm ? log logn)) time algorithm for sortingn integers from the set {0,...,m ? 1},m ≧n, with a processor-time product ofO(n log logm log logn) on a TOLERANT CRCW PRAM. New upper and lower bounds for ordered chaining problem on an allocated COMMON CRCW model are also obtained. The algorithm for ordered chaining takesO(logn/log logn) time on an allocated PRAM of sizen. It is shown that this result is best possible (upto a constant multiplicative factor) by obtaining a lower bound of Ω(r logn/(logr + log logn)) for finding the first (leftmost one) live processor on an allocated-COMMON PRAM of sizen ofr-slow virtual processors (one processor simulatesr processors of allocated PRAM). As a result, for ordered chaining problem, “processor-time product” has to be at least Ω(n logn/log logn) for any poly-logarithmic time algorithm. Algorithm for ordered-chaining problem results in anO(logN/log logN) time algorithm for (stable) sorting ofn integers from the set {0,...,m ? 1} withn-processors on a COMMON CRCW PRAM; hereN = max(n, m). In particular if,m =n ^O(1), then sorting takes Θ(logn/log logn) time on both TOLERANT and COMMON CRCW PRAMs. Processor-time product for TOLERANT isO(n(log logn)²). Algorithm for COMMON usesn processors. 相似文献

4.

Optimal Computing the Chessboard Distance Transform on Parallel Processing Systems

Yu-Hua Lee Shi-Jinn Horng 《Computer Vision and Image Understanding》1999,73(3):272

Thedistance transform(DT) is an image computation tool which can be used to extract the information about the shape and the position of the foreground pixels relative to each other. It converts a binary image into a grey-level image, where each pixel has a value corresponding to the distance to the nearest foreground pixel. The time complexity for computing the distance transform is fully dependent on the different distance metrics. Especially, the more exact the distance transform is, the worse execution time reached will be. Nowadays, quite often thousands of images are processed in a limited time. It seems quite impossible for a sequential computer to do such a computation for the distance transform in real time. In order to provide efficient distance transform computation, it is considerably desirable to develop a parallel algorithm for this operation. In this paper, based on the diagonal propagation approach, we first provide anO(N²) time sequential algorithm to compute thechessboard distance transform(CDT) of anN×Nimage, which is a DT using the chessboard distance metrics. Based on the proposed sequential algorithm, the CDT of a 2D binary image array of sizeN×Ncan be computed inO(logN) time on the EREW PRAM model usingO(N²/logN) processors,O(log logN) time on the CRCW PRAM model usingO(N²/log logN) processors, andO(logN) time on the hypercube computer usingO(N²/logN) processors. Following the mapping as proposed by Lee and Horng, the algorithm for the medial axis transform is also efficiently derived. The medial axis transform of a 2D binary image array of sizeN×Ncan be computed inO(logN) time on the EREW PRAM model usingO(N²/logN) processors,O(log logN) time on the CRCW PRAM model usingO(N²/log logN) processors, andO(logN) time on the hypercube computer usingO(N²/logN) processors. The proposed parallel algorithms are composed of a set of prefix operations. In each prefix operation phase, only increase (add-one) operation and minimum operation are employed. So, the algorithms are especially efficient in practical applications. 相似文献

5.

On Parallel Selection and Searching in Partial Orders: Sorted Matrices

R. Sarnath Xin He 《Journal of Parallel and Distributed Computing》1997,40(2):1051

Parallel algorithms for the problems of selection and searching on sorted matrices are formulated. The selection algorithm takesO(lognlog lognlog*n) time withO(n/lognlog*n) processors on an EREW PRAM. This algorithm can be generalized to solve the selection problem on a set of sorted matrices. The searching algorithm takesO(log logn) time withO(n/log logn) processors on a Common CRCW PRAM, which is optimal. We show that no algorithm using at mostnlog^cnprocessors,c≥ 1, can solve the matrix search problem in time faster than Ω(log logn) and that Ω(logn) steps are needed to solve this problem on any model that does not allow concurrent writes. 相似文献

6.

An Optimal Shortest Path Parallel Algorithm for Permutation Graphs

《Journal of Parallel and Distributed Computing》1995,24(1):94-99

We present an optimal parallel algorithm for the single-source shortest path problem for permutation graphs. The algorithm runs in O(log n) time using O(n/log n) processors on an EREW PRAM. As an application, we show that a minimum connected dominating set in a permutation graph can be found in O(log n) time using O(n/log n) processors. 相似文献

7.

Efficient Parallel Algorithms for Permutation Graphs

《Journal of Parallel and Distributed Computing》1995,26(1):116-124

In this paper, we present optimal O(log n) time, O(n/log n) processor EREW PRAM parallel algorithms for finding the connected components, cut vertices, and bridges of a permutation graph. We also present an O(log n) time, O(n) processor, CREW PRAM model parallel algorithm for finding a Breadth First Search (BFS) spanning tree of a permutation graph rooted at vertex 1 and use the same to derive an efficient parallel algorithm for the All Pairs Shortest Path problem on permutation graphs. 相似文献

8.

An optimal and processor efficient parallel sorting algorithm on a linear array with a reconfigurable pipelined bus system

Min He Xiaolong Wu Si Qing Zheng 《Computers & Electrical Engineering》2009,35(6):951-965

Optical interconnections attract many engineers and scientists’ attention due to their potential for gigahertz transfer rates and concurrent access to the bus in a pipelined fashion. These unique characteristics of optical interconnections give us the opportunity to reconsider traditional algorithms designed for ideal parallel computing models, such as PRAMs. Since the PRAM model is far from practice, not all algorithms designed on this model can be implemented on a realistic parallel computing system. From this point of view, we study Cole’s pipelined merge sort [Cole R. Parallel merge sort. SIAM J Comput 1988;14:770–85] on the CREW PRAM and extend it in an innovative way to an optical interconnection model, the LARPBS (Linear Array with Reconfigurable Pipelined Bus System) model [Pan Y, Li K. Linear array with a reconfigurable pipelined bus system—concepts and applications. J Inform Sci 1998;106;237–58]. Although Cole’s algorithm is optimal, communication details have not been provided due to the fact that it is designed for a PRAM. We close this gap in our sorting algorithm on the LARPBS model and obtain an O(log N)-time optimal sorting algorithm using O(N) processors. This is a substantial improvement over the previous best sorting algorithm on the LARPBS model that runs in O(log N log log N) worst-case time using N processors [Datta A, Soundaralakshmi S, Owens R. Fast sorting algorithms on a linear array with a reconfigurable pipelined bus system. IEEE Trans Parallel Distribut Syst 2002;13(3):212–22]. Our solution allows efficiently assign and reuse processors. We also discover two new properties of Cole’s sorting algorithm that are presented as lemmas in this paper. 相似文献

9.

Efficient parallel algorithms forr-dominating set andp-center problems on trees

Xin He Yaacov Yesha 《Algorithmica》1990,5(1-4):129-145

We develop efficient parallel algorithms for ther-dominating set and thep-center problems on trees. On a concurrent-read exclusive-write PRAM, our algorithm for ther-dominating set problem runs inO(logn log logn) time withn processors. The algorithm for thep-center problem runs inO(log² n log logn) time withn processors. 相似文献

10.

An efficient parallel strategy for the two-fixed-endpoint Hamiltonian path problem on distance-hereditary graphs

《Journal of Parallel and Distributed Computing》2004,64(5):662-685

In this paper, we solve the two-fixed-endpoint Hamiltonian path problem on distance-hereditary graphs efficiently in parallel. Let T_d(|V|,|E|) and P_d(|V|,|E|) denote the parallel time and processor complexities, respectively, required to construct a decomposition tree of a distance-hereditary graph G=(V,E) on a PRAM model M_d. We show that this problem can be solved in O(T_d(|V|,|E|)+log|V|) time using O(P_d(|V|,|E|)+(|V|+|E|)/log|V|) processors on M_d. Moreover, if G is represented by its decomposition tree form, the problem can be solved optimally in O(log|V|) time using O((|V|+|E|)/log|V|) processors on an EREW PRAM. We also obtain a linear-time algorithm which is faster than the previous known O(|V|³) sequential algorithm. 相似文献

11.

An optimal parallel algorithm for computing cut vertices and blocks on interval graphs

《国际计算机数学杂志》2012,89(1):59-70

In this paper, a parallel algorithm is presented to find all cut-vertices and blocks of an interval graph. If the list of sorted end points of the intervals of an interval graph is given then the proposed algorithm takes O(log n) time and O(n/log n) processors on an EREW PRAM, if the sorted list is not given then the time and processors complexities are respectively O(log n) and O(n). 相似文献

12.

Expected parallel time and sequential space complexity of graph and digraph problems

John Reif Paul Spirakis 《Algorithmica》1992,7(1-6):597-630

This paper determines upper bounds on the expected time complexity for a variety of parallel algorithms for undirected and directed random graph problems. For connectivity, biconnectivity, transitive closure, minimum spanning trees, and all pairs minimum cost paths, we prove the expected time to beO(log logn) for the CRCW PRAM (this parallel RAM machine allows resolution of write conflicts) andO(logn · log logn) for the CREW PRAM (which allows simultaneous reads but not simultaneous writes). We also show that the problem of graph isomorphism has expected parallel timeO(log logn) for the CRCW PRAM andO(logn) for the CREW PRAM. Most of these results follow because of upper bounds on the mean depth of a graph, derived in this paper, for more general graphs than was known before. For undirected connectivity especially, we present a new probabilistic algorithm which runs on a randomized input and has an expected running time ofO(log logn) on the CRCW PRAM, withO(n) expected number of processors only. Our results also improve known upper bounds on the expected space required for sequential graph algorithms. For example, we show that the problems of finding connected components, transitive closure, minimum spanning trees, and minimum cost paths have expected sequential spaceO(logn · log logn) on a deterministic Turing Machine. We use a simulation of the CRCW PRAM to get these expected sequential space bounds. 相似文献

13.

Efficient parallel term matching and anti-unification

Arthur L. Delcher Simon Kasif 《Journal of Automated Reasoning》1992,9(3):391-406

In this paper we present optimal processor x time parallel algorithms for term matching and anti-unification of terms represented as trees. Term matching is the special case of unification in which one of the terms is restricted to contain no variables. It has wide applicability to logic programming, term rewriting systems and symbolic pattern matching. Anti-unification is the dual problem of unification in which one computes the most specific generalization of two terms. It has application to inductive inference and theorem proving. Our algorithms run in O(log² N) time using N/log² N processors on a shared-memory model of computation that prohibits simultaneous reads or writes (EREW PRAM). These algorithms are the first polylogarithmic-time EREW algorithms with a processor x time product of the same order as that of their sequential counterparts, thereby permitting optimal speed-ups using any number of processors up to N/log² N. We also use the techniques developed in the paper to provide an N/log N-processor, O(log N)-time algorithm for a shared-memory model that allows both simultaneous reads and simultaneous writes (CRCW PRAM).Supported by NSF Grant IRI-88-09324 and NSF/DARPA Grant CCR-8908092. 相似文献

14.

Efficient and Scalable PRAM Algorithms for Discrete-Event Simulation of Bounded Degree Networks

《Journal of Parallel and Distributed Computing》1993,18(4):524-530

We describe two new parallel algorithms, one conservative and another optimistic, for discrete-event simulation on an exclusive-read exclusive-write parallel random-access machine (EREW PRAM). The target physical systems are bounded degree networks which are represented by logic circuits. Employing p processors, our conservative algorithm can simulate up to O(p) independent messages of a system with n logical processes in O(log n) time. The number of processors, p, can be optimally varied in the range 1 ≤ p ≤ n. To identify independent messages, this algorithm also introduces a novel scheme based on a variable size time window. Our optimistic algorithm is designed to reduce the rollback frequency and the memory requirement to save past states and messages. The optimistic algorithm also simulates O(p) earliest messages on a p-processor computer in O(log n) time. To our knowledge, such a theoretical efficiency in parallel simulation algorithms, conservative or optimistic, has been achieved for the first time. 相似文献

15.

A randomized sublinear time parallel GCD algorithm for the EREW PRAM

Jonathan P. Sorenson 《Information Processing Letters》2010,110(5):198-201

We present a randomized parallel algorithm that computes the greatest common divisor of two integers of n bits in length with probability 1−o(1) that takes O(nloglogn/logn) time using O(n6+?) processors for any ?>0 on the EREW PRAM parallel model of computation. The algorithm either gives a correct answer or reports failure.We believe this to be the first randomized sublinear time algorithm on the EREW PRAM for this problem. 相似文献

16.

Efficient Graph-Theoretic Algorithms on a Linear Array with a Reconfigurable Pipelined Bus System 总被引：1，自引：0，他引：1

Amitava Datta 《The Journal of supercomputing》2002,23(2):193-211

We present efficient algorithms for solving several fundamental graph-theoretic problems on a Linear Array with a Reconfigurable Pipelined Bus System (LARPBS), one of the recently proposed models of computation based on optical buses. Our algorithms include finding connected components, minimum spanning forest, biconnected components, bridges and articulation points for an undirected graph. We compute the connected components and minimum spanning forest of a graph in O(log n) time using O(m+n) processors where m and n are the number of edges and vertices in the graph and m=O(n ²) for a dense graph. Both the processor and time complexities of these two algorithms match the complexities of algorithms on the Arbitrary and Priority CRCW PRAM models which are two of the strongest PRAM models. The algorithms for these two problems published by Li et al. [7] have been considered to be the most efficient on the LARPBS model till now. Their algorithm [7] for these two problems require O(log n) time and O(n ³/log n) processors. Hence, our algorithms have the same time complexity but require less processors. Our algorithms for computing biconnected components, bridges and articulation points of a graph run in O(log n) time on an LARPBS with O(n ²) processors. No previous algorithm was known for these latter problems on the LARPBS. 相似文献

17.

Maintaining B-Trees on an EREW PRAM

《Journal of Parallel and Distributed Computing》1994,22(2):329-335

Efficient and practical algorithms for maintaining general B-trees on an EREW PRAM are presented. Given a B-tree of order b with m distinct records, the search (respectively, insert and delete) problem for n input keys is solved on an n-processor EREW PRAM in O(log n + b log_b m) (respectively, O(b(log n + log_b m)) and O(b²(log_b n + log_b m))) time. 相似文献

18.

A Linear Time Algorithm for Sequential Diagnosis in Hypercubes

《Journal of Parallel and Distributed Computing》1995,26(1):48-53

This paper describes a new sequential diagnosis algorithm for hypercubes. The algorithm is based on the PMC model and it assumes the existence of a central observer for syndrome decoding. If we denote the total number of processors in a given hypercube by N, then the algorithm achieves Θ([formula]) degree of diagnosability using only O(N) tests over all iterations of diagnosis and repair. The aggregated syndrome decoding time is also shown to be O(N) for this algorithm. The number of iterations of diagnosis and repair needed by the algorithm is O(log N). 相似文献

19.

A Parallel Algorithm for the Visibility of a Simple Polygon Using Scan Operations

《CVGIP: Graphical Models and Image Processing》1993,55(3):192-202

This paper describes a parallel algorithm for computing the visible portion of a simple planar polygon with N vertices from a given point on or inside the polygon. The algorithm accomplishes this in O(k log N) time using O(N/log N) processors, where k is the link-diameter of the polygon in consideration. The link-diameter of a polygon is the maximum number of straight line segments needed to connect any two points within the polygon, where all line segments lie completely within the polygon. The algorithm can also be used to compute the visible portion of the plane given a point outside of the polygon. Except in this case, the parameter k in the asymptotic bounds would be the link diameter of a different polygon. The algorithm is optimal for sets of polygons that have a constant link diameter. It is a rather simple algorithm, and has a very small run time constant, making it fast and practical to implement. The interprocessor communication needed involves only local neighbor communication and scan operations (i.e., parallel prefix operations). Thus the algorithm can be implemented not only on an EREW PRAM, but also on a variety of other more practical machine architectures, such as hypercubes, trees, butterflies, and shuffle exchange networks. The algorithm was implemented on the Connection Machine as well as the MasPar MP- 1, and various performance tests were conducted. 相似文献

20.

Renaming and Dispersing: Techniques for Fast Load Balancing

《Journal of Parallel and Distributed Computing》1994,23(2):149-157

Consider the following situation: n processors of a PRAM are given n independent tasks. Each task can be executed in constant time by a single processor. The distribution of tasks among the processors is unknown; each processor has information only about its set of tasks. The batch execution problem is to reschedule the tasks so that the quickest execution of all tasks is achieved. This problem captures some basic cooperation obstacles of the PRAM model since, without rescheduling overhead, all the tasks can be completed in O(1) time. The solution presented in this paper is an O(lg lg n) time load balancing algorithm which achieves, with overwhelming probability, an almost flat distribution, i.e., O(1) tasks for each processor. We introduce two novel techniques which are of an independent interest: renaming-a randomized scheme for an approximate compaction, and dispersing-a gather-scatter paradigm for distribution smoothing. The model of computation used in the CRCW PRAM. The concurrent-write submodel is ROBUST; i.e., if two or more processors write into the same cell at the same time then no prediction can be made about the cell contents. 相似文献