
1.

A randomized parallel branch-and-bound algorithm

Virendra K. Janakiram Edward F. Gehringer Dharma P. Agrawal Ravi Mehrotra 《International journal of parallel programming》1988,17(3):277-301

Randomized algorithms are algorithms that employ randomness in their solution method. We show that the performance of randomized algorithms is less affected by factors that prevent most parallel deterministic algorithms from attaining their theoretical speedup bounds. A major reason is that the mapping of randomized algorithms onto multiprocessors involves very little scheduling or communication overhead. Furthermore, reliability is enhanced because the failure of a single processor leads only to degradation, not failure, of the algorithm. We present results of an extensive simulation done on a multiprocessor simulator, running a randomized branch-and-bound algorithm. The particular case we consider is the knapsack problem, due to its ease of formulation. We observe the largest speedups in precisely those problems that take large amounts of time to solve. This work has been supported by the U.S. Army Research Office under Contract No. DAAG 29-85-K-0236.
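As a rough illustration of the idea in this abstract, the following is a minimal sequential sketch of a branch-and-bound search for the 0/1 knapsack problem in which the order of exploring the two children of a node is chosen at random. It is not the authors' algorithm or simulator: the function names, the `(weight, profit)` item format, the fractional upper bound, and the choice of where randomness enters are all assumptions made for illustration. Weights are assumed to be positive.

```python
import random


def fractional_bound(items, capacity, index, value):
    """Upper bound on any completion: fill greedily, allowing one fractional item."""
    for weight, profit in items[index:]:
        if weight <= capacity:
            capacity -= weight
            value += profit
        else:
            return value + profit * capacity / weight
    return value


def randomized_knapsack(items, capacity, seed=None):
    """Return the best total profit; items are (weight, profit) pairs (hypothetical format)."""
    rng = random.Random(seed)
    # Sort by profit density so that the greedy fractional bound is valid.
    items = sorted(items, key=lambda wp: wp[1] / wp[0], reverse=True)
    best = 0
    stack = [(0, capacity, 0)]  # (next item index, remaining capacity, value so far)
    while stack:
        index, remaining, value = stack.pop()
        best = max(best, value)  # every partial selection is itself feasible
        if index == len(items):
            continue
        if fractional_bound(items, remaining, index, value) <= best:
            continue  # prune: this subtree cannot beat the incumbent
        weight, profit = items[index]
        children = [(index + 1, remaining, value)]  # exclude the item
        if weight <= remaining:
            children.append((index + 1, remaining - weight, value + profit))  # include it
        rng.shuffle(children)  # assumed source of randomness: random branching order
        stack.extend(children)
    return best
```

Different seeds visit the tree in different orders but always return the same optimum. In a parallel setting one could imagine each processor running this search with its own seed, which is in the spirit of the low scheduling overhead the abstract describes, although the paper's actual scheme may differ.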

2.

Towards a single model of efficient computation in real parallel machines

Pilar de la Torre Clyde P Kruskal 《Future Generation Computer Systems》1992,8(4):395-408

We propose a model of parallel computation, the YPRAM, that allows general parallel algorithms to be designed for a wide class of parallel models. The basic model captures locality among processors, which is measured as a function of two parameters: latency and bandwidth.

We design YPRAM algorithms for solving several fundamental problems: parallel prefix, sorting, sorting numbers from a bounded range, and list ranking. We show that our model predicts, reasonably accurately, the actual known performances of several basic parallel models (PRAM, hypercube, mesh and tree) when solving these problems.

3.

Deterministic parallel backtrack search

Kieran T. Herley Andrea Pietracaprina Geppino Pucci 《Theoretical computer science》2002,270(1-2):309-324

The backtrack search problem involves visiting all the nodes of an arbitrary binary tree given a pointer to its root subject to the constraint that the children of a node are revealed only after their parent is visited. We present a fast, deterministic backtrack search algorithm for a p-processor COMMON CRCW-PRAM, which visits any n-node tree of height h in time O((n/p + h)(log log log p)²). This upper bound compares favourably with a natural Ω(n/p + h) lower bound for this problem. Our approach embodies novel, efficient techniques for dynamically assigning tree-nodes to processors to ensure that the work is shared equitably among them.

4.

Locality-preserving hash functions for general purpose parallel computation

A. Chin 《Algorithmica》1994,12(2-3):170-181

Consider the problem of efficiently simulating the shared-memory parallel random access machine (PRAM) model on massively parallel architectures with physically distributed memory. To prevent network congestion and memory bank contention, it may be advantageous to hash the shared memory address space. The decision on whether or not to use hashing depends on (1) the communication latency in the network and (2) the locality of memory accesses in the algorithm. We relate this decision directly to algorithmic issues by studying the complexity of hashing in the Block PRAM model of Aggarwal, Chandra, and Snir, a shared-memory model of parallel computation which accounts for communication locality. For this model, we exhibit a universal family of hash functions having optimal locality. The complexity of applying these hash functions to the shared address space of the Block PRAM (i.e., by permuting data elements) is asymptotically equivalent to the complexity of performing a square matrix transpose, and this result is best possible for all pairwise independent universal hash families. These complexity bounds provide theoretical evidence that hashing and randomized routing need not destroy communication locality, addressing an open question of Valiant. This work was started when the author was a student at Oxford University, supported by a National Science Foundation Graduate Fellowship and a Rhodes Scholarship. Any opinions, findings, conclusions, or recommendations expressed in this publication are those of the author and do not necessarily reflect the views of the National Science Foundation or the Rhodes Trust.

5.

Line-segment intersection reporting in parallel

Christine Rüb 《Algorithmica》1992,8(1):119-144

In this paper we give a parallel algorithm for line-segment intersection reporting in the plane. It runs in time O(((n + k) log n log log n)/p) using p processors on a concurrent-read-exclusive-write (CREW) PRAM, where n is the number of line segments, k is the number of intersections, and p ≤ n + k. This work was supported by the DFG, SFB 124, TP B2, VLSI Entwurfsmethoden und Parallelität.

6.

《International Journal of Parallel, Emergent and Distributed Systems》2012,27(1):21-34

Interval orders are partial orders defined by having interval representations. It is well known that a transitively oriented digraph G is an interval order iff its (undirected) complement Ḡ is chordal. We investigate parallel algorithms for the following scheduling problem: Given a system consisting of a set T of n tasks (each requiring unit execution time) and an interval order ≺ over T, and given m identical parallel processors, construct an optimal (i.e., minimal length) schedule for (T, ≺).

Our algorithm is based on a subroutine for computing so-called scheduling distances, i.e., the minimal number of time steps needed to schedule all those tasks succeeding some given task t and preceding some other task t′. For a given interval order with n tasks, these scheduling distances can be computed using n³ processors and O(log² n) time on a CREW-PRAM. We then give an incremental version of the scheduling distance algorithm, which can be used to compute the empty slots in an optimal schedule. From these, we derive the optimal schedule, using no more resources than for the initial scheduling distance computation and considerably improving on previous work by Sunder and He.

The algorithm can also be extended to handle task systems which, in addition to interval order precedence constraints, have individual deadlines and/or release times for the tasks. Our algorithm is the first NC-algorithm for this problem. As another application, it also provides NC-algorithms for some graph problems on interval graphs (which are NP-complete in general).

7.

《International Journal of Parallel, Emergent and Distributed Systems》2012,27(3-4):165-176

We present a cost-optimal parallel algorithm for the maximum matching problem on bipartite permutation graphs on an EREW PRAM. Previously, Chen and Yesha have dealt with this problem. Their solution relies on Dekel and Sahni's matching algorithm for convex bipartite graphs, which runs in O(log² n) time using O(n) processors. Given a permutation diagram, our algorithm runs in O(log n) time by using O(n/log n) processors. Our method starts with an easily understood greedy algorithm. We define a nontrivial binary operation which is associative and equivalent to the greedy algorithm. Thus parallel prefix can be applied to the problem.
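The last step of this abstract relies on a general fact: once a computation is expressed as a running combination under an associative operation, a parallel prefix (scan) applies. The sketch below simulates the rounds of a simple doubling scan sequentially; each round's list comprehension corresponds to one parallel step. The paper's actual binary operation is not given in the abstract, so `op` here is a placeholder for any associative function, and the name `prefix_scan` is hypothetical. This doubling scheme performs O(n log n) operations in total, so it is not the cost-optimal variant the abstract refers to.

```python
def prefix_scan(values, op):
    """Inclusive prefix scan under an associative (not necessarily commutative) op.

    Runs about log2(n) rounds; within a round every position could be
    updated independently, which is what a PRAM would do in parallel.
    """
    result = list(values)
    step = 1
    while step < len(result):
        # Position i combines the block ending at i - step with the block ending at i.
        result = [
            result[i] if i < step else op(result[i - step], result[i])
            for i in range(len(result))
        ]
        step *= 2
    return result
```

For example, `prefix_scan([1, 2, 3, 4, 5], lambda a, b: a + b)` yields the running sums, and using string concatenation as `op` shows that operand order is preserved, which matters when the operation is associative but not commutative.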

8.

Models for practical parallel computation

D. B. Skillicorn 《International journal of parallel programming》1991,20(2):133-158

A major reason for the lack of practical use of parallel computers has been the absence of a suitable model of parallel computation. Many existing models are either theoretical or are tied to a particular architecture. A more general model must be architecture independent, must realistically reflect execution costs, and must reduce the cognitive overhead of managing massive parallelism. A growing number of models meeting some of these goals have been suggested. We discuss their properties and relative strengths and weaknesses. We conclude that data parallelism is a style with much to commend it, and discuss the Bird-Meertens formalism as a coherent approach to data parallel programming. This work was supported by the Natural Sciences and Engineering Research Council of Canada.

9.

Parallel Self-Index Integer Sorting

Hazem M. Bahig Sameh S. Daoud Mahmoud K. A. Khairat 《The Journal of supercomputing》2002,22(3):269-275

We consider the problem of sorting n integers when the elements are drawn from the restricted domain [1...n]. A new deterministic parallel algorithm for sorting n integers is obtained. Its running time is O(log n log(n/log n)) using n/log n processors on EREW (exclusive read exclusive write) PRAM (parallel random access machine). Also, our algorithm was modified to become optimal when we use processors. This algorithm belongs to class EP (Efficient, Polynomial fast).
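To show why the restricted domain [1...n] matters, here is a minimal sequential counting-sort baseline that exploits it to sort in linear time. This is only a reference point, not the parallel self-index algorithm of the paper, and the function name is an assumption.

```python
def counting_sort(values):
    """Sort n integers drawn from the range 1..n in O(n) sequential time."""
    n = len(values)
    counts = [0] * (n + 1)  # counts[v] = number of occurrences of v
    for v in values:
        if not 1 <= v <= n:
            raise ValueError("every value must lie in the range 1..n")
        counts[v] += 1
    result = []
    for key in range(1, n + 1):
        result.extend([key] * counts[key])  # emit each key as often as it occurred
    return result
```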

10.

A Randomized Linear-Work EREW PRAM Algorithm to Find a Minimum Spanning Forest

Chung Keung Poon Vijaya Ramachandran 《Algorithmica》2008,35(3):257-268

We present a randomized EREW PRAM algorithm to find a minimum spanning forest in a weighted undirected graph. On an n-vertex graph the algorithm runs in o((log n)^(1+ε)) expected time for any ε > 0 and performs linear expected work. This is the first linear-work, polylog-time algorithm on the EREW PRAM for this problem. This also gives parallel algorithms that perform expected linear work on two general-purpose models of parallel computation: the QSM and the BSP.

11.

A constant time algorithm for finding maxima on reconfigurable bus systems using fewer processors

CSR Krishnan C Siva Ram 《Microprocessors and Microsystems》1993,17(10):607-610


12.

The Fork95 parallel programming language: Design, implementation, application

Christoph W. Keßler Helmut Seidl 《International journal of parallel programming》1997,25(1):17-50

Fork95 is an imperative parallel programming language intended to express algorithms for synchronous shared memory machines (PRAMs). It is based on ANSI C and offers additional constructs to hierarchically divide processor groups into subgroups and manage shared and private address subspaces. Fork95 makes the assembly-level synchronicity of the underlying hardware available to the programmer at the language level. Nevertheless, it supports locally asynchronous computation where desired by the programmer. We present a one-pass compiler, fcc, which compiles Fork95 and C programs to the SB-PRAM machine. The SB-PRAM is a lock-step synchronous, massively parallel multiprocessor currently being built at Saarbrücken University, with a physically shared memory and uniform memory access time. We examine three important types of parallel computation frequently used for the parallel solution of real-world problems. While farming and parallel divide-and-conquer are directly supported by Fork95 language constructs, pipelining can be easily expressed using existing language features; an additional language construct for pipelining is not required.

13.

Optimal parallel algorithms for multiple updates of minimum spanning trees

Shaunak Pawagi Owen Kaser 《Algorithmica》1993,9(4):357-381

Parallel updates of minimum spanning trees (MSTs) have been studied in the past. These updates allowed a single change in the underlying graph, such as a change in the cost of an edge or an insertion of a new vertex. Multiple update problems for MSTs are concerned with handling more than one such change. In the sequential case multiple update problems may be solved using repeated applications of an efficient algorithm for a single update. However, for efficiency reasons, parallel algorithms for multiple update problems must consider all changes to the underlying graph simultaneously. In this paper we describe parallel algorithms for updating an MST when k new vertices are inserted or deleted in the underlying graph, when the costs of k edges are changed, or when k edge insertions and deletions are performed. For multiple vertex insertion update, our algorithm achieves time and processor bounds of O(log n · log k) and nk/(log n · log k), respectively, on a CREW parallel random access machine. These bounds are optimal for dense graphs. A novel feature of this algorithm is a transformation of the previous MST and k new vertices to a bipartite graph which enables us to obtain the above-mentioned bounds.

14.

Predicting the Performance of Randomized Parallel Search: An Application to Robot Motion Planning

Daniel J. Challou Maria Gini Vipin Kumar George Karypis 《Journal of Intelligent and Robotic Systems》2003,38(1):31-53

In this paper we discuss methods for predicting the performance of any formulation of randomized parallel search, and propose a new performance prediction method that is based on obtaining an accurate estimate of the k-processor run-time distribution. We show that the k-processor prediction method delivers accurate performance predictions and demonstrate the validity of our analysis on several robot motion planning problems.

15.

Optimal algorithms for the single and multiple vertex updating problems of a minimum spanning tree

D. B. Johnson P. Metaxas 《Algorithmica》1996,16(6):633-648

The vertex updating problem for a minimum spanning tree (MST) is defined as follows: Given a graph G = (V, E_G) and an MST T for G, find a new MST for G to which a new vertex z has been added along with weighted edges that connect z with the vertices of G. We present a set of rules that produce simple optimal parallel algorithms that run in O(lg n) time using n/lg n EREW PRAM processors, where n = |V|. These algorithms employ any valid tree-contraction schedule that can be produced within the stated resource bounds. These rules can also be used to derive simple linear-time sequential algorithms for the same problem. The previously best-known parallel result was a rather complicated algorithm that used n processors in the more powerful CREW PRAM model. Furthermore, we show how our solution can be used to solve the multiple vertex updating problem: Update a given MST when k new vertices are introduced simultaneously. This problem is solved in O(lg k · lg n) parallel time using (k · n)/(lg k · lg n) EREW PRAM processors. This is optimal for graphs having Ω(kn) edges. Part of this work was done while P. Metaxas was with the Department of Mathematics and Computer Science, Dartmouth College.
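The problem statement above can be made concrete with a simple sequential baseline. By the cycle property, an edge of G that is not in the old MST never needs to enter the new one, so it suffices to compute an MST over the old tree edges plus the edges incident to the new vertex. The sketch below does this with Kruskal's algorithm and a union-find structure. It takes O(n log n) time because of the sort, so it is neither the linear-time sequential algorithm nor the tree-contraction rules of the paper; the function name, the vertex labelling 0..n-1, and the edge formats are assumptions.

```python
def update_mst(n, tree_edges, new_edges):
    """Return MST edges after adding a new vertex labelled n.

    n          -- number of vertices in the old graph (labelled 0..n-1)
    tree_edges -- old MST as (u, v, weight) triples
    new_edges  -- edges from the new vertex as (v, weight) pairs
    """
    parent = list(range(n + 1))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    # Only old tree edges and the new vertex's edges can appear in the new MST.
    candidates = list(tree_edges) + [(n, v, w) for v, w in new_edges]
    candidates.sort(key=lambda edge: edge[2])

    result = []
    for u, v, w in candidates:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # the edge joins two components, so keep it
            parent[root_u] = root_v
            result.append((u, v, w))
    return result
```

For instance, with an old tree `[(0, 1, 1), (1, 2, 5)]` and a new vertex 3 joined to vertex 0 with weight 2 and to vertex 2 with weight 1, the heavy edge `(1, 2, 5)` is displaced and the new tree has total weight 4.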

16.

A parallel algorithm for approximate regularity

Laurence Boxer Russ Miller 《Information Processing Letters》2001,80(6):311-316

Spatial regularity amidst a seemingly chaotic image is often meaningful. Many papers in computational geometry are concerned with detecting some type of regularity via exact solutions to problems in geometric pattern recognition. However, real-world applications often have data that is approximate, and may rely on calculations that are approximate. Thus, it is useful to develop solutions that have an error tolerance.

A solution has recently been presented by Robins et al. [Inform. Process. Lett. 69 (1999) 189–195] to the problem of finding all maximal subsets of an input set in the Euclidean plane that are approximately equally-spaced and approximately collinear. This is a problem that arises in computer vision, military applications, and other areas. The algorithm of Robins et al. is different in several important respects from the optimal algorithm given by Kahng and Robins [Pattern Recognition Lett. 12 (1991) 757–764] for the exact version of the problem. The algorithm of Robins et al. seems inherently sequential and runs in O(n^(5/2)) time, where n is the size of the input set. In this paper, we give parallel solutions to this problem.

17.

An NC algorithm for finding a minimum weighted completion time schedule on series parallel graphs

S. Sunder Xin He 《Algorithmica》1996,16(3):243-262

We present a parallel algorithm for solving the minimum weighted completion time scheduling problem for transitive series parallel graphs. The algorithm takes O(log² n) time with O(n³) processors on a CREW PRAM, where n is the number of vertices of the input graph. This is the first NC algorithm for solving the problem. Research supported in part by NSF Grants CCR-9011214 and CCR-9205982.

18.

Randomized incremental construction of Delaunay and Voronoi diagrams

Leonidas J. Guibas Donald E. Knuth Micha Sharir 《Algorithmica》1992,7(1):381-413

In this paper we give a new randomized incremental algorithm for the construction of planar Voronoi diagrams and Delaunay triangulations. The new algorithm is more on-line than earlier similar methods, takes expected time O(n log n) and space O(n), and is eminently practical to implement. The analysis of the algorithm is also interesting in its own right and can serve as a model for many similar questions in both two and three dimensions. Finally we demonstrate how this approach for constructing Voronoi diagrams obviates the need for building a separate point-location structure for nearest-neighbor queries. Leonidas Guibas and Micha Sharir wish to acknowledge the generous support of the DEC Systems Research Center in Palo Alto, California, where some of this work was carried out. Donald Knuth has been supported by NSF Grant CCR-86-10181. Micha Sharir has been supported by NSF Grant CCR-89-01484, ONR Grant N00014-K-87-0129, the U.S.-Israeli Binational Science Foundation, and the Fund for Basic Research administered by the Israeli Academy of Sciences.

19.

Unified parallel encoding and decoding algorithms for Dandelion-like codes

Saverio Caminiti Rossella Petreschi 《Journal of Parallel and Distributed Computing》2010

The Dandelion-like codes are eight bijections between labeled trees and strings of node labels. The literature contains optimal sequential algorithms for these bijections, but no parallel algorithms have been reported. In this paper the first parallel encoding and decoding algorithms for Dandelion-like codes are presented. Namely, a unique encoding algorithm and a unique decoding algorithm, which when properly parameterized, can be used for all Dandelion-like codes, are designed. These algorithms are optimal in the sequential setting. The encoding algorithm implementation on an EREW PRAM is optimal, while the efficient implementation of the decoding algorithm requires concurrent reading.

20.

The accelerated centroid decomposition technique for optimal parallel tree evaluation in logarithmic time

Richard Cole Uzi Vishkin 《Algorithmica》1988,3(1):329-346

A new general parallel algorithmic technique for computations on trees is presented. In particular, it provides the first n/log n processor, O(log n)-time deterministic EREW PRAM algorithm for expression tree evaluation. The technique solves many other tree problems within the same complexity bounds. Richard Cole was supported in part by NSF Grants DCR-84-01633 and CCR-8702271, ONR Grant N00014-85-K-0046 and by an IBM faculty development award. Uzi Vishkin was supported in part by NSF Grants NSF-CCR-8615337 and NSF-DCR-8413359, ONR Grant N00014-85-K-0046, by the Applied Mathematical Science subprogram of the office of Energy Research, U.S. Department of Energy under Contract DE-AC02-76ER03077 and the Foundation for Research in Electronics, Computers and Communication, administered by the Israeli Academy of Sciences and Humanities.