首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 0 毫秒
1.
Practical parallel algorithms, based on classical sequential Union-Find algorithms for computing transitive closures of binary relations, are described and implemented for both shared memory and distributed memory parallel computers. By practical algorithms, we mean algorithms that are efficient for parallel systems with bounded numbers of processors as opposed to algorithms where the number of processors grows with the problem size. Transitive closures are useful for decomposing many applications problems into independent subproblems. The implementations were on an ENCORE Multimax shared memory machine and an NCUBE hypercube. Our implementations indicate that transitive closure computations are intrinsically difficult for distributed memory parallel machines because of the need for global information. By contrast, our results for shared memory machines exhibited excellent speedups.Supported in part by NSF Grant DCR-8619103, ONR contract N000-86-G-0202 and DOE Grant DE-FG02-85ER25001.Supported in part by RADC contract F30602-85-C-0303.Supported in part by RADC contract F30602-85-C-0303.  相似文献   

2.
The paper contains results of computational experiences with the following algorithms for finding the transitive closure of a digraph: (i) Warshall's algorithm [17], (ii) Purdom's algorithm [13], (iii) the modification of Yen's algorithm [14], and (iv) the new algorithms for finding the transitive closure [3, 4]. The tested digraphs were generated at random. The enclosed references contain all papers known to the authors concerning transitive closure algorithms.  相似文献   

3.
4.
The k-clique problem is a cornerstone of NP-completeness and parametrized complexity. When k is a fixed constant, the asymptotically fastest known algorithm for finding a k-clique in an n-node graph runs in O(n0.792k) time (given by Nešet?il and Poljak). However, this algorithm is infamously inapplicable, as it relies on Coppersmith and Winograd's fast matrix multiplication.We present good combinatorial algorithms for solving k-clique problems. These algorithms do not require large constants in their runtime, they can be readily implemented in any reasonable random access model, and are very space-efficient compared to their algebraic counterparts. Our results are the following:
We give an algorithm for k-clique that runs in O(nk/(εlogn)k−1) time and O(nε) space, for all ε>0, on graphs with n nodes. This is the first algorithm to take o(nk) time and O(nc) space for c independent of k.
Let k be even. Define a k-semiclique to be a k-node graph G that can be divided into two disjoint subgraphs U={u1,…,uk/2} and V={v1,…,vk/2} such that U and V are cliques, and for all i?j, the graph G contains the edge {ui,vj}. We give an time algorithm for determining if a graph has a k-semiclique. This yields an approximation algorithm for k-clique, in the following sense: if a given graph contains a k-clique, then our algorithm returns a subgraph with at least 3/4 of the edges in a k-clique.
  相似文献   

5.
Dynamic redistribution of arrays is required very often in programs on distributed presents efficient algorithms for redistribution between different cyclic(k) distributions, as defined in High Performance Fortran. We first propose special optimized algorithms for a cyclic(x) to cyclic(y) redistribution when x is a multiple of y, or y is a multiple of x. We then propose two algorithms, called the GCD method and the LCM method, for the general cyclic(x) to cyclic(y) redistribution when there is no particular relation between x and y. We have implemented these algorithms on the Intel Touchstone Delta, and find that they perform well for different array sizes and number of processors  相似文献   

6.
7.
Suppose we are given a set S of n (possibly intersecting) simple objects in the plane such that, for every pair of objects in S, the intersection of the boundaries of these two objects has O(1) connected components. We consider the problem of determining whether there exists a straight line that goes through every object in S. We give an O(n log n γ (n)) time algorithm for this problem, where γ(n) is a very slowly growing function of n. In many cases, our algorithm runs in O(n log n) time. Previously, only special cases of this problem were considered: the case when every object is a straight-line segment (Edelsbrunner et al., 1982), the case when the objects are equal-radius circles (Bajaj and Li, 1983), and the case when objects all maintain the same orientation (Edelsbrunner, 1985). All these cases follow from our general approach, which places no constraints on the size and/or configuration of the objects in S.  相似文献   

8.
9.
Efficient algorithms for the soft morphological operators   总被引:5,自引:0,他引:5  
This correspondence presents two soft morphological algorithms that process multiple images simultaneously. The first algorithm performs best when the structuring elements contain less than 19 points; whereas, the second algorithm should be used for larger structuring elements. Theoretical and experimental analyses show these algorithms are faster than the conventional algorithm  相似文献   

10.
Wang  Jie  Li  Linge  Zhi  Guoming  Zhang  Zuping  Zhang  Hao 《Multimedia Tools and Applications》2017,76(24):26581-26601
Multimedia Tools and Applications - HEVC, which is the latest video coding standard, resulting in much higher compression efficiency than any previous standards. It is expected to take the place of...  相似文献   

11.
A computer network serves distributed applications by communicating messages between their remote ends. Many such applications desire minimal delay for their messages. Beside this efficiency objective, allocation of the network capacity is also subject to the fairness constraint of not shutting off communication for any individual message. Processor Sharing (PS) is a de facto standard of fairness but provides significantly higher average delay than Shortest Remaining Processing Time (SRPT), which is an optimally efficient but unfair algorithm. In this paper, we explore efficient fair algorithms for message communication where fairness means that no message is delivered later than under PS. First, we introduce a slack system to characterize fair algorithms completely and develop efficient fair algorithms called Pessimistic Fair Sojourn Protocol (PFSP), Optimistic Fair Sojourn Protocol (OFSP), and Shortest Fair Sojourn (SFS). Then, we prove that a fair online algorithm does not assure minimal average delay attainable with fairness. Our analysis also reveals lower bounds on worst-case inefficiency of fair algorithms. We conduct extensive simulations for various distributions of message sizes and arrival times. During either temporary overload or steady-state operation, SFS and other newly proposed fair algorithms support SRPT-like efficiency and consistently provide much smaller average delay than PS.  相似文献   

12.
13.
COUPL+ is a programming environment for applications using unstructured and hybrid grids for numerical simulations. It automates parallelization by handling the partitioning of data and dependent data and maintaining halo interfaces and copy coherency. We explore some algorithms behind this package. A multi-level partitioning method is described which is effective in the presence of skewed data, solving the multi-set median-finding problem. Partitioning elements over a set of pre-partitioned nodes is explored and a novel method is suggested for reducing communication in the resulting distribution.  相似文献   

14.
Efficient parallel algorithms for graph problems   总被引:1,自引:0,他引:1  
We present an efficient technique for parallel manipulation of data structures that avoids memory access conflicts. That is, this technique works on the Exclusive Read/Exclusive Write (EREW) model of computation, which is the weakest shared memory, MIMD machine model. It is used in a new parallel radix sort algorithm that is optimal for keys whose values are over a small range. Using the radix sort and known results for parallel prefix on linked lists, we develop parallel algorithms that efficiently solve various computations on trees and unicycular graphs. Finally, we develop parallel algorithms for connected components, spanning trees, minimum spanning trees, and other graph problems. All of the graph algorithms achieve linear speedup for all but the sparsest graphs.Part of this work was done while the first author was at the University of Illinois, Urbana-Champaign, the second author was at Carnegie-Mellon University, and the third author was at the Hebrew University and the Courant Institute of Mathematical Sciences, New York University. A preliminary version of this work was presented at the 1986 International Conference on Parallel Processing.  相似文献   

15.
We present four polylog-time parallel algorithms for matching parentheses on an exclusive-read and exclusive-write (EREW) parallel random-access machine (PRAM) model. These algorithms provide new insights into the parentheses-matching problem. The first algorithm has a time complexity of O(log2 n) employing O(n/(log n)) processors for an input string containing n parentheses. Although this algorithm is not cost-optimal, it is extremely simple to implement. The remaining three algorithms, which are based on a different approach, achieve O(log n) time complexity in each case, and represent successive improvements. The second algorithm requires O(n) processors and working space, and it is comparable to the first algorithm in its ease of implementation. The third algorithm uses O(n/(log n)) processors and O(n log n) space. Thus, it is cost-optimal, but uses extra space compared to the standard stack-based sequential algorithm. The last algorithm reduces the space complexity to O(n) while maintaining the same processor and time complexities. Compared to other existing time-optimal algorithms for the parentheses-matching problem that either employ extensive pipelining or use linked lists and comparable data structures, and employ sorting or a linked list ranking algorithm as subroutines, the last two algorithms have two distinct advantages. First, these algorithms employ arrays as their basic data structures, and second, they do not use any pipelining, sorting, or linked list ranking algorithms  相似文献   

16.
Efficient algorithms for large-scale temporal aggregation   总被引:2,自引:0,他引:2  
The ability to model time-varying natures is essential to many database applications such as data warehousing and mining. However, the temporal aspects provide many unique characteristics and challenges for query processing and optimization. Among the challenges is computing temporal aggregates, which is complicated by having to compute temporal grouping. We introduce a variety of temporal aggregation algorithms that overcome major drawbacks of previous work. First, for small-scale aggregations, both the worst-case and average-case processing time have been improved significantly. Second, for large-scale aggregations, the proposed algorithms can deal with a database that is substantially larger than the size of available memory. Third, the parallel algorithm designed on a shared-nothing architecture achieves scalable performance by delivering nearly linear scale-up and speed-up, even at the presence of data skew. The contributions made in this paper are particularly important because the rate of increase in database size and response time requirements has out-paced advancements in processor and mass storage technology.  相似文献   

17.
The calculation of a perspective image of an object involves the position and orientation of the viewer. In many graphics applications the viewpoint and object are fixed, and an orientation is sought in which the object is “centered” in the field of view. Previous work has proposed that the viewing direction be the axis of the narrowest circular cone emanating from the viewpoint and containing the object, and has shown how this direction can be calculated based on the viewpoint and a set of n object points which includes the object's extreme points. This paper presents algorithms which efficiently accomplish this task, in O(n) average and O(n log(n)) worst-case time. The orientation problem is converted into a problem in spherical geometry, and the proposed algorithms are based on existing algorithms for the analogous plane geometry problems.  相似文献   

18.
We present an efficient technique for parallel manipulation of data structures that avoids memory access conflicts. That is, this technique works on the Exclusive Read/Exclusive Write (EREW) model of computation, which is the weakest shared memory, MIMD machine model. It is used in a new parallel radix sort algorithm that is optimal for keys whose values are over a small range. Using the radix sort and known results for parallel prefix on linked lists, we develop parallel algorithms that efficiently solve various computations on trees and “unicycular graphs.” Finally, we develop parallel algorithms for connected components, spanning trees, minimum spanning trees, and other graph problems. All of the graph algorithms achieve linear speedup for all but the sparsest graphs.  相似文献   

19.
Efficient scheduling algorithms based on heuristic functions are developed for scheduling a set of tasks on a multiprocessor system. The tasks are characterized by worst-case computation times, deadlines, and resources requirements. Starting with an empty partial schedule, each step of the search extends the current partial schedule by including one of the tasks yet to be scheduled. The heuristic functions used in the algorithm actively direct the search for a feasible schedule, i.e. they help choose the task that extends the current partial schedule. Two scheduling algorithms are evaluated by simulation. To extend the current partial schedule, one of the algorithms considers, at each step of the search, all the tasks that are yet to be scheduled as candidates. The second focuses its attention on a small subset of tasks with the shortest deadlines. The second algorithm is shown to be very effective when the maximum allowable scheduling overhead is fixed. This algorithm is hence appropriate for dynamic scheduling in real-time systems  相似文献   

20.
We consider the problem of efficiently performing a reduce-scatter operation in a message passing system. Reduce-scatter is the composition of an element-wise reduction on vectors of n elements initially held by n processors, with a scatter of the resulting vector among the processors. In this paper, we present two algorithms for the reduce-scatter operation, designed in LogGP. The first algorithm assumes an associative and commutative reduction operator and it is optimal in LogGP within a small constant factor. The second algorithm allows the reduction operator to be noncommutative, and it is asymptotically optimal when values to be combined are large arrays. To achieve these results, we developed a complete analysis of both algorithms in LogGP, including the derivation of lower bounds for the reduce-scatter operation, and the study of the m-item version of the problem, i.e., the case when the initial elements are vectors themselves. Reduce-scatter has been included as a collective operation in the MPI standard message passing library, and can be used, for instance, in parallel matrix-vector multiply when the matrix is decomposed by columns. To model a message passing system, we adopted the LogGP model, an extension of LogP that allows the modeling of messages of different length. While this choice makes the analysis somewhat more complex, it leads to more realistic results in the case of gather/scatter algorithms  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号