期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

An adjustable linear time parallel algorithm for maximum weight bipartite matching

Morteza Fayyazi David Kaeli Waleed Meleis 《Information Processing Letters》2006,97(5):186-190

We present a parallel algorithm for finding a maximum weight matching in general bipartite graphs with an adjustable time complexity of using O(nmax(2ω,4+ω)) processing elements for ω?1. Parameter ω is not bounded. This is the fastest known strongly polynomial parallel algorithm to solve this problem. This is also the first adjustable parallel algorithm for the maximum weight bipartite matching problem in which the execution time can be reduced by an unbounded factor. We also present a general approach for finding efficient parallel algorithms for the maximum matching problem. 相似文献

2.

Efficient parallel processing with spin-wave nanoarchitectures

Mary M. Eshaghian-Wilner Shiva Navab 《The Journal of supercomputing》2009,49(2):248-267

In this paper, we study the algorithm design aspects of three newly developed spin-wave architectures. The architectures are capable of simultaneously transmitting multiple signals using different frequencies, and allow for concurrent read/write operations. Using such features, we show a number of parallel and fault-tolerant routing schemes and introduce a set of generic parallel processing techniques that can be used for design of fast algorithms on these spin-wave architectures. We also present a set of application examples to illustrate the operation of the proposed generic parallel techniques.

Mary M. Eshaghian-WilnerEmail:

相似文献

3.

Efficient parallel algorithms for path problems in directed graphs

Joan M. Lucas Marian Gunsher Sackrowitz 《Algorithmica》1992,7(1):631-648

In this paper we describe a technique for finding efficient parallel algorithms for problems on directed graphs that involve checking the existence of certain kinds of paths in the graph. This technique provides efficient algorithms for finding dominators in flow graphs, performing interval and loop analysis on reducible flow graphs, and finding the feedback vertices of a digraph. Each of these algorithms takesO(log² n) time using the same number of processors needed for fast matrix multiplication. All of these bounds are for an EREW PRAM. 相似文献

4.

Derivation of a parallel string matching algorithm

Jayadev Misra 《Information Processing Letters》2003,85(5):255-260

We derive an efficient parallel algorithm to find all occurrences of a pattern string in a subject string in O(logn) time, where n is the length of the subject string. The number of processors employed is of the order of the product of the two string lengths. The theory of powerlists [J. Kornerup, PhD Thesis, 1997; J. Misra, ACM Trans. Programming Languages Systems 16 (6) (1994) 1737-1740] is central to the development of the algorithm and its algebraic manipulations. 相似文献

5.

Efficient parallel algorithms for numerical simulation

David Mike 《Future Generation Computer Systems》2001,17(8):961-967

COUPL+ is a programming environment for applications using unstructured and hybrid grids for numerical simulations. It automates parallelization by handling the partitioning of data and dependent data and maintaining halo interfaces and copy coherency. We explore some algorithms behind this package. A multi-level partitioning method is described which is effective in the presence of skewed data, solving the multi-set median-finding problem. Partitioning elements over a set of pre-partitioned nodes is explored and a novel method is suggested for reducing communication in the resulting distribution. 相似文献

6.

Fast parallel algorithms for graph similarity and matching

Giorgos Kollias Madan Sathe Olaf Schenk Ananth Grama 《Journal of Parallel and Distributed Computing》2014

This paper addresses the problem of global graph alignment on supercomputer-class clusters. We define the alignment of two graphs, as a mapping of each vertex in the first graph to a unique vertex in the second graph so as to optimize a given similarity-based cost function.¹ Using a state of the art serial algorithm for the computation of vertex similarity scores called Network Similarity Decomposition (NSD), we derive corresponding parallel formulations. Coupling this parallel similarity algorithm with a parallel auction-based bipartite matching technique, we obtain a highly efficient and scalable graph matching pipeline. We validate the performance of our integrated approach on a large parallel platform and on diverse graph instances (including Protein Interaction, Wikipedia and Web networks). Experimental results demonstrate that our algorithms scale to large machine configurations (thousands of cores) and problem instances, enabling the alignment of networks of sizes two orders of magnitude larger than reported in the current literature. 相似文献

7.

Approximating weighted matchings in parallel

Stefan Hougardy Doratha E. Vinkemeier 《Information Processing Letters》2006,99(3):119-123

We present an NC approximation algorithm for the weighted matching problem in graphs with an approximation ratio of (1−ε). This improves the previously best approximation ratio of of an NC algorithm for this problem. 相似文献

8.

Efficient parallel algorithms for computing all pair shortest paths in directed graphs

Yijie Han V. Y. Pan J. H. Reif 《Algorithmica》1997,17(4):399-415

We present parallel algorithms for computing all pair shortest paths in directed graphs. Our algorithm has time complexityO(f(n)/p+I(n)logn) on the PRAM usingp processors, whereI(n) is logn on the EREW PRAM, log logn on the CCRW PRAM,f(n) iso(n ³). On the randomized CRCW PRAM we are able to achieve time complexityO(n ³/p+logn) usingp processors. A preliminary version of this paper was presented at the 4th Annual ACM Symposium on Parallel Algorithms and Architectures, June 1992. Support by NSF Grant CCR 90-20690 and PSC CUNY Awards #661340 and #662478. 相似文献

9.

Research on the inertia matching of the Stewart parallel manipulator

Zhu-Feng Shao Xiaoqiang TangXu Chen Li-Ping Wang 《Robotics and Computer》2012

With the development of the parallel manipulator, inertia matching as an essential factor to realize good potentials of the parallel manipulator is taken serious gradually. However, neither definite inertia index nor inertia matching method has been proposed so far. In this paper, the above issues are discussed by taking the Stewart parallel manipulator as a study object. Firstly, adopting limb Jacobian matrices, the concise algebraic expression of the joint–space inertia matrix of the Stewart parallel manipulator is deduced, based on the dynamic modeling. Next, on the basis of the coupling analysis of the joint–space inertia matrix, the inertia index of the parallel manipulator, the Joint-Reflected Inertia, is proposed. Then, the practical inertia matching principles of the Stewart parallel manipulator are concluded on the basis of simulations, considering multiple factors, such as mechanical resonance frequency, acceleration torque and dynamic performance. Finally, the available range of the motor inertia is deduced, and the inertia matching of the Stewart parallel manipulator is finished as the case study. The inertia index and inertia matching method suggested in this paper can be further used in other parallel manipulators for dynamic analysis and motion system design. 相似文献

10.

Efficient parallel algorithms can be made robust

Paris C. Kanellakis Alex A. Shvartsman 《Distributed Computing》1992,5(4):201-217

Summary The efficient parallel algorithms proposed for many fundamental problems, such as list ranking, integer sorting and computing preorder numberings on trees, are very sensitive to processor failures. The requirement of efficiency (commonly formalized usingParallel-timexProcessors as a cost measure) has led to the design of highly tuned PRAM algorithms which, given the additional constraint of simple processor failures, unfortunately become inefficient or even incorrect. We propose a new notion ofrobustness, that combines efficiency with fault tolerance. For the common case of fail-stop errors, we develop a general and easy to implement technique to make robust many efficient parallel algorithms, e.g., algorithms for all the problems listed above. More specifically,for any dynamic pattern of fail-stop errors on a CRCW PRAMwith at least one surviving processor, our method increases the original algorithm cost by at most a log² multiplicative factor. Our technique is based on a robust solution of the problem ofWrite-All, i.e., usingP processors, write 1's in all locations of anN-sized array. In addition we show that at least a log/log log multiplicative overhead will be incurred for certain patterns of failures by any algorithm that implements robust solutions toWrite-All withP=N. However, by exploiting parallel slackness, we obtain an optimal cost algorithm when Paris C. Kanellakis is a professor of computer science at Brown University. His primary research interests are in the application of logic to computer science, such as high-level query languages for database systems, parallel evaluation of logic programs, and type inference for programming languages. He has published many articles on these subjects and is the author of the chapter Elements of Relational Database Theory in theHandbook of Theoretical Computer Science (Elsevier, 1990). He has also contributed to the theory of distributed computing and to combinatorial optimization. Alex Allister Shvartsman is an engineer at Digital Equipment Corporation. His professional interests include design and development of efficient distributed systems, distributed resource management, and theoretical foundations of fault-tolerant parallel computation. At Digital he architected and managed the development of distributed control systems that automated several of Digital's manufacturing processes. He is currently on an academic leave at Brown University.An extended abstract of a part of this work appears as: Kanellakis and Shvartsman (1989) in the Proceedings of the 8th ACM Symposium on Principles of Distributed Computing, Edmonton 1989This author was supported by NSF grant IRI-8617344, an Alfred P. Sloan fellowship and ONR grant N00014-83-K-0146 ARPA Order No. 4786This author was supported by DEC GEEP Doctoral program and ONR grant N00014-91-J-1613 相似文献

11.

A parallel hybrid greedy branch and bound scheme for the maximum distance-2 matching problem

Ioannis T. Christou Spyridon Vassilaras 《Computers & Operations Research》2013

We present a new highly parallel algorithm for fast determination of near-optimal solutions to the NP-hard problem of identifying a maximum distance-2 matching in arbitrary graphs. This problem, known as D2EMIS, has important applications such as determining the maximum capacity of the media access (MAC) layer in wireless ad-hoc networks [1]. It can also be seen as a maximum 2-packing problem [2] on the edge-to-vertex dual graph of the original graph. Our algorithm extends the GRASP [3] philosophy in that partial solutions are constructed by adding in a greedy adaptive manner the “best” nodes that can be found; however, when there are multiple alternatives that can be selected in an iteration, the algorithm branches into as many paths as there are (greedy) alternatives. The algorithm, using appropriate bounds to prune partial solutions that cannot be optimal, produces very fast near-optimal solutions that compare very well against other distributed algorithms and random greedy heuristics proposed before or variants thereof, or exact methods (Integer Programming or Maximum Satisfiability state-of-the-art solvers). 相似文献

12.

Efficient parallel algorithms for graph problems 总被引：1，自引：0，他引：1

Clyde P. Kruskal Larry Rudolph Marc Snir 《Algorithmica》1990,5(1):43-64

We present an efficient technique for parallel manipulation of data structures that avoids memory access conflicts. That is, this technique works on the Exclusive Read/Exclusive Write (EREW) model of computation, which is the weakest shared memory, MIMD machine model. It is used in a new parallel radix sort algorithm that is optimal for keys whose values are over a small range. Using the radix sort and known results for parallel prefix on linked lists, we develop parallel algorithms that efficiently solve various computations on trees and unicycular graphs. Finally, we develop parallel algorithms for connected components, spanning trees, minimum spanning trees, and other graph problems. All of the graph algorithms achieve linear speedup for all but the sparsest graphs.Part of this work was done while the first author was at the University of Illinois, Urbana-Champaign, the second author was at Carnegie-Mellon University, and the third author was at the Hebrew University and the Courant Institute of Mathematical Sciences, New York University. A preliminary version of this work was presented at the 1986 International Conference on Parallel Processing. 相似文献

13.

Processor Efficient Parallel Matching

Piotr Sankowski 《Theory of Computing Systems》2008,42(1):73-90

In this paper we consider the problem of finding perfect matchings in parallel. We present a RNC algorithm with almost optimal work with respect to sequential algorithms, i.e., it uses O(n ^ω) processors, where ω is the matrix multiplication exponent. Our algorithm is based on an RNC algorithm for computing determinant of a degree one polynomial matrix which is of independent interest. Research supported by KBN grant 1P03A01830. 相似文献

14.

Efficient algorithms for parallel sorting on mesh multicomputers

V. Singh V. Kumar G. Agha C. Tomlinson 《International journal of parallel programming》1991,20(2):95-131

We present two new parallel algorithms QSP1 and QSP2 based on sequential quicksort for sorting data on a mesh multicomputer, and analyze their scalability using the isoefficiency metric. We show that QSP2 matches the lower bound on the isoefficiency function for mesh multicomputers, while QSP1 is fairly close to optimal. Langet al. ⁽¹⁾ and Schnorret al. ⁽²⁾ have developed parallel sorting algorithms for the mesh architecture that have either optimal (Schnorr) or close to optimal (Lang) run-time complexity for the one-element-perprocessor case. Both QSP1 and QSP2 have better scalability than the scaled-down variants of these algorithms (for the case in which there are more elements than processors). We also analyze a different variant of Lang's sort which is as scalable as QSP2. We briefly discuss another metric called resource consumption. According to this metric, both QSP1 and QSP2 are superior to variants of Lang's sort. 相似文献

15.

A genetic algorithm and its parallelization for graph matching with similarity measures

Y. Wang N. Ishii 《Artificial Life and Robotics》1998,2(2):68-73

Graph matching and similarity measures of graphs have many applications to pattern recognition, machine vision in robotics, and similarity-based approximate reasoning in artificial intelligence. This paper proposes a method of matching and a similarity measure between two directed labeled graphs. We define the degree of similarity, the similar correspondence, and the similarity map which denotes the matching between the graphs. As an approximate computing method, we apply genetic algorithms (GA) to find a similarity map and compute the degree of similarity between graphs. For speed, we make parallel implementations in almost all steps of the GA. We have implemented the sequential GA and the parallel GA in C programs, and made simulations for both GAs. The simulation results show that our method is efficient and useful. This work was presented, in part, at the Second International Symposium on Artificial Life and Robotics, Oita Japan, February 18–20, 1997 相似文献

16.

Approximate swapped matching

Amihood Amir Moshe Lewenstein Ely Porat 《Information Processing Letters》2002,83(1):33-39

相似文献

17.

Efficient parallel simulation of a sliding window protocol

A. S. I. 《Performance Evaluation》2002,48(1-4):237-246

Two parallel algorithms for simulating a sliding window communication protocol are described. The first exploits the parallel generation of interarrival, service and transfer times, whilst the second is based on parallel prefix computations. Both algorithms allow the processors to work largely asynchronously with each other. The efficiency of the two approaches is evaluated experimentally, using an implementation on a cluster of 12 processors connected by fast Ethernet. 相似文献

18.

An improved parallel algorithm for integer GCD

Benny Chor Oded Goldreich 《Algorithmica》1990,5(1):1-10

We present a simple parallel algorithm for computing the greatest common divisor (gcd) of twon-bit integers in the Common version of the CRCW model of computation. The run-time of the algorithm in terms of bit operations isO(n/logn), usingn ¹⁺ processors, where is any positive constant. This improves on the algorithm of Kannan, Miller, and Rudolph, the only sublinear algorithm known previously, both in run time and in number of processors; they requireO(n log logn/logn),n ² log² n, respectively, in the same CRCW model.We give an alternative implementation of our algorithm in the CREW model. Its run-time isO(n log logn/logn), usingn ¹⁺ processors. Both implementations can be modified to yield the extended gcd, within the same complexity bounds.Supported in part by an IBM Graduate Fellowship and a Bantrell Postdoctoral Fellowship.Supported in part by a Weizmann Postdoctoral Fellowship.4 All logarithms are to base 2. 相似文献

19.

A new self-stabilizing maximal matching algorithm

Fredrik Manne Morten Mjelde Laurence Pilard Sébastien Tixeuil 《Theoretical computer science》2009

The maximal matching problem has received considerable attention in the self-stabilizing community. Previous work has given several self-stabilizing algorithms that solve the problem for both the adversarial and the fair distributed daemon, the sequential adversarial daemon, as well as the synchronous daemon. In the following we present a single self-stabilizing algorithm for this problem that unites all of these algorithms in that it has the same time complexity as the previous best algorithms for the sequential adversarial, the distributed fair, and the synchronous daemon. In addition, the algorithm improves the previous best time complexities for the distributed adversarial daemon from O(n²)

O (n^{2})

and O(δm)

O (δ m)

to O(m)

O (m)

where n

n

is the number of processes, m

m

is the number of edges, and δ

δ

is the maximum degree in the graph. 相似文献

20.

Deterministic parallel list ranking

Richard J. Anderson Gary L. Miller 《Algorithmica》1991,6(1):859-868

In this paper we describe a simple parallel algorithm for list ranking. The algorithm is deterministic and runs inO(logn) time on an EREW PRAM withn/logn processors. The algorithm matches the performance of the Cole-Vishkin [CV3] algorithm but is simple and has reasonable constant factors.R. J. Anderson was supported by an NSF Presidential Young Investigator award and G. L. Miller was supported by NSF Grant DCR-85114961. 相似文献