
1.

Marian Vajteršic 《Parallel Computing》1984,1(3-4):325-330

The paper presents parallel algorithms for solving the Poisson equation at N² mesh points. The methods, based on marching techniques, are structured for efficient parallel realization. Using orthogonal decomposition properties of the arising matrices, the algorithms can be formulated in terms of transformed vectors. On a MIMD computer with no more than N processors, the computations can be performed in horizontal slices with minimal synchronization requirements. For an SIMD machine with N² processors, the complexity bound O(log N) has been achieved, with a single marching requiring only 10 log N steps.

2.

Functionally reconfigurable general purpose parallel machines and some image processing and pattern recognition applications

Nikola K Kasabov 《Pattern recognition letters》1985,3(3):215-223

Functionally reconfigurable general purpose parallel machines (FRPM) can be reconfigured during operation from SIMD to MIMD mode or vice versa (first aspect) and from one interconnection network to another according to the data storing order (second aspect). General purpose machines are considered in order to obtain an arbitrary data exchange between the processing elements they are built of. A model for describing such interconnection networks is presented. A full-information exchange network is introduced which can be reconfigured by programming into a tree, matrix, cube, linear-neighbourhood, or FFT network. Some schemes for constructing SIMD/MIMD reconfigurable machines are given. The usefulness of FRPM for image processing and pattern recognition is discussed.

3.

An optimal scheduling procedure for matrix inversion on linear array at a processor level

M. K. Stojčev E. I. Milovanović I. Ž. Milovanović 《International journal of parallel programming》1994,22(4):435-448

This paper presents a parallel algorithm for inverting a dense matrix, based on a modified Jordan elimination that requires fewer calculation steps than the standard one. The algorithm is proposed for implementation on a linear array with a small to moderate number of processors which operate in a parallel-pipeline fashion. Communication between neighboring processors is achieved by a common memory module implemented as a FIFO memory. For the proposed algorithm we define a task scheduling procedure and prove that it is time optimal. In order to compute the speedup and efficiency of the system, two definitions (Amdahl's and Gustafson's) were used. For the proposed architecture, involving two to 16 processors, the estimated Gustafson's (Amdahl's) speedups are in the range 1.99 to 13.76 (1.99 to 9.69).
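The two speedup definitions named in this abstract can be compared with a small sketch. It uses the textbook forms of Amdahl's and Gustafson's laws; the function names and the parallel fraction 0.95 are illustrative assumptions, not values from the paper, and the fraction is measured differently in each law (against the serial run time for Amdahl, against the parallel run time for Gustafson).

```python
def amdahl_speedup(parallel_fraction: float, processors: int) -> float:
    """Fixed-size speedup: 1 / ((1 - f) + f / p)."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


def gustafson_speedup(parallel_fraction: float, processors: int) -> float:
    """Scaled speedup: (1 - f) + f * p."""
    return (1.0 - parallel_fraction) + parallel_fraction * processors


if __name__ == "__main__":
    f = 0.95  # assumed parallel fraction, chosen only for illustration
    for p in (2, 4, 8, 16):
        print(p, round(amdahl_speedup(f, p), 2), round(gustafson_speedup(f, p), 2))
```

For the same fraction, the Gustafson figure grows linearly with the processor count while the Amdahl figure saturates, which matches the direction of the gap between the two ranges quoted above.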

4.

Parallel routing algorithms for incomplete hypercube interconnection networks (total citations: 1; self-citations: 0; citations by others: 1)

M. S. Horng D. J. Chen Kuo-Lung Ku 《Parallel Computing》1994,20(12):1739-1761

Hypercube interconnection networks have been receiving considerable attention in the supercomputing environment. However, the number of processors must be exactly 2^r for an r-cube complete hypercube. This restriction severely limits its applicability. In this paper, we address three variant hypercube topologies with more flexibility in system sizes, the labelled hypercubes I_m^r, I_M^r, and I_A^r. The incomplete hypercube I_m^r consists of an r-cube and an m-cube complete hypercube; I_M^r is composed of 2^r and Σ_{m ∈ M} 2^m nodes; I_A^r comes from an r-cube complete hypercube which operates in a degraded manner and allows the missing nodes to be arbitrarily distributed. Specifically, we focus on parallel-path routing algorithms for these three classes of incomplete hypercubes. Parallel paths between two given nodes are paths that have the same source and destination nodes but different intermediate nodes. Parallel communication is important because it allows the full bandwidth of the multiprocessor to be used for data transfer between any two nodes, and these redundant paths can increase system fault tolerance and communication reliability. These parallel routing algorithms can serve as a criterion for designing multiprocessor systems.
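The notion of parallel paths can be illustrated on a complete hypercube, which is a simpler setting than the incomplete topologies the paper treats. The sketch below is a standard construction, not the authors' algorithm: it flips the differing address bits in cyclically rotated order, which yields as many shortest paths as the Hamming distance, with no shared intermediate nodes. The function name is hypothetical, and missing nodes are not handled.

```python
def parallel_paths(src: int, dst: int) -> list[list[int]]:
    """Shortest paths from src to dst in a complete hypercube that share
    no intermediate node. Node labels are binary addresses; neighbours
    differ in exactly one bit."""
    diff = src ^ dst
    # Dimensions (bit positions) in which the two addresses differ.
    dims = [bit for bit in range(diff.bit_length()) if (diff >> bit) & 1]
    paths = []
    for start in range(len(dims)):
        node = src
        path = [node]
        # Correct the differing bits in cyclic order, beginning at `start`.
        for step in range(len(dims)):
            node ^= 1 << dims[(start + step) % len(dims)]
            path.append(node)
        paths.append(path)
    return paths


if __name__ == "__main__":
    for path in parallel_paths(0b000, 0b111):
        print([format(node, "03b") for node in path])
```

Each rotation flips a different cyclic run of dimensions first, so the intermediate nodes of any two paths differ. In an incomplete hypercube some of these nodes may be absent, which is the case the paper's algorithms address.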

5.

Distributed evaluation of an iterative function for all object pairs on an SIMD hypercube

Fikret Erçal 《Information Processing Letters》1991,40(6):341-345

An efficient distributed algorithm for evaluating an iterative function on all pairwise combinations of C objects on an SIMD hypercube is presented. The algorithm achieves uniform load distribution and minimal, completely local interprocessor communication.

6.

New approaches to deriving parallel algorithms

Evgenij E. Tyrtyshnikov 《Parallel Computing》1990,15(1-3):261-265

A method is proposed for converting an algorithm that admits no parallel treatment into an essentially new algorithm with much better parallel properties. The method is intended for tackling the so-called T-algorithms, a term that stems from the first examples of such algorithms, which arose in the context of Toeplitz-like matrices. Generalized T-algorithms are also considered.

7.

An optimization approach to relaxation labelling algorithms

SA Lloyd 《Image and vision computing》1983,1(2):85-91

It is shown that the relaxation labelling process of Rosenfeld, Hummel and Zucker is a suboptimal minimization of a cost function measuring inconsistency and ambiguity. Two new algorithms which minimize this cost function more efficiently are introduced. Finally, some general comments on relaxation are presented.

8.

Data reduction and fast routing: A strategy for efficient algorithms for message-passing parallel computers

Jorge L. C. Sanz Robert Cypher 《Algorithmica》1992,7(1):77-89

This paper presents several algorithms for solving problems using massively parallel SIMD hypercube and shuffle-exchange computers. The algorithms solve a wide variety of problems, but they are related because they all use a common strategy. Specifically, all of the algorithms use a divide-and-conquer approach to solve a problem with N inputs using a parallel computer with P processors. The structural properties of the problem are exploited to ensure that fewer than N data items are communicated during the division and combination steps of the divide-and-conquer algorithm. This reduction in the amount of data that must be communicated is central to the efficiency of the algorithm. This paper addresses four problems, namely the multiple-prefix, data-dependent parallel-prefix, image-component-labeling, and closest-pair problems. The algorithms presented for the data-dependent parallel-prefix and closest-pair problems are the fastest known when N P and the algorithms for the multiple-prefix and image-component-labeling problems are the fastest known when N is sufficiently large with respect to P. This work was supported in part by an NSF Graduate Fellowship.

9.

Parallel Algorithms Development for Programmable Devices with Application from Cryptography

Issam W. Damaj 《International journal of parallel programming》2007,35(6):529-572


10.

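Research and design of an SIMD multi-DSP digital image processing system

李勇 齐同斌 张瑞生 《电子技术应用》2007,33(11):71-73

Digital image processing requires a large amount of data computation and demands very high data throughput from the system. A parallel processing architecture can meet this requirement well. This paper introduces an SIMD parallel multi-DSP digital image processing system. The system avoids conflicts, processes image data continuously, keeps interprocessor communication and the I/O section simple, and has modular hardware and software.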

11.

Parallel processing, neural networks and genetic algorithms (total citations: 4; self-citations: 0; citations by others: 4)

B.H.V. Topping J. Sziveri A. Bahreinejad J.P.B. Leite B. Cheng 《Advances in Engineering Software》1998,29(10):763-786

In an earlier paper [1], some recent applications of computational technology to structural engineering were described. The developments included parallel and distributed computing, neural networks, and genetic algorithms. In this paper, the authors concentrate on parallel implementations of neural networks and genetic algorithms. In the final section of the paper the authors show how a parallel finite element analysis may be undertaken efficiently by preprocessing the finite element model using a genetic algorithm that utilizes a neural network predictor. This preprocessing partitions the finite element mesh into sub-domains to ensure load balancing and minimum interprocessor communication during the parallel finite element analysis on a MIMD distributed memory computer.

12.

Parallel merging with restriction

Hazem M. Bahig 《The Journal of supercomputing》2008,43(1):99-104

In this paper, we study the merging of two sorted arrays on an EREW PRAM under two restrictions: (1) the elements of the two arrays are taken from the integer range [1, n], where n = max(n₁, n₂); (2) the elements are taken from either a uniform distribution or a non-uniform distribution such that , for 1 ≤ i ≤ p (number of processors). We give a new optimal deterministic algorithm that runs in time using p processors on an EREW PRAM. For ; the running time of the algorithm is O(log^(g) n), which is faster than the previous results, where log^(g) n = log log^(g−1) n for g > 1 and log^(1) n = log n. We also extend the domain of the input data to [1, n^k], where k is a constant.
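The iterated logarithm in this running-time bound follows directly from the recurrence quoted in the abstract. A minimal sketch, assuming base-2 logarithms (the abstract does not state the base) and a hypothetical function name:

```python
import math


def iterated_log(n: float, g: int) -> float:
    """Return log^(g) n, where log^(1) n = log n and
    log^(g) n = log(log^(g-1) n) for g > 1. Base 2 is an assumption."""
    if g < 1:
        raise ValueError("g must be at least 1")
    value = math.log2(n)
    for _ in range(g - 1):
        # math.log2 raises ValueError once the running value is <= 0.
        value = math.log2(value)
    return value


if __name__ == "__main__":
    for g in (1, 2, 3):
        print(g, iterated_log(65536, g))
```

For n = 65536 the values fall from 16 to 4 to 2 as g increases, which shows how slowly a bound of this form grows with n.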


13.

Parallel strong orientation of an undirected graph

Mikhail J. Atallah 《Information Processing Letters》1984,18(1):37-39

A dominating set of an undirected graph G is a set D of nodes such that every node of G either is in D or is adjacent to some node of D. It is shown that the problem of finding a minimum cardinality dominating set is NP-complete for split graphs (a subclass of chordal graphs) and bipartite graphs.
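The definition of a dominating set given in this abstract translates into a short membership check. The sketch below only verifies a candidate set; it does not search for a minimum one, which the abstract states is NP-complete for the graph classes named. The adjacency-dictionary representation and the function name are assumptions made for illustration.

```python
def is_dominating_set(adjacency: dict[int, set[int]], candidate: set[int]) -> bool:
    """True if every node is in `candidate` or adjacent to a node in it.
    `adjacency` maps each node of an undirected graph to its neighbours."""
    return all(
        node in candidate or bool(neighbours & candidate)
        for node, neighbours in adjacency.items()
    )


if __name__ == "__main__":
    # Path graph 0 - 1 - 2: the middle node dominates both ends.
    path_graph = {0: {1}, 1: {0, 2}, 2: {1}}
    print(is_dominating_set(path_graph, {1}))
    print(is_dominating_set(path_graph, {0}))
```

The check runs in time linear in the size of the adjacency lists, in contrast to the hardness of finding a smallest such set.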

14.

Parallel algorithms for arrangements

R. Anderson P. Beame E. Brisson 《Algorithmica》1996,15(2):104-125

We give the first efficient parallel algorithms for solving the arrangement problem. We give a deterministic algorithm for the CREW PRAM which runs in nearly optimal bounds of O(log n log* n) time and n²/log n processors. We generalize this to obtain an O(log n log* n)-time algorithm using n^d/log n processors for solving the problem in d dimensions. We also give a randomized algorithm for the EREW PRAM that constructs an arrangement of n lines on-line, in which each insertion is done in optimal O(log n) time using n/log n processors. Our algorithms develop new parallel data structures and new methods for traversing an arrangement. This work was supported by the National Science Foundation, under Grants CCR-8657562 and CCR-8858799, NSF/DARPA under Grant CCR-8907960, and Digital Equipment Corporation. A preliminary version of this paper appeared at the Second Annual ACM Symposium on Parallel Algorithms and Architectures [3].

15.

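Solving delivery routes based on a parallel genetic algorithm

胡珂 姜麟 刘海燕 《微计算机信息》2012,(4):165-167,159

The genetic algorithm is a class of randomized search methods derived from the laws of biological evolution, and it has been applied successfully to many large-scale combinatorial optimization problems. Parallelizing the genetic algorithm on today's widely available parallel computer systems can relieve the speed bottleneck of the standard genetic algorithm. In this paper, a coarse-grained parallel genetic algorithm is implemented in C++ under the MPI parallel environment. Based on the characteristics of the parallel genetic algorithm, a strategy for optimizing logistics delivery routes is proposed together with the corresponding algorithmic procedure, and it is validated. The results show that, compared with the traditional genetic algorithm, the parallel genetic algorithm increases computation speed, reduces the average time overhead, and yields a better minimum total path value.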

16.

Parallel computational geometry

A. Aggarwal B. Chazelle L. Guibas C. Ó Dúnlaing C. Yap 《Algorithmica》1988,3(1):293-327

We present efficient parallel algorithms for several basic problems in computational geometry: convex hulls, Voronoi diagrams, detecting line segment intersections, triangulating simple polygons, minimizing a circumscribing triangle, and recursive data-structures for three-dimensional queries. The work of C. Ó Dúnlaing and C. Yap was supported by NSF Grants DCR-84-01898 and DCR-84-01633.

17.

A Parallel Implementation of the Simplex Function Minimization Routine

Donghoon Lee Matthew Wiswall 《Computational Economics》2007,30(2):171-187

This paper generalizes the widely used Nelder and Mead (Comput J 7:308–313, 1965) simplex algorithm to parallel processors. Unlike most previous parallelization methods, which are based on parallelizing the tasks required to compute a specific objective function given a vector of parameters, our parallel simplex algorithm uses parallelization at the parameter level. It assigns to each processor a separate vector of parameters corresponding to a point on a simplex. The processors then conduct the simplex search steps for an improved point and communicate the results, and a new simplex is formed. The advantage of this method is that our algorithm is generic and can be applied, without rewriting computer code, to any optimization problem to which the non-parallel Nelder–Mead is applicable. The method is also easily scalable to any degree of parallelization up to the number of parameters. In a series of Monte Carlo experiments, we show that this parallel simplex method yields computational savings in some experiments up to three times the number of processors.

18.

Optimal many-to-one routing on the mesh with constant queues

Andrea Pietracaprina 《Information Processing Letters》2005,96(1):24-29

We present randomized and deterministic algorithms for many-to-one packet routing on an n-node two-dimensional mesh under the store-and-forward model. We consider the general instance of many-to-one routing where each node is the source (resp., destination) of ℓ (resp., k) packets, for arbitrary values of ℓ and k. All our algorithms run in optimal time and use queues of only constant size at each node to store packets in transit. The randomized algorithms, however, are simpler to implement. Our result closes a gap in the literature, where time-optimal algorithms using constant-size queues were known only for the special cases ℓ=1 and ℓ=k.

19.

Generalized methods for algorithm development on optical systems

A. Al-Ayyoub A. Awwad K. Day M. Ould-Khaoua 《The Journal of supercomputing》2006,38(2):111-125

A number of recent studies have revealed that Optical Transpose Interconnection Systems (OTIS) are promising candidates for future high-performance parallel computers. In this paper, we present and evaluate two general methods for algorithm development on the OTIS. The proposed methods are general in the sense that no specific factor network or problem domain is assumed. They allow efficient mapping of a wide class of algorithms onto the OTIS. These methods are based on grids and pipelines, popular structures that support a vast body of parallel applications including linear algebra, divide-and-conquer algorithms, sorting, and FFT computation. Timing models for measuring the performance of the proposed methods are also provided. Through these models, the performance of various algorithms on the OTIS is evaluated and compared with that of their counterparts on conventional electronic interconnection systems. This study confirms the viability of the OTIS as an attractive alternative for large-scale parallel architectures. Finally, we show how the proposed methods can be used to design parallel algorithms for linear algebra on the OTIS.

20.

Approximating maximum edge 2-coloring in simple graphs via local improvement

Zhi-Zhong Chen Ruka Tanahashi 《Theoretical computer science》2009
