首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 0 毫秒
1.
The paper presents parallel algorithms for solving Poisson equation at N2 mesh points. The methods based on marching techniques are structured for efficient parallel realization. Using orthogonal decomposition properties of arising matrices, the algorithms can be formulated in terms of transformed vectors. On a MIMD computer with not more than N processors, the computations can be performed in horizontal slices with minimal synchronization requirements. Considering an SIMD machine with N2 processors, the complexity bound O(log N) has been achieved, whereby the single marching requires 10 log N steps only.  相似文献   

2.
Functionally reconfigurable general purpose parallel machines (FRPM) could be reconfigured during the operation from SIMD to MIMD mode or vice versa (first aspect) and from one interconnection network to another according to the data storing order (second aspect). General purpose machines are considered in order to obtain an arbitrary data exchange between the processing elements they are built of. A model for describing such interconnection networks is presented. A full-information exchange network in introduced which is reconfigurable in a programming way to tree-, matrix-, cube-, linear-neighbourhood and FFT-network. Some schemes for constructing SIMD/MIMD reconfigurable machines are given. The usefullness of using FRMP for image processing and pattern recognition is discussed.  相似文献   

3.
This paper presents a parallel algorithm for computing the inversion of a dense matrix based on modified Jordan's elimination which requires fewer calculation steps than the standard one. The algorithm is proposed for the implementation on the linear array with a small to moderate number of processors which operate in a parallel-pipeline fashion. A communication between neighboring processors is achieved by a common memory module implemented as a FIFO memory module. For the proposed algorithm we define a task scheduling procedure and prove that it is time optimal. In order to compute the speedup and efficiency of the system, two definitions (Amdahl's and Gustafson's) were used. For the proposed architecture, involving two to 16 processors, estimated Gustafson's (Amdahl's) speedups are in the range 1.99 to 13.76 (1.99 to 9.69).  相似文献   

4.
Hypercube interconnection networks have been receiving considerable attention in the supercomputing environment. However, the number of processors must be exactly 2r for an r-cube complete hypercube. This restriction severely limits its applicability. In this paper, we address three variant hypercube topologies with more flexibility in system sizes, the labelled hypercubes Imr, IMr, and IAr. Incomplete hypercube Imr consists of an r-cube and an m-cube complete hypercubes; Imr is composed of 2r and Σm ε M 2m nodes; IAr comes from an r-cube complete hypercube which operates in a degraded manner and allows that the missing nodes to be arbitrarily distributed. Specifically, we focus on the parallel paths routing algorithms for these three classes of incomplete hypercubes. Parallel paths between any given two nodes mean that these paths have the same source and destination nodes but with different intermediate nodes. Parallel communication is important as it will allow us to use the full bandwidth of the multiprocessors for the data transfer operation between any two nodes, and3these redundant paths can increase system fault-tolerance and communication reliability. With these parallel routing algorithms, one can use them as a criterion to design multiprocessor systems.  相似文献   

5.
An efficient distributed algorithm for evaluating an iterative function on all pairwise combinations of C objects on an SIMD hypercube is presented. The algorithm achieves uniform load distribution and minimal, completely local interprocessor communication.  相似文献   

6.
A method is proposed for converting an algorithm admitting no parallel treatment into a new algorithm, in essence, with much better parallel properties. The method is intended for tackling the so called T-algorithms, the term ensuing from first examples of such algorithms concerned in the context of Toeplitz-like matrices. Generalized T-algorithms are also considered.  相似文献   

7.
It is shown that the relaxation labelling process of Rosenfeld, Hummel and Zucker is a suboptimal minimization of a cost function measuring inconsistency and ambiguity. Two new algorithms which minimize this cost function more efficiently are introduced. Finally, some general comments on relaxation are presented.  相似文献   

8.
This paper presents several algorithms for solving problems using massively parallel SIMD hypercube and shuffle-exchange computers. The algorithms solve a wide variety of problems, but they are related because they all use a common strategy. Specifically, all of the algorithms use a divide-and-conquer approach to solve a problem withN inputs using a parallel computer withP processors. The structural properties of the problem are exploited to assure that fewer thanN data items are communicated during the division and combination steps of the divide-and-conquer algorithm. This reduction in the amount of data that must be communicated is central to the efficiency of the algorithm.This paper addresses four problems, namely the multiple-prefix, data-dependent parallel-prefix, image-component-labeling, and closest-pair problems. The algorithms presented for the data-dependent parallel-prefix and closest-pair problems are the fastest known whenN P and the algorithms for the multiple-prefix and image-component-labeling problems are the fastest known whenN is sufficiently large with respect toP.This work was supported in part by our NSF Graduate Fellowship.  相似文献   

9.
10.
数字图像处理需要大量的数据运算,要求系统具有很高的数据吞吐量。并行处理结构能较好地满足这一要求。介绍一种SIMD并行多DSP数字图像处理系统。该系统具有避免冲突、能连续处理图像数据、处理器间通信及I/O部分简单、硬件及软件模块化等优点。  相似文献   

11.
Parallel processing, neural networks and genetic algorithms   总被引:4,自引:0,他引:4  
In an earlier paper[1] some recent developments in computational technology to structural engineering were described. The developments included: parallel and distributed computing; neural networks; and genetic algorithms. In this paper, the authors concentrate on parallel implementations of neural networks and genetic algorithms. In the final section of the paper the authors show how a parallel finite element analysis may be undertaken in an efficient manner by preprocessing of the finite element model using a genetic algorithm utilizing a neural network predictor. This preprocessing is the partitioning of the finite element mesh into sub-domains to ensure load balancing and minimum interprocessor communication during the parallel finite element analysis on a MIMD distributed memory computer. © 1998 Published by Elsevier Science Limited. All rights reserved.  相似文献   

12.
In this paper, we study the merging of two sorted arrays and on EREW PRAM with two restrictions: (1) The elements of two arrays are taken from the integer range [1,n], where n=Max(n 1,n 2). (2) The elements are taken from either uniform distribution or non-uniform distribution such that , for 1≤ip (number of processors). We give a new optimal deterministic algorithm runs in time using p processors on EREW PRAM. For ; the running time of the algorithm is O(log (g) n) which is faster than the previous results, where log (g) n=log log (g−1) n for g>1 and log (1) n=log n. We also extend the domain of input data to [1,n k ], where k is a constant.
Hazem M. BahigEmail:
  相似文献   

13.
A dominating set of an undirected graph G is a set D of nodes such that every node of G either is in D or is adjacent to some node of D. It is shown that the problem of finding a minimum cardinality dominating set is NP-complete for split graphs (a subclass of chordal graphs) and bipartite graphs.  相似文献   

14.
We give the first efficient parallel algorithms for solving the arrangement problem. We give a deterministic algorithm for the CREW PRAM which runs in nearly optimal bounds ofO (logn log* n) time andn 2/logn processors. We generalize this to obtain anO (logn log* n)-time algorithm usingn d /logn processors for solving the problem ind dimensions. We also give a randomized algorithm for the EREW PRAM that constructs an arrangement ofn lines on-line, in which each insertion is done in optimalO (logn) time usingn/logn processors. Our algorithms develop new parallel data structures and new methods for traversing an arrangement.This work was supported by the National Science Foundation, under Grants CCR-8657562 and CCR-8858799, NSF/DARPA under Grant CCR-8907960, and Digital Equipment Corporation. A preliminary version of this paper appeared at the Second Annual ACM Symposium on Parallel Algorithms and Architectures [3].  相似文献   

15.
胡珂  姜麟  刘海燕 《微计算机信息》2012,(4):165-167,159
遗传算法(Genetic Algorithm)是一类借鉴生物界的进化规律演化而来的随机化搜索方法,已经成功运用在很多大规模的组合优化问题中。利用如今流行的并行计算机系统,对遗传算法进行并行化,可解决标准遗传算法的速度瓶颈问题。本文在MPI并行环境下,用C++语言实现了粗粒度模型的并行遗传算法。结合并行遗传算法的特点,提出了解决物流配送路线优化的策略以及给出相应的算法过程,并进行了有效验证。通过研究结果表明,与传统遗传算法相比,并行遗传算法提高了运算速度,降低了平均开销时间并且最小总路径值更理想。  相似文献   

16.
We present efficient parallel algorithms for several basic problems in computational geometry: convex hulls, Voronoi diagrams, detecting line segment intersections, triangulating simple polygons, minimizing a circumscribing triangle, and recursive data-structures for three-dimensional queries.The work of C. Ó'Dúnlaing and C. Yap was supported by NSF Grants DCR-84-01898 and DCR-84-01633.  相似文献   

17.
This paper generalizes the widely used Nelder and Mead (Comput J 7:308–313, 1965) simplex algorithm to parallel processors. Unlike most previous parallelization methods, which are based on parallelizing the tasks required to compute a specific objective function given a vector of parameters, our parallel simplex algorithm uses parallelization at the parameter level. Our parallel simplex algorithm assigns to each processor a separate vector of parameters corresponding to a point on a simplex. The processors then conduct the simplex search steps for an improved point, communicate the results, and a new simplex is formed. The advantage of this method is that our algorithm is generic and can be applied, without re-writing computer code, to any optimization problem which the non-parallel Nelder–Mead is applicable. The method is also easily scalable to any degree of parallelization up to the number of parameters. In a series of Monte Carlo experiments, we show that this parallel simplex method yields computational savings in some experiments up to three times the number of processors.  相似文献   

18.
We present randomized and deterministic algorithms for many-to-one packet routing on an n-node two-dimensional mesh under the store-and-forward model. We consider the general instance of many-to-one routing where each node is the source (resp., destination) of ? (resp., k) packets, for arbitrary values of ? and k. All our algorithms run in optimal time and use queues of only constant size at each node to store packets in transit. The randomized algorithms, however, are simpler to implement. Our result closes a gap in the literature, where time-optimal algorithms using constant-size queues were known only for the special cases ?=1 and ?=k.  相似文献   

19.
A number of recent studies have revealed that the Optical Transpose Interconnection Systems (or OTIS) are promising candidates for future high-performance parallel computers. In this paper, we present and evaluate two general methods for algorithm development on the OTIS. The proposed methods are general in the sense that no specific factor network or problem domain is assumed. The proposed methods allow efficient mapping of a wide class of algorithms into the OTIS. These methods are based on grids and pipelines as popular structures that support a vast body of parallel applications including linear algebra, divide-and-conquer type of algorithms, sorting, and FFT computation. Timing models for measuring the performance of the proposed methods are also provided. Through these models, the performance of various algorithms on the OTIS are evaluated and compared with their counterparts on conventional electronic interconnection systems. This study confirms the viability of the OTIS as an attractive alternative for large-scale parallel architectures. Finally, we show how the proposed methods can be used to design parallel algorithms for linear algebra on the OTIS.  相似文献   

20.
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号