首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 375 毫秒
1.
In recent years, many multistage interconnection networks using 2 × 2 switching cells have been proposed for parallel architecture. Here we state a correct and easy graph characterization of all the networks topologically equivalent to the Omega, Flip, Baseline, Reverse Baseline, Indirect Binary Cube, and Modified Data Manipulator networks.  相似文献   

2.
A parallel heuristic algorithm for traffic control problems in three-stage connecting networks is presented in this paper. A three-stage connecting network consists of an input crossbar switching stage, an intermediate crossbar switching stage, and an output crossbar switching stage. The goal of our algorithm is to quickly and efficiently find a conflict-free switching assignment for communication demands through the network. The algorithm requires n2 × m processing elements for the network composed of n input/output switches and m intermediate switches, where it runs not only on a sequential machine, but also on a parallel machine with maximally n2 × m processors. The algorithm was verified by 1100 simulation runs with the network size from 102 × 7 to 502 × 27. The simulation results show that the algorithm can find a solution in nearly constant time with n2 × m processors.  相似文献   

3.
Network cost and fixed-degree characteristic for the graph are important factors to evaluate interconnection networks. In this paper, we propose hierarchical Petersen network (HPN) that is constructed in recursive and hierarchical structure based on a Petersen graph as a basic module. The degree of HPN(n) is 5, and HPN(n) has \(10^n\) nodes and \(2.5 \times 10^n\) edges. And we analyze its basic topological properties, routing algorithm, diameter, spanning tree, broadcasting algorithm and embedding. From the analysis, we prove that the diameter and network cost of HPN(n) are \(3\log _{10}N-1\) and \(15 \log _{10}N-1\), respectively, and it contains a spanning tree with the degree of 4. In addition, we propose link-disjoint one-to-all broadcasting algorithm and show that HPN(n) can be embedded into FP\(_k\) with expansion 1, dilation 2k and congestion 4. For most of the fixed-degree networks proposed, network cost and diameter require \(O(\sqrt{N})\) and the degree of the graph requires O(N). However, HPN(n) requires O(1) for the degree and \(O(\log _{10}N)\) for both diameter and network cost. As a result, the suggested interconnection network in this paper is superior to current fixed-degree and hierarchical networks in terms of network cost, diameter and the degree of the graph.  相似文献   

4.
Discrete Fourier transform (DFT) is an important tool in digital signal processing. In the present paper, we propose a novel approach to performing DFT. We transform DFT into a form expressed in discrete moments via a modular mapping and truncating Taylor series expansion. From this, we extend the use of our systolic array for fast computation of moments without any multiplications to one that computes DFT with only a few multiplications and without any evaluations of exponential functions. The multiplication number used in our method isO(Nlog2 N/ log2log2 N) superior toO(Nlog2 N) in FFT. The execution time of the systolic array is onlyO(Nlog2 N/ log2log2 N) for 1-D DFT andO(Nk) fork-D DFT (k⩾2). The systolic implementation is a demonstration of the locality of dataflow in the algorithms and hence it implies an easy and potential hardware/VLSI realization. The approach is also applicable to DFT inverses.  相似文献   

5.
This paper describes a multi-layer maze routing accelerator which uses a two-dimensional array of processing elements (PEs) implemented in an FPGA. Routing for an L-layer N×N grid is performed by an array of N×N PEs that time-multiplex each layer over the array. This accelerates the classic Lee Algorithm from O(L×d2) in software to O(L×d). Each PE can be implemented in 32 look up tables in a Xilinx Virtex-II FPGA, which makes possible routing arrays that are large enough to support detailed routing for VLSI. Cycle measurements show a speedup of 50–75× over a 2.54 GHz Pentium 4 for a 4-layer 8×8 array and 93× for a 4-layer 16×16 array.  相似文献   

6.
Aperiodic sorteris a sorting network that is a cascade of a number of identicalblocks, where outputiof each block is inputiof the next block. Previously, Dowd, Perl, Rudolph, and Saks introduced thebalancedmerging network, withN=2kinputs/outputs and log Nstages of comparators. Using a very intricate proof, they showed that a cascade of log Nsuch blocks constitutes a sorting network. In this paper, we introduce a large class of merging networks with the same periodic property. This class contains 2N/2−1networks, whereN=2kis the number of inputs. The balanced merger is one network in this class. Other networks use fewer comparators. We provide a very simple and elegant correctness proof based on the recursive structure of the networks.  相似文献   

7.
1 Introduction and related work In recent years, peer-to-peer computing has attracted significant attention from both industry field and academic field[1-3]. The core component of many proposed peer-to- peer systems is the distributed hash table (DHT) schemes[4,5] that use a hash table-like interface to publish and look up data objects. Many proposed DHT schemes[6-15] are based on some traditional interconnection to- pology: Chord[6], Tapestry[7,8], Pastry[9] are based on hypercube topolog…  相似文献   

8.
Hierarchical hypercubes (HHC), also known as cube-connected cubes, have been introduced in the literature as an interconnection network for massively parallel systems. Effectively, they can connect a large number of nodes while retaining a small diameter and a low degree compared to a hypercube of the same size. Especially (2 m +m)-dimensional hierarchical hypercubes ( $\mathit {HHC}_{2^{m}+m}$ ), called perfect HHCs, are popular as they are symmetrical, which is a critical property when designing routing algorithms. In this paper, we describe an algorithm finding, in an $\mathit{HHC}_{2^{m}+m}$ , mutually node-disjoint paths connecting k=?(m+1)/2? pairs of distinct nodes. This problem is known as the k-pairwise disjoint-path routing problem and is one of the important routing problems when dealing with interconnection networks. In an $\mathit{HHC}_{2^{m}+m}$ , our algorithm finds paths of lengths at most 2 m+1+m(2 m+1+1)+4 in O(25m ) time, where 2 m+1 is the diameter of an $\mathit{HHC}_{2^{m}+m}$ . Also, we have shown through an experiment that, in practice, the lengths of the generated paths are significantly lower than the worst-case theoretical estimations.  相似文献   

9.
Various sorting algorithms using parallel architectures have been proposed in the search for more efficient results. This paper introduces the Multi-Sort Algorithm for Multi-Mesh of Trees (MMT) Architecture for N=n 4 elements with more efficient time complexity compared to previous architectures. The shear sort algorithm on Single Instruction Multiple Data (SIMD) mesh model requires \(4\sqrt{N}+O\sqrt{N}\) time for sorting N elements, arranged on a \(\sqrt{N}\times \sqrt{N}\) mesh, whereas Multi-Sort algorithm on the SIMD Multi-Mesh (MM) Architecture takes O(N 1/4) time for sorting the same N elements, which proves that Multi-Sort is a better sorting approach. We have improved the time complexity of intrablock Sort. The Communication time complexity for 2D Sort in MM is O(n), whereas this time in MMT is O(log?n). The time complexity of compare–exchange step in MMT is same as that in MM, i.e., O(n). It has been found that the time complexity of the Multi-Sort on MMT has been improved as on Multi-Mesh architecture.  相似文献   

10.
We give an algorithm to compute the subset partial order (called the subset graph) for a family F of sets containing k sets with N elements in total and domain size n. Our algorithm requires O(nk2/logk) time and space on a Pointer Machine. When F is dense, i.e. N=Θ(nk), the algorithm requires O(N2/log2N) time and space. We give a construction for a dense family whose subset graph is of size Θ(N2/log2N), indicating the optimality of our algorithm for dense families. The subset graph can be dynamically maintained when F undergoes set insertions and deletions in O(nk/logk) time per update (that is sub-linear in N for the case of dense families). If we assume words of b?k bits, allow bits to be packed in words, and use bitwise operations, the above running time and space requirements can be reduced by a factor of blog(k/b+1)/logk and b2log(k/b+1)/logk respectively.  相似文献   

11.
We show how column sort and rotate sort can be implemented on the different reconfigurable mesh with buses (RMB) architectures that have been proposed in the literature. On all of these proposed RMB architectures, we are able to sort n numbers on an n × n configuration in O(1) time. For the PARBUS RMB architecture, our column sort and rotate sort implementations are simpler than the O(1) sorting algorithms developed in Jang and Prasanna (International Parallel Processing Symposium, 1992) and Lin et al. (Proceedings of Ninth European Workshop on Parallel Computing, 1992). Furthermore, our sorting algorithms use fewer bus broadcasts. For the RMESH RMB architecture, our algorithms are the first to sort n numbers on an n × n configuration in O(1) time. We also observe that rotate sort can be implemented on N × N × ··· × N (k + 1)-dimensional RMB architectures to sort Nk elements in O(1) time.  相似文献   

12.
We discuss the problem of scheduling af set of independent tasks T, each ti ϵ T of lenght ℓi ϵ Z+, on m identical processors. We allow preemption but assume a communication delay of time k ϵ N. Whenever a task is preempted from one processor to another, there must be a delay of at least k time units. We show that if k = 1, an optimal schedule can be found in polynomial time but if k ⩾ 2, the corresponding decision problem is NP-complete.  相似文献   

13.
To maintain high reliability and availability, system-level diagnosis should be considered for the multiprocessor systems. The self-diagnosis problem of hypermesh, emerging potential optical interconnection networks for multiprocessor systems, is solved in this paper. We derive that the precise one-step diagnosability of kn-hypermesh is n(k − 1). Based on the principle of cycle decomposition, a one-step t-fault diagnosis algorithm for kn-hypermesh which runs in O(knn(k − 1)) time also is described.  相似文献   

14.
互连网络拓扑等价的图分析法   总被引:9,自引:1,他引:8  
提出了描述互连网络拓扑等价的图分析法。获得了全交叉网络与基准,逆基准,Omega,flip,S=F=2SW榕树,简化数据变换等多级互连网络拓扑等价的逻辑名结构。阐明了用光学全交叉网络模拟实现上述网络的互连函数的原理及其多处理机,电信交换等领域的潜在应用。  相似文献   

15.
A new topology for interconnection networks has been proposed. The underlying network graph hasN= 4nnodes (n≥ 2) and isalmostregular with maximum degree 5 and diameter ≤ ⌊3/4 log2N⌋ + 1. Algorithms for point-to-point routing and single node broadcast have also been developed. It has also been shown that various algorithms for real life applications, e.g., matrix transpose, matrix multiplication, finding the sum/average/maximum/minimum of a set of data elements and ASCEND/DESCEND types of algorithms can be efficiently implemented on this topology. Finally, the underlying idea of constructing this network has been generalized to define a family ofalmost regularodd degree graphs of maximum degree 2j+ 1, (j> 2) withN= (2j)nnodes and diameter ⌊3/4 logjN⌋ + 1.  相似文献   

16.
In a large-scale parallel computing system, partitionability is an important feature that allows a large system to be partitioned into smaller subsystems so that multiple tasks can be run independently on different subsystems. In such a system, a task run on a subsystem may need to be migrated to another subsystem if the subsystem is faulty, if load balance is needed, or if subsystem restructuring is desired. This paper studies the partitionability of k-extra-stage Omega networks and presents an optimal task migration algorithm for k-extra-stage Omega networks.  相似文献   

17.
We study the generalized conjugate gradient scheme, based on the k-line and k × k block Jacobi splittings A = M ? N, for solving two-dimensional parabolic and elliptic difference equations AU = F?. Here A represents the difference operator chα ? h2Δh. Computational experiments suggest that the eigenvalues of K: = I ? M?1N cluster, and that the cluster radii decrease as k or chα increases. As is well known, clustering improves convergence of the conjugate gradient iterates. We discuss computational results for k = 4, 8, 16, 32 and for chα = 0, h, 2. Moreover, we establish the number and size of eigenvalue clusters for the model problem.  相似文献   

18.
Recursive cube of rings (RCR) networks [Y. Sun, P. Cheung, X. Lin, Recursive cube of rings: a new topology for interconnection networks, IEEE Trans. Parallel Dist. Syst. 11 (3) (2000) 275-286] can be widely used in the design and implementation of parallel processing architectures. In this paper, we investigate the routing of a message on RCR networks, that is a key to the performance of this network. We would like to transmit k+1 packets from a source node to a destination node simultaneously along paths on RCR networks, the ith packet will traverse along the ith path (1?i?k+1). In order for all packets to arrive at a destination node quickly and securely, the ith path must be node-disjoint from all other paths. For construction of these paths, employing Hamiltonian circuit Latin square (HCLS), we present O(n2) parallel routing algorithm on RCR networks.  相似文献   

19.
We present an efficient parameterized algorithm for the (k,t)-set packing problem, in which we are looking for a collection of k disjoint sets whose union consists of t elements. The complexity of the algorithm is O(2O(t)nNlogN). For the special case of sets of bounded size, this improves the O(k(ck)n) algorithm of Jia et al. [J. Algorithms 50 (1) (2004) 106].  相似文献   

20.
The ambiguity of a nondeterministic finite automaton (NFA) N for input size n is the maximal number of accepting computations of N for inputs of size n. For every natural number k we construct a family \((L_{r}^{k}\;|\;r\in \mathbb{N})\) of languages which can be recognized by NFA’s with size k?poly(r) and ambiguity O(n k ), but \(L_{r}^{k}\) has only NFA’s with size exponential in r, if ambiguity o(n k ) is required. In particular, a hierarchy for polynomial ambiguity is obtained, solving a long standing open problem (Ravikumar and Ibarra, SIAM J. Comput. 19:1263–1282, 1989, Leung, SIAM J. Comput. 27:1073–1082, 1998).  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号