首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
We give an optimal algorithm that broadcasts on an n-dimensional hypercube in O(n/ log2 (n+1)) routing steps with wormhole, e-cube routing and all-port communication. Previously, the best algorithm of P.K. McKinley and C. Trefftz (1993) requires [n/2] routing steps. We also give routing algorithms that achieve tight time bounds for n ⩽7  相似文献   

2.
We discuss the problem of packing hypercubes into an n-dimensional star graph S(n), which consists of embedding a disjoint union of hypercubes U into S(n) with load one. Hypercubes in U have from [n/2] to (n+1)·[log2 n]-2([lod2n]+1)+2 dimensions, i.e., they can be as large as any hypercube which can be embedded with dilation at most four into S(n). We show that U can be embedded into S(n) with optimal expansion, which contrasts with the growing expansion ratios of previously known techniques. We employ several performance metrics to show that, with our techniques, a star graph can efficiently execute heterogeneous workloads containing hypercube, mesh, and star graph algorithms. The characterization of our packings includes some important metrics which have not been addressed by previous research (namely, average dilation, average congestion, and congestion). Our packings consistently produce small average congestion and average dilation, which indicates that the induced communication slowdown is also small. We consider several combinations of node mapping functions and routing algorithms in S(n), and obtain their corresponding performance metrics using either mathematical analysis or computer simulation  相似文献   

3.
Based on the V.E. Mendia and D. Sarkar's algorithm (1992), we propose an optimal and nonredundant distributed broadcasting algorithm in star graphs. For an n-dimensional star graph, our algorithm takes O(n log2 n) time and guarantees that all nodes in the star graph receive the message exactly once. Moreover, broadcasting m packets in a pipeline fashion takes O(m log2 n+n log2 n) time due to the nonredundant property of our broadcasting algorithm  相似文献   

4.
This paper presents a family of algorithms for producing, from (υ0, υ1, ..., υn-1), all initial prefixes xi0&thetas;υ1 &thetas;···&thetas;υi (i=0, 1, ..., n-1) in parallel in interconnection networks such as the omega network and the hypercube, where &thetas; is an associative binary operator. Each algorithm can be embedded in the switches and interconnections of the network, and can be executed in O((log2 r+1) logr n) time steps provided that the network connecting n processors is constructed by using an r×r switch, and that parallelism within as well as among individual switches is exploited. The objective of these algorithms is to attain a communication pattern that fits the topology of the network. One type of network can be made equivalent to, or can be embedded in, another type of network, so a family of algorithms can be derived from one basic algorithm. In the basic algorithm, every processor pi upward multicasts υi to processors pk (k=i+1, i+2, ..., n - 1). En route to pi, υj (j=0, 1, ..., i - 1) are combined in the switches to produce the (i - 1)th initial prefix xi-1 that is received by pi, which can then compute the ith initial prefix xi=xi-1&thetas;υi  相似文献   

5.
A new broadcasting method is presented for hypercubes with wormhole routing mechanism. The communication model assumed allows an n-dimensional hypercube to have at most n concurrent 110 communications along its ports. It further assumes a distance insensitivity of (n+1) with no intermediate reception capability for the nodes along the communication path. The approach is based on determination of the set of nodes (called stations) in the hypercube such that for any node in the network there is a station at distance of at most 1. Once stations are identified, parallel disjoint paths are formed from the source to all stations. The broadcasting is accomplished first by sending the message to all stations which will in turn inform the rest of the nodes of the message. To establish node-disjoint paths between the source node and all stations, we introduce a new routing strategy. We prove that multicasting can be done in one routing step as long as the number of destination nodes are at most n in an n-dimensional hypercube. The number of broadcasting steps using our routing is equal to or smaller than that obtained in an earlier work; this number is optimal for all hypercube dimensions n⩽12, except for n=10  相似文献   

6.
Extended Fibonacci Cubes   总被引:1,自引:0,他引:1  
The Fibonacci Cube is an interconnection network that possesses many desirable properties that are important in network design and application. The Fibonacci Cube can efficiently emulate many hypercube algorithms and uses fewer links than the comparable hypercube, while its size does not increase as fast as the hypercube. However, most Fibonacci Cubes (more than 2/3 of all) are not Hamiltonian. In this paper, we propose an Extended Fibonacci Cube (EFC1) with an even number of nodes. It is defined based on the same sequence F(i)=F(i-1)+F(i-2) as the regular Fibonacci sequence; however, its initial conditions are different. We show that the Extended Fibonacci Cube includes the Fibonacci Cube as a subgraph and maintains its sparsity property. In addition, it is Hamiltonian and is better in emulating other topologies. Specifically, the Extended Fibonacci Cube can embed binary trees more efficiently than the regular Fibonacci Cube and is almost as efficient as the hypercube, even though the Extended Fibonacci Cube is a much sparser network than the hypercube. We also propose a series of Extended Fibonacci Cubes with even number of nodes. Any Extended Fibonacci Cube (EFCk, with k⩾) in the series contains the node set of any other cube that precedes EFCk in the series. We show that any Extended Fibonacci Cube maintains virtually all the desirable properties of the Fibonacci Cube. The EFCks can be considered as flexible versions of incomplete hypercubes, which eliminates their restriction on the number of nodes, and, thus, makes it possible to construct parallel machines with arbitrary sizes  相似文献   

7.
We consider the problem of embedding and reconfiguring binary tree structures in faulty hypercubes. We assume that the number of faulty nodes is at most (n-2), where n is the number of dimensions of the hypercube; we further assume that the location of faulty nodes are known. Our embedding techniques are based on a key concept called free dimension, which can be used to partition a cube into subcubes such that each subcube contains at most one faulty node. Using this approach, two distributed schemes are provided for embedding and reconfiguration in faulty hypercubes. We extend the free dimension concept to degree of occupancy and use this to develop a distributed scheme for reconfiguration of binary tree in faulty hypercubes with up to [3n/2] node faults  相似文献   

8.
Multicast communication, in which the same message is delivered from a source node to an arbitrary number of destination nodes, is being increasingly demanded in parallel computing. System supported multicast services can potentially offer improved performance, increased functionality, and simplified programming, and may in turn be used to support various higher-level operations for data movement and global process control. This paper presents efficient algorithms to implement multicast communication in wormhole-routed direct networks, in the absence of hardware multicast support, by exploiting the properties of the switching technology. Minimum-time multicast algorithms are presented for n-dimensional meshes and hypercubes that use deterministic, dimension-ordered routing of unicast messages. Both algorithms can deliver a multicast message to m-1 destinations in [log 2 m] message passing steps, while avoiding contention among the constituent unicast messages. Performance results of implementations on a 64-node nCUBE-2 hypercube and a 168-node Symult 2010 2-D mesh are given  相似文献   

9.
We introduce and analyze a new interconnection topology, called the k-dimensional folded Petersen (FPk) network, which is constructed by iteratively applying the Cartesian product operation on the well-known Petersen graph. Since the number of nodes in FPk is restricted to a power of ten, for better scalability we propose a generalization, the folded Petersen cube network FPQn,k =Qn×FPk, which is a product of the n-dimensional binary hypercube (Qn) and FPk. The FPQn,k topology provides regularity, node- and edge-symmetry, optimal connectivity (and therefore maximal fault-tolerance), logarithmic diameter, modularity, and permits simple self-routing and broadcasting algorithms. With the same node-degree and connectivity, FPQ n,k has smaller diameter and accommodates more nodes than Q n+3k, and its packing density is higher compared to several other product networks. This paper also emphasizes the versatility of the folded Petersen cube networks as a multicomputer interconnection topology by providing embeddings of many computationally important structures such as rings, multi-dimensional meshes, hypercubes, complete binary trees, tree machines, meshes of trees, and pyramids. The dilation and edge-congestion of all such embeddings are at most two  相似文献   

10.
The star graph interconnection network has been recognized as an attractive alternative to the hypercube network. Previously, the star graph has been shown to contain a Hamiltonian cycle. In this paper, we consider an injured star graph with some faulty links and nodes. We show that even with fe⩽n-3 faulty links, a Hamiltonian cycle still can be found in an n-star, and that with fv⩽n-3 faulty nodes, a ring containing at most 4fv nodes less than that in a Hamiltonian cycle can be found (i.e. the ring contains at least n!-4fv nodes). In general, in an n-star with fe faulty links and fv faulty nodes, where fe+fv⩽n-3, our embedding is able to establish a ring containing at least n!-4fv nodes  相似文献   

11.
文中将具有2n个顶点的Mobius立方体的拓扑结构加以改变,得到了包含任意个顶点的互连网络——超级Mobius立方体,并证明它保持了Mobius立方体的高连通度、对数级的直径和顶点度数等优良性质,并且当顶点个数N=2n+2n-1时,0-型超级Mobius立方体是一个(n+1)-正则图;更进一步地,由于它包含任意个顶点,所以其升级只需增加任意个顶点,从而克服了Mobius立方体的升级必须成倍增加其顶点个数的缺点.  相似文献   

12.
We describe a technique for the efficient processing of large, multidimensional arrays on a MIMD hypercube. The technique allows the hypercube to be used as a processor mesh whose relative dimension sizes may be changeddynamically, while always keeping adjacent array elements on the same node or on physically adjacent nodes. The technique is based on a mapping scheme, calledpermuted Gray code mapping, which is a generalization of the binary reflected Gray code mapping. We also extend the technique to allowinterleavingof the array data over the nodes of the hypercube. This technique can be used to efficiently parallelize scan-line algorithms, including operations such as volume rotation and volume rendering.  相似文献   

13.
A multiway merge sorting network is presented, which generalizes the technique used in the odd-even merge sorting network. The merging network described here is composed of m k-way mergers and a combining network. It arranges k ordered lists of length n each into one ordered lists in T(k)+[log2k] [log2m] [log2m] steps, where T(k) is the number of steps needed to sort k keys in order; and k and m are any integers no longer restricted to 2  相似文献   

14.
The problem of optimally distributing a divisible load to the nodes of an arbitrary processor tree is tackled in this paper. The rigorous mathematical foundation presented allows the derivation of the sequence of operations that is necessary to obtain the minimum processing time, along with closed-form expressions that yield the solution in time O(NP), where P is the number of tree nodes and N their maximum degree. The main contributions of this work are: (1) both load distribution and result collection overheads are considered, thus providing better resource utilization, and (2) arbitrary processor trees are examined in contrast with previous approaches that examined either complete homogeneous trees, or single level trees. Additionally, approximate algorithms for solving the problem of specifying the optimum subset of active processors for a given load, are presented and evaluated  相似文献   

15.
Given a sequence of n elements, the All Nearest Smaller Values (ANSV) problem is to find, for each element in the sequence, the nearest element to the left (right) that is smaller, or to report that no such element exists. Time and work optimal algorithms for this problem are known on all the PRAM models but the running time of the best previous hypercube algorithm is optimal only when the number of processors p satisfies 1⩽p⩽n/((lg3 n)(lg lg n)2). In this paper, we prove that any normal hypercube algorithm requires Ω(M) processors to solve the ANSV problem in O(lg n) time, and we present the first normal hypercube ANSV algorithm that is optimal for all values of n and p. We use our ANSV algorithm to give the first O(lg n)-time n-processor normal hypercube algorithms for triangulating a monotone polygon and for constructing a Cartesian tree  相似文献   

16.
The exchanged hypercube   总被引:2,自引:0,他引:2  
This paper presents the exchanged hypercube, a new interconnection network obtained by systematically removing links from a binary hypercube. It maintains several desirable properties of the binary hypercube yet with reduced interconnection complexity. We also introduce the extended binomial tree, a spanning tree of the exchanged hypercube that preserves many desirable properties of the original binomial tree. A fault-tolerant routing strategy is also proposed for the exchanged hypercube.  相似文献   

17.
Unicast in hypercubes with large number of faulty nodes   总被引:1,自引:0,他引:1  
Unicast in computer/communication networks is a one-to-one communication between a source node s and a destination node t. We propose three algorithms which find a nonfaulty routing path between s and t for unicast in the hypercube with a large number of faulty nodes. Given the n-dimensional hypercube Hn and a set F of faulty nodes, node uϵ Hn is called k-safe if u has at least k nonfaulty neighbors. The Hn is called k-safe if every node of Hn is k-safe. It has been known that for 0⩽k⩽n/2, a k-safe Hn is connected if |F|⩽2k(n-k)-1. Our first algorithm finds a nonfaulty path of length at most d(s,t)+4 in O(n) time for unicast between 1-safe s and t in the Hn with |F|⩽2n-3, where d(s,t) is the distance between s and t. The second algorithm finds a nonfaulty path of length at most d(s,t)+6 in O(n) time for unicast in the 2-safe Hn with |F|⩽4n-9. The third algorithm finds a nonfaulty path of length at most d(s,t)+O(k2) in time O(|F|+n) for unicast in the k-safe Hn with |F|⩽2k(n-k)-1 (0⩽k⩽n/2). The time complexities of the algorithms are optimal. We show that in the worst case, the length of the nonfaulty path between s and t in a k-safe Hn with |F|⩽2k(n-k)-1 is at least d(s,t)+2(k+1) for 0⩽k⩽n/2. This implies that the path lengths found by the algorithms for unicast in the 1-safe and 2-safe hypercubes are optimal  相似文献   

18.
This paper addresses the one-to-all broadcasting problem and the one-to-many broadcasting problem, usually simply called broadcasting and multicasting, respectively. Broadcasting is the information dissemination problem in which a node of a network sends the same piece of information to all the other nodes. Multicasting is a partial broadcasting in the sense that only a subset of nodes forms the destination set. Both operations have many applications in parallel and distributed computing. In this paper, we study these problems in both line model, and cut-through model. The former assumes long distance calls between nonneighboring processors. The latter strengthens the line model by taking into account the use of a routing function. Long distance calls are possible in circuit-switched and wormhole-routed networks, and also in many networks supporting optical facilities. In the line model, it is well known that one can compute in polynomial time a [log2n]-round broadcast or multicast protocol for any arbitrary network. Unfortunately such a protocol is often inefficient from a practical point of view because it does not use the resources of the network in a balanced way. In this paper, we present a new algorithm to compute broadcast or multicast protocols. This algorithm applies under both line and cut-through models. Moreover, it returns protocols that efficiently use the bandwidth of the network. From a complexity point of view, we also show that most of the optimization problems relative to the maximization of the efficiency of broadcast or multicast protocols in terms of switching time or vertex load are NP-complete. We have, however, derived polynomial efficient solutions for tree-networks  相似文献   

19.
20.
In this paper, we study parallel branch and bound on fine grained hypercube multiprocessors. Each processor in a fine grained system has only a very small amount of memory available. Therefore, current parallel branch and bound methods for coarse grained systems ( 1000 nodes) cannot be applied, since all these methods assume that every processor stores the path from the node it is currently processing back to the node where the process was created (the back-up path). Furthermore, the much larger number of processors available in a fine grained system makes it imperative that global information (e.g. the current best solution) is continuously available at every processor; otherwise the amount of unnecessary search would become intolerable. We describe an efficient branch-and-bound algorithm for fine grained hypercube multiprocessors. Our method uses a global scheme where all processors collectively store all back-up paths such that each processor needs to store only a constant amount of information. At each iteration of the algorithm, all current nodes may decide whether they need to create new children, be pruned, or remain unchanged. We describe an algorithm that, based on these decisions, updates the current back-up paths and distributes global information in O(log m) steps, where m is the current number of nodes. This method also includes dynamic allocation of search processes to processors and provides optimal load balancing. Even if very drastic changes in the set of current nodes occur, our load balancing mechanism does not suffer any slow down.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号