期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Optimal broadcast in all-port wormhole-routed hypercubes

Ching-Tien Ho Ming-Yang Kao 《Parallel and Distributed Systems, IEEE Transactions on》1995,6(2):200-204

We give an optimal algorithm that broadcasts on an n-dimensional hypercube in O(n/ log₂ (n+1)) routing steps with wormhole, e-cube routing and all-port communication. Previously, the best algorithm of P.K. McKinley and C. Trefftz (1993) requires [n/2] routing steps. We also give routing algorithms that achieve tight time bounds for n ⩽7 相似文献

2.

Low expansion packings and embeddings of hypercubes into stargraphs: a performance-oriented approach

Moraes de Azevedo M. Bagherzadeh N. Latifi S. 《Parallel and Distributed Systems, IEEE Transactions on》1998,9(3):261-274

We discuss the problem of packing hypercubes into an n-dimensional star graph S(n), which consists of embedding a disjoint union of hypercubes U into S(n) with load one. Hypercubes in U have from [n/2] to (n+1)·[log₂ n]-2([lod₂n]+1)+2 dimensions, i.e., they can be as large as any hypercube which can be embedded with dilation at most four into S(n). We show that U can be embedded into S(n) with optimal expansion, which contrasts with the growing expansion ratios of previously known techniques. We employ several performance metrics to show that, with our techniques, a star graph can efficiently execute heterogeneous workloads containing hypercube, mesh, and star graph algorithms. The characterization of our packings includes some important metrics which have not been addressed by previous research (namely, average dilation, average congestion, and congestion). Our packings consistently produce small average congestion and average dilation, which indicates that the induced communication slowdown is also small. We consider several combinations of node mapping functions and routing algorithms in S(n), and obtain their corresponding performance metrics using either mathematical analysis or computer simulation 相似文献

3.

An optimal broadcasting algorithm without message redundancy instar graphs

Jang-Ping Sheu Chao-Tsung Wu Tzung-Shi Chen 《Parallel and Distributed Systems, IEEE Transactions on》1995,6(6):653-658

Based on the V.E. Mendia and D. Sarkar's algorithm (1992), we propose an optimal and nonredundant distributed broadcasting algorithm in star graphs. For an n-dimensional star graph, our algorithm takes O(n log₂ n) time and guarantees that all nodes in the star graph receive the message exactly once. Moreover, broadcasting m packets in a pipeline fashion takes O(m log₂ n+n log₂ n) time due to the nonredundant property of our broadcasting algorithm 相似文献

4.

A family of parallel prefix algorithms embedded in networks

Takesue M. 《Parallel and Distributed Systems, IEEE Transactions on》1993,4(10):1179-1184

This paper presents a family of algorithms for producing, from (υ₀, υ₁, ..., υ_n-1), all initial prefixes x_i=υ₀&thetas;υ₁&thetas;···&thetas;υ_i (i=0, 1, ..., n-1) in parallel in interconnection networks such as the omega network and the hypercube, where &thetas; is an associative binary operator. Each algorithm can be embedded in the switches and interconnections of the network, and can be executed in O((log₂ r+1) log_r n) time steps provided that the network connecting n processors is constructed by using an r×r switch, and that parallelism within as well as among individual switches is exploited. The objective of these algorithms is to attain a communication pattern that fits the topology of the network. One type of network can be made equivalent to, or can be embedded in, another type of network, so a family of algorithms can be derived from one basic algorithm. In the basic algorithm, every processor p_i upward multicasts υ_i to processors p_k (k=i+1, i+2, ..., n - 1). En route to p_i, υ_j (j=0, 1, ..., i - 1) are combined in the switches to produce the (i - 1)th initial prefix x_i-1 that is received by p_i, which can then compute the ith initial prefix x_i=x_i-1&thetas;υ_i 相似文献

5.

Near-optimal broadcast in all-port wormhole-routed hypercubes usingerror-correcting codes

Ko H. Latifi S. Srimani P.K. 《Parallel and Distributed Systems, IEEE Transactions on》2000,11(3):247-260

A new broadcasting method is presented for hypercubes with wormhole routing mechanism. The communication model assumed allows an n-dimensional hypercube to have at most n concurrent 110 communications along its ports. It further assumes a distance insensitivity of (n+1) with no intermediate reception capability for the nodes along the communication path. The approach is based on determination of the set of nodes (called stations) in the hypercube such that for any node in the network there is a station at distance of at most 1. Once stations are identified, parallel disjoint paths are formed from the source to all stations. The broadcasting is accomplished first by sending the message to all stations which will in turn inform the rest of the nodes of the message. To establish node-disjoint paths between the source node and all stations, we introduce a new routing strategy. We prove that multicasting can be done in one routing step as long as the number of destination nodes are at most n in an n-dimensional hypercube. The number of broadcasting steps using our routing is equal to or smaller than that obtained in an earlier work; this number is optimal for all hypercube dimensions n⩽12, except for n=10 相似文献

6.

Extended Fibonacci Cubes 总被引：1，自引：0，他引：1

Jie Wu 《Parallel and Distributed Systems, IEEE Transactions on》1997,8(12):1203-1210

The Fibonacci Cube is an interconnection network that possesses many desirable properties that are important in network design and application. The Fibonacci Cube can efficiently emulate many hypercube algorithms and uses fewer links than the comparable hypercube, while its size does not increase as fast as the hypercube. However, most Fibonacci Cubes (more than 2/3 of all) are not Hamiltonian. In this paper, we propose an Extended Fibonacci Cube (EFC₁) with an even number of nodes. It is defined based on the same sequence F(i)=F(i-1)+F(i-2) as the regular Fibonacci sequence; however, its initial conditions are different. We show that the Extended Fibonacci Cube includes the Fibonacci Cube as a subgraph and maintains its sparsity property. In addition, it is Hamiltonian and is better in emulating other topologies. Specifically, the Extended Fibonacci Cube can embed binary trees more efficiently than the regular Fibonacci Cube and is almost as efficient as the hypercube, even though the Extended Fibonacci Cube is a much sparser network than the hypercube. We also propose a series of Extended Fibonacci Cubes with even number of nodes. Any Extended Fibonacci Cube (EFC_k, with k⩾) in the series contains the node set of any other cube that precedes EFC_k in the series. We show that any Extended Fibonacci Cube maintains virtually all the desirable properties of the Fibonacci Cube. The EFC_ks can be considered as flexible versions of incomplete hypercubes, which eliminates their restriction on the number of nodes, and, thus, makes it possible to construct parallel machines with arbitrary sizes 相似文献

7.

Embedding and reconfiguration of binary trees in faulty hypercubes

Pei-Ji Yang Raghavendra C.S. 《Parallel and Distributed Systems, IEEE Transactions on》1996,7(3):237-245

We consider the problem of embedding and reconfiguring binary tree structures in faulty hypercubes. We assume that the number of faulty nodes is at most (n-2), where n is the number of dimensions of the hypercube; we further assume that the location of faulty nodes are known. Our embedding techniques are based on a key concept called free dimension, which can be used to partition a cube into subcubes such that each subcube contains at most one faulty node. Using this approach, two distributed schemes are provided for embedding and reconfiguration in faulty hypercubes. We extend the free dimension concept to degree of occupancy and use this to develop a distributed scheme for reconfiguration of binary tree in faulty hypercubes with up to [3n/2] node faults 相似文献

8.

Unicast-based multicast communication in wormhole-routed networks

McKinley P.K. Xu H. Esfahanian A.-H. Ni L.M. 《Parallel and Distributed Systems, IEEE Transactions on》1994,5(12):1252-1265

Multicast communication, in which the same message is delivered from a source node to an arbitrary number of destination nodes, is being increasingly demanded in parallel computing. System supported multicast services can potentially offer improved performance, increased functionality, and simplified programming, and may in turn be used to support various higher-level operations for data movement and global process control. This paper presents efficient algorithms to implement multicast communication in wormhole-routed direct networks, in the absence of hardware multicast support, by exploiting the properties of the switching technology. Minimum-time multicast algorithms are presented for n-dimensional meshes and hypercubes that use deterministic, dimension-ordered routing of unicast messages. Both algorithms can deliver a multicast message to m-1 destinations in [log ₂ m] message passing steps, while avoiding contention among the constituent unicast messages. Performance results of implementations on a 64-node nCUBE-2 hypercube and a 168-node Symult 2010 2-D mesh are given 相似文献

9.

Folded Petersen cube networks: new competitors for the hypercubes

《Parallel and Distributed Systems, IEEE Transactions on》1996,7(2):151-168

We introduce and analyze a new interconnection topology, called the k-dimensional folded Petersen (FP_k) network, which is constructed by iteratively applying the Cartesian product operation on the well-known Petersen graph. Since the number of nodes in FP_k is restricted to a power of ten, for better scalability we propose a generalization, the folded Petersen cube network FPQ_n,k=Q_n×FP_k, which is a product of the n-dimensional binary hypercube (Q_n) and FP_k. The FPQ_n,k topology provides regularity, node- and edge-symmetry, optimal connectivity (and therefore maximal fault-tolerance), logarithmic diameter, modularity, and permits simple self-routing and broadcasting algorithms. With the same node-degree and connectivity, FPQ _n,k has smaller diameter and accommodates more nodes than Q _n+3k, and its packing density is higher compared to several other product networks. This paper also emphasizes the versatility of the folded Petersen cube networks as a multicomputer interconnection topology by providing embeddings of many computationally important structures such as rings, multi-dimensional meshes, hypercubes, complete binary trees, tree machines, meshes of trees, and pyramids. The dilation and edge-congestion of all such embeddings are at most two 相似文献

10.

Fault-tolerant ring embedding in a star graph with both link andnode failures

Yu-Chee Tseng Shu-Hui Chang Jang-Ping Sheu 《Parallel and Distributed Systems, IEEE Transactions on》1997,8(12):1185-1195

The star graph interconnection network has been recognized as an attractive alternative to the hypercube network. Previously, the star graph has been shown to contain a Hamiltonian cycle. In this paper, we consider an injured star graph with some faulty links and nodes. We show that even with f_e⩽n-3 faulty links, a Hamiltonian cycle still can be found in an n-star, and that with f_v⩽n-3 faulty nodes, a ring containing at most 4f_v nodes less than that in a Hamiltonian cycle can be found (i.e. the ring contains at least n!-4f_v nodes). In general, in an n-star with f_e faulty links and f_v faulty nodes, where f_e+f_v⩽n-3, our embedding is able to establish a ring containing at least n!-4f_v nodes 相似文献

11.

超级Mbius立方体──一类最优容错的小直径互连网络

樊建席管殿柱《计算机研究与发展》1999,(3)

文中将具有２ｎ个顶点的Ｍｏｂｉｕｓ立方体的拓扑结构加以改变,得到了包含任意个顶点的互连网络——超级Ｍｏｂｉｕｓ立方体,并证明它保持了Ｍｏｂｉｕｓ立方体的高连通度、对数级的直径和顶点度数等优良性质,并且当顶点个数Ｎ＝２ｎ＋２ｎ－１时,０－型超级Ｍｏｂｉｕｓ立方体是一个（ｎ＋１）－正则图;更进一步地,由于它包含任意个顶点,所以其升级只需增加任意个顶点,从而克服了Ｍｏｂｉｕｓ立方体的升级必须成倍增加其顶点个数的缺点．相似文献

12.

The Hypercube as a Dynamically Reconfigurable Processor Mesh

《Journal of Parallel and Distributed Computing》1998,48(1):130-142

We describe a technique for the efficient processing of large, multidimensional arrays on a MIMD hypercube. The technique allows the hypercube to be used as a processor mesh whose relative dimension sizes may be changeddynamically, while always keeping adjacent array elements on the same node or on physically adjacent nodes. The technique is based on a mapping scheme, calledpermuted Gray code mapping, which is a generalization of the binary reflected Gray code mapping. We also extend the technique to allowinterleavingof the array data over the nodes of the hypercube. This technique can be used to efficiently parallelize scan-line algorithms, including operations such as volume rotation and volume rendering. 相似文献

13.

A multiway merge sorting network

De-Lei Lee Batcher K.E. 《Parallel and Distributed Systems, IEEE Transactions on》1995,6(2):211-215

A multiway merge sorting network is presented, which generalizes the technique used in the odd-even merge sorting network. The merging network described here is composed of m k-way mergers and a combining network. It arranges k ordered lists of length n each into one ordered lists in T(k)+[log₂k] [log₂m] [log₂m] steps, where T(k) is the number of steps needed to sort k keys in order; and k and m are any integers no longer restricted to 2 相似文献

14.

Collection-aware optimum sequencing of operations and closed-formsolutions for the distribution of a divisible load on arbitraryprocessor trees

Barlas G.D. 《Parallel and Distributed Systems, IEEE Transactions on》1998,9(5):429-441

The problem of optimally distributing a divisible load to the nodes of an arbitrary processor tree is tackled in this paper. The rigorous mathematical foundation presented allows the derivation of the sequence of operations that is necessary to obtain the minimum processing time, along with closed-form expressions that yield the solution in time O(NP), where P is the number of tree nodes and N their maximum degree. The main contributions of this work are: (1) both load distribution and result collection overheads are considered, thus providing better resource utilization, and (2) arbitrary processor trees are examined in contrast with previous approaches that examined either complete homogeneous trees, or single level trees. Additionally, approximate algorithms for solving the problem of specifying the optimum subset of active processors for a given load, are presented and evaluated 相似文献

15.

All nearest smaller values on the hypercube

Kravets D. Plaxton C.G. 《Parallel and Distributed Systems, IEEE Transactions on》1996,7(5):456-462

Given a sequence of n elements, the All Nearest Smaller Values (ANSV) problem is to find, for each element in the sequence, the nearest element to the left (right) that is smaller, or to report that no such element exists. Time and work optimal algorithms for this problem are known on all the PRAM models but the running time of the best previous hypercube algorithm is optimal only when the number of processors p satisfies 1⩽p⩽n/((lg³ n)(lg lg n)²). In this paper, we prove that any normal hypercube algorithm requires Ω(M) processors to solve the ANSV problem in O(lg n) time, and we present the first normal hypercube ANSV algorithm that is optimal for all values of n and p. We use our ANSV algorithm to give the first O(lg n)-time n-processor normal hypercube algorithms for triangulating a monotone polygon and for constructing a Cartesian tree 相似文献

16.

The exchanged hypercube 总被引：2，自引：0，他引：2

Loh P.K.K. Hsu W.J. Pan Y. 《Parallel and Distributed Systems, IEEE Transactions on》2005,16(9):866-874

This paper presents the exchanged hypercube, a new interconnection network obtained by systematically removing links from a binary hypercube. It maintains several desirable properties of the binary hypercube yet with reduced interconnection complexity. We also introduce the extended binomial tree, a spanning tree of the exchanged hypercube that preserves many desirable properties of the original binomial tree. A fault-tolerant routing strategy is also proposed for the exchanged hypercube. 相似文献

17.

Unicast in hypercubes with large number of faulty nodes 总被引：1，自引：0，他引：1

Qian-Ping Gu Shietung Peng 《Parallel and Distributed Systems, IEEE Transactions on》1999,10(10):964-975

Unicast in computer/communication networks is a one-to-one communication between a source node s and a destination node t. We propose three algorithms which find a nonfaulty routing path between s and t for unicast in the hypercube with a large number of faulty nodes. Given the n-dimensional hypercube H_n and a set F of faulty nodes, node uϵ H_n is called k-safe if u has at least k nonfaulty neighbors. The H_n is called k-safe if every node of H_n is k-safe. It has been known that for 0⩽k⩽n/2, a k-safe H_n is connected if |F|⩽2^k(n-k)-1. Our first algorithm finds a nonfaulty path of length at most d(s,t)+4 in O(n) time for unicast between 1-safe s and t in the H_n with |F|⩽2n-3, where d(s,t) is the distance between s and t. The second algorithm finds a nonfaulty path of length at most d(s,t)+6 in O(n) time for unicast in the 2-safe H_n with |F|⩽4n-9. The third algorithm finds a nonfaulty path of length at most d(s,t)+O(k²) in time O(|F|+n) for unicast in the k-safe H_n with |F|⩽2^k(n-k)-1 (0⩽k⩽n/2). The time complexities of the algorithms are optimal. We show that in the worst case, the length of the nonfaulty path between s and t in a k-safe H_n with |F|⩽2^k(n-k)-1 is at least d(s,t)+2(k+1) for 0⩽k⩽n/2. This implies that the path lengths found by the algorithms for unicast in the 1-safe and 2-safe hypercubes are optimal 相似文献

18.

Optimized broadcasting and multicasting protocols in cut-throughrouted networks

Cohen J. Fraigniaud P. Konig J. Raspaud A. 《Parallel and Distributed Systems, IEEE Transactions on》1998,9(8):788-802

This paper addresses the one-to-all broadcasting problem and the one-to-many broadcasting problem, usually simply called broadcasting and multicasting, respectively. Broadcasting is the information dissemination problem in which a node of a network sends the same piece of information to all the other nodes. Multicasting is a partial broadcasting in the sense that only a subset of nodes forms the destination set. Both operations have many applications in parallel and distributed computing. In this paper, we study these problems in both line model, and cut-through model. The former assumes long distance calls between nonneighboring processors. The latter strengthens the line model by taking into account the use of a routing function. Long distance calls are possible in circuit-switched and wormhole-routed networks, and also in many networks supporting optical facilities. In the line model, it is well known that one can compute in polynomial time a [log₂n]-round broadcast or multicast protocol for any arbitrary network. Unfortunately such a protocol is often inefficient from a practical point of view because it does not use the resources of the network in a balanced way. In this paper, we present a new algorithm to compute broadcast or multicast protocols. This algorithm applies under both line and cut-through models. Moreover, it returns protocols that efficiently use the bandwidth of the network. From a complexity point of view, we also show that most of the optimization problems relative to the maximization of the efficiency of broadcast or multicast protocols in terms of switching time or vertex load are NP-complete. We have, however, derived polynomial efficient solutions for tree-networks 相似文献

19.

Computing Large Subcubes in Residual Hypercubes

Sridhar M. A. Raghavendra C. S. 《Journal of Parallel and Distributed Computing》1995,24(2)

相似文献

20.

Parallel branch and bound on fine-grained hypercube multiprocessors

Frank Dehne Afonso G. Ferreira Andrew Rau-Chaplin 《Parallel Computing》1990,15(1-3):201-209

In this paper, we study parallel branch and bound on fine grained hypercube multiprocessors. Each processor in a fine grained system has only a very small amount of memory available. Therefore, current parallel branch and bound methods for coarse grained systems ( 1000 nodes) cannot be applied, since all these methods assume that every processor stores the path from the node it is currently processing back to the node where the process was created (the back-up path). Furthermore, the much larger number of processors available in a fine grained system makes it imperative that global information (e.g. the current best solution) is continuously available at every processor; otherwise the amount of unnecessary search would become intolerable. We describe an efficient branch-and-bound algorithm for fine grained hypercube multiprocessors. Our method uses a global scheme where all processors collectively store all back-up paths such that each processor needs to store only a constant amount of information. At each iteration of the algorithm, all current nodes may decide whether they need to create new children, be pruned, or remain unchanged. We describe an algorithm that, based on these decisions, updates the current back-up paths and distributes global information in O(log m) steps, where m is the current number of nodes. This method also includes dynamic allocation of search processes to processors and provides optimal load balancing. Even if very drastic changes in the set of current nodes occur, our load balancing mechanism does not suffer any slow down. 相似文献