共查询到20条相似文献,搜索用时 15 毫秒
1.
We consider the problem of routing packets on an MIMD mesh-connected array of processors augmented with row and column buses. We give lower bounds and randomized algorithms
for the problem of routing k-permutations (where each processor is the source and destination of exactly k packets) on a d-dimensional mesh with buses, which we call the (k,d)-routing problem.
We give a general class of ``hard' permutations which we use to prove lower bounds for the (k,d)-routing problem, for all k,d≥ 1. For the (1,1)- and (1,2)-routing problems the worst-case permutations from this class are identical to ones published by other authors, as are the
resulting lower bounds. However, we further show that the (1,d)-routing problem requires 0.72 ... n steps for d=3, 0.76 ... n steps for d=4, and slightly more than steps for all d≥ 5. We also obtain new lower bounds for the (k,d)-routing problem for k,d > 1, which improve on the bisection lower bound in some cases. These lower bounds hold for off-line routing as well.
We develop efficient algorithms for the (k,1)-routing problem and for the problem of routing k-randomizations (where each processor has k packets initially and each packet is routed to a random destination) on the one-dimensional mesh and use them in a general
(k,d)-routing algorithm which improves considerably on previous results. In particular, the routing time for the (1,d)-routing problem is bounded by steps with high probability (whp), whenever for some constant ε > 0, and the routing time for the (k,d)-routing problem is steps whp whenever for some constant ε > 0 and k≥ 3.6 ... d, matching the bisection lower bound.
We then present a simple algorithm for the (2,2)-routing problem running in 1.39 ... n+o(n) steps whp. Finally, for the important special case of routing permutations on two-dimensional meshes with buses, the (1,2)-routing problem, we give a more sophisticated algorithm that runs in 0.78 ... n+o(n) steps whp.
Received May 18, 1994; revised June 23, 1995. 相似文献
2.
3.
Mike Barnett David G. Payne Robert A. van de Geijn Jerrell Watts 《Journal of Parallel and Distributed Computing》1996,35(2):111
We address the problem of broadcasting on two-dimensional mesh architectures with an arbitrary (non-power-of-two) number of nodes in each dimension. It is assumed that such mesh architectures employ cut-through or wormhole routing. The primary focus is on avoiding network conflicts in the various proposed algorithms. We give algorithms for performing a conflict-free minimum-spanning tree broadcast, a pipelined algorithm that is similar to Ho and Johnsson's EDST algorithm for hypercubes, and a novelscatter–collectapproach that is a natural choice for communication libraries due to its simplicity. Results obtained on the Intel Paragon system are included. 相似文献
4.
In this paper we present efficient algorithms for packet routing on the reconfigurable linear array and the reconfigurable
two-dimensional mesh. We introduce algorithms that are efficient in the worst case and algorithms that are better on average.
The time bounds presented are better than those achievable on the conventional mesh and previously known algorithms. We present
two variants of the reconfigurable mesh. In the first model, M
r
, the processors are attached to a reconfigurable bus, the individual edge connections being bidirectional. In the second
model, M
mr
, the processors are attached to two unidirectional buses. In this paper we present lower bounds and nearly matching upper
bounds for packet routing on these two models. As a consequence, we solve two of the open problems mentioned in [9].
Received August 17, 1998; revised November 3, 1999. 相似文献
5.
A fault-tolerant routing method that can tolerate solid faults using only two virtual channels is presented. The proposed
routing algorithm, called FT-Ecube, not only uses a fewer number of virtual channels but also tolerates f-chains in the meshes.
Furthermore, the proposed scheme misroutes messages both clockwise and counter clockwise directions to reduce channel contention
on f-rings. It is shown that the proposed algorithm is deadlock-free and livelock-free in meshes when it has nonoverlapping
multiple f-regions. Further, we conducted flit-level simulations to evaluate the performance of the proposed routing algorithm.
As our simulation results show, FT-Ecube tolerates multiple faulty blocks using only two virtual channels per physical channel,
and has good performance in terms of average latency.
This work is supported by the NSF grant MIP-9705738 相似文献
6.
通过建立一个n×n二维总线网络上的消息传递模型,分析了基于总线网络的确定寻径算法性能,得出了任何基于n×n二维总线网络的确定寻径算法都至少需要1.5n步的结论。并由此推广到多维总线网络,得出结点总数为N的δ维总线网络上的确定寻径算法需要步。 相似文献
7.
《Journal of Parallel and Distributed Computing》2000,60(2):137-149
An optimal ⌈1.5N1/2⌉ lower bound is shown for oblivious routing on the mesh of buses: a two-dimensional parallel model consisting of N1/2×N1/2 processors and N1/2 row and N1/2 column buses but no local connections between neighboring processors. Many lower bound proofs for routing on mesh-structured models use a single instance (adversary) which includes difficult packet-movement. This approach does not work in our case; our proof is one of the rare cases which really exploit the fact that the routing algorithm has to cope with many different instances. Note that the two-dimensional mesh of buses includes 2N1/2 buses and each processor can access two different buses. Apparently the three-dimensional model provides more communication facilities, namely including 3N2/3 buses, and each processor can access three different buses. Surprisingly, however, the oblivious routing on the three-dimensional mesh of buses needs more time, i.e., Ω(N2/3) steps, which is another important result of this paper. 相似文献
8.
9.
Ji-Peng Zhou 《计算机科学技术学报》2005,20(6):822-830
In wormhole meshes, a reliable routing is supposed to be deadlock-free and fault-tolerant. Many routing algorithms are able to tolerate a large number of faults enclosed by rectangular blocks or special convex, none of them, however, is capable of handling two convex fault regions with distance two by using only two virtual networks. In this paper, a fault-tolerant wormhole routing algorithm is presented to tolerate the disjointed convex faulty regions with distance two or no less, which do not contain any nonfaulty nodes and do not prohibit any routing as long as nodes outside faulty regions are connected in the mesh network. The processors' overlapping along the boundaries of different fault regions is allowed. The proposed algorithm, which routes the messages by X-Y routing algorithm in fault-free region, can tolerate convex fault-connected regions with only two virtual channels per physical channel, and is deadlock- and livelock-free. The proposed algorithm can be easily extended to adaptive routing. 相似文献
10.
Barnett M. Littlefield R. Payne D. G. Vandegeijn R. 《Journal of Parallel and Distributed Computing》1995,24(2)
The problem of performing a global combine (summation) operation on a distributed memory computer using a two-dimensional mesh interconnect with wormhole routing is considered. We present algorithms that are asymptotically optimal for short vectors (O(log(p)) for p processing nodes) and for long vectors (O(n) for n data elements per node), as well as hybrid algorithms that are superior for intermediate n. The algorithms are analyzed using detailed performance models that include the effects of link conflicts and other characteristics of the underlying communication system. The models are validated using experimental data from the Intel Touchstone DELTA computer. We show that no one algorithm is optimal for all vector lengths; rather, each of the presented algorithms is superior under some circumstances. 相似文献
11.
《Journal of Parallel and Distributed Computing》1994,20(3):341-356
The critical problem in creating practical online SIMD mesh routing algorithms is to minimize both the number of communication steps and the size and complexity of the queues required at each PE (processing element). Currently, the best available algorithms for likely array sizes require 16n routing steps with queue size 1; if priority queues of size 2q − 1 are allowed, the number of routing steps required is reduced to 14n/q + 2n. We present an algorithm (the MGRA), based on wormhole routing, that has routed a large number of communication patterns (all patterns tried besides a synthetically constructed worst case) in 5n routing steps with a FIFO queue of size 2. We also show that the MGRA can be modified for meshes with broadcast buses and reconfigurable broadcast buses to route in a similar number of routing steps but with a queue size of 1. A second algorithm (the CGRA) uses reconfigurable broadcast buses in implementing cut-through routing. Using the CGRA, sparse patterns are routed in a small constant number of communication steps. We prove that the MGRA has bad worst case performance, but also show that a randomizing preprocessing step can improve the predictability of the original result. Finally, we show how performance scales with changing inter- and intra-PE path widths. 相似文献
12.
《Journal of Parallel and Distributed Computing》1995,27(1):56-70
A deadlock-free fully adaptive minimal routing algorithm for meshes that is optimal in the number of virtual channels required and in the number of restrictions placed on the use of these virtual channels is presented. It is also proved that, ignoring symmetry, this routing algorithm is the only fully adaptive routing algorithm with uniform routers that achieves both of these goals. This new algorithm requires only 4n-2 virtual channels for an n-dimensional mesh. In addition, if more than the minimum number of virtual channels is available, the routing algorithm can use these additional channels with the fewest possible number of restrictions. The proofs are first presented for the 2D mesh and then generalized to n-dimensional meshes. 相似文献
13.
《Computer Architecture Letters》2004,3(1):3-3
In this paper we present a methodology to design fault-tolerant routing algorithms for regular direct interconnection networks. It supports fully adaptive routing, does not degrade performance in the absence of faults, and supports a reasonably large number of faults without significantly degrading performance. The methodology is mainly based on the selection of an intermediate node (if needed) for each source-destination pair. Packets are adaptively routed to the intermediate node and, at this node, without being ejected, they are adaptively forwarded to their destinations. In order to allow deadlock-free minimal adaptive routing, the methodology requires only one additional virtual channel (for a total of three), even for tori. Evaluation results for a 4 x 4 x 4 torus network show that the methodology is 5-fault tolerant. Indeed, for up to 14 link failures, the percentage of fault combinations supported is higher than 99.96%. Additionally, network throughput degrades by less than 10% when injecting three random link faults without disabling any node. In contrast, a mechanism similar to the one proposed in the BlueGene/L, that disables some network planes, would strongly degrade network throughput by 79%. 相似文献
14.
15.
16.
当系统包含很少的故障点时.mesh/torus网整个系统就有可能是不可靠的.该文采用扩展的局部可靠性信息来指导三维mesh/torus网的容错路由.扩展的局部可靠性信息在每个平面内部对无故障节点分类,所以系统中的故障块也是在不同的平面上构成的,而不是基于整个系统.很多基于整个系统不可靠的节点在二维的平面中都会变成可靠的节点.不管是在可靠的系统内,甚或不可靠的系统内,扩展的局部可靠性信息都能有效地指导容错路由.不同于以往的方法,作者的方法不会将任何无故障节点设置为无效节点.所有的故障块都是在平面内构成的,而不是基于整个系统;在一个平面内.任何包含在故障块里的无故障节点仍然可作为出发点或者目标点,这样将大大提高系统的计算能力和性能.模拟结果表明该文方法大大优于已有的方法. 相似文献
17.
Selection on k-Dimensional Meshes with Multiple Broadcasting 总被引:1,自引:0,他引:1
18.
Bhagavathi D. Olariu S. Schwing J. L. Shen W. Wilson L. Zhang J. 《Journal of Parallel and Distributed Computing》1995,27(2)
Our contribution is twofold. First, we show that Ω(log n) is a time lower bound on the CREW-PRAM and the mesh with multiple broadcasting for the tasks of computing the perimeter, the area, the diameter, the width, the modality, the smallest-area enclosing rectangle, and the largest-area inscribed triangle of a convex n-gon. We show that the same time lower bound holds for the tasks of detecting whether a convex n-gon lies inside another as well as for computing the maximum distance between two convex n-gons. We obtain our time lower bound results for the CREW-PRAM by using a novel technique involving geometric constructions. These constructions allow us to reduce the well-known OR problem to each of the geometric problems of interest. We then port these time lower bounds to the mesh with multiple broadcasting using simulation results. Our second contribution is to show that the Ω(log n) time lower bound is tight by providing O(log n) time algorithms to solve these problems on a mesh with multiple broadcasting of size n × n. Finally, we show that for two separable convex n-gons P and Q, the task of computing the minimum distance between P and Q can be performed in O(1) time on a mesh with multiple broadcasting of size n × n. 相似文献
19.
20.
Amir Vaxman 《Computer Graphics Forum》2012,31(5):1647-1656
We offer a framework for editing and modeling of planar meshes, focusing on planar quad, and hexagonal‐dominant meshes, which are held in high demand in the field of architectural design. Our framework manipulates these meshes by affine maps that are assigned per‐face, and which naturally ensure the planarity of these faces throughout the process, resulting in a linear subspace of compatible planar deformations for any given mesh. Our modeling metaphors include classical handle‐based editing, mesh interpolation, and shape‐space exploration, all of which allow for an intuitive way to produce new polyhedral and near‐polyhedral meshes by editing. 相似文献