首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
A theory for the design of deadlock-free adaptive routing algorithms for wormhole networks, proposed by the author (1991, 1993), supplies sufficient conditions for an adaptive routing algorithm to be deadlock-free, even when there are cyclic dependencies between channels. Also, two design methodologies were proposed. Multicast communication refers to the delivery of the same message from one source node to an arbitrary number of destination nodes. A tree-like routing scheme is not suitable for hardware-supported multicast in wormhole networks because it produces many headers for each message, drastically increasing the probability of a message being blocked. A path-based multicast routing model was proposed by Lin and Ni (1991) for multicomputers with 2D-mesh and hypercube topologies. In this model, messages are not replicated at intermediate nodes. This paper develops the theoretical background for the design of deadlock-free adaptive multicast routing algorithms. This theory is valid for wormhole networks using the path-based routing model. It is also valid when messages with a single destination and multiple destinations are mixed together. The new channel dependencies produced by messages with several destinations are studied. Also, two theorems are proposed, developing conditions to verify that an adaptive multicast routing algorithm is deadlock-free, even when there are cyclic dependencies between channels. As an example, the multicast routing algorithms of Lin and Ni are extended, so that they can take advantage of the alternative paths offered by the network  相似文献   

2.
The issues of adaptive multicast wormhole routing in 2D mesh multicomputers are studied. Three adaptive multicast wormhole routing strategies are proposed and evaluated. The methods include minimal partially adaptive, minimal fully adaptive, and nonminimal adaptive routing. All the algorithms are shown to be deadlock-free. A study has been conducted that compares the performance of these multicast algorithms. The results show that the minimal fully adaptive routing method creates the least traffic; however, double vertical channels are required in order to avoid deadlock. The nonminimal routing algorithm exhibits the best adaptivity, although it creates more network traffic than the other methods.  相似文献   

3.
A mesh network is a popular architecture which has been implemented in many multicomputer systems. It is preferred because it offers useful edge connectivity and is partitioned into units that are still meshes. It is also scalable and has a number of features that make it particularly amenable to high-performance computing. The 2-D mesh topology has become increasingly important to multicomputer design because of its many desirable properties including scalability, low bandwidth and fixed degree of nodes.The essential pattern in new multicomputer generations is the multicast wormhole pattern, which corresponds to one-to-many communication in which one source sends the same message to multiple destination nodes. In wormhole routing, a message is decomposed into words or flits, and flits of one message may be spread out among several nodes. Deadlock in the interconnection network occurs when no message can advance towards its destination. Some deadlock-free routing algorithms for wormhole routing were proposed, but the network latency and the network traffic were not taken into consideration. An optimal message routing should achieve both minimum traffic and minimum latency for the communication patterns involved. Unfortunately, finding optimal message routing has been shown to be NP-hard for most common multicomputer topologies.In this paper, an efficient algorithm (YOMNA) is introduced to find a deadlock-free multicast wormhole routing in 2-D mesh parallel machines. YOMNA algorithm is a tree-based technique, in which the router simultaneously sends incoming flits on more than one outgoing channel. YOMNA algorithm is compared with the dual-path multicast routing, which is a path-based technique. YOMNA algorithm has proved to be deadlock free. The network latency and the network traffic are calculated for YOMNA algorithm and for the dual-path multicast routing. The results demonstrate that YOMNA algorithm outperformed the dual-path routing.  相似文献   

4.
Multicast communication, in which the same message is delivered from a source node to an arbitrary number of destination nodes, is being increasingly demanded in parallel computing. System supported multicast services can potentially offer improved performance, increased functionality, and simplified programming, and may in turn be used to support various higher-level operations for data movement and global process control. This paper presents efficient algorithms to implement multicast communication in wormhole-routed direct networks, in the absence of hardware multicast support, by exploiting the properties of the switching technology. Minimum-time multicast algorithms are presented for n-dimensional meshes and hypercubes that use deterministic, dimension-ordered routing of unicast messages. Both algorithms can deliver a multicast message to m-1 destinations in [log 2 m] message passing steps, while avoiding contention among the constituent unicast messages. Performance results of implementations on a 64-node nCUBE-2 hypercube and a 168-node Symult 2010 2-D mesh are given  相似文献   

5.
We show that deadlocks due to dependencies on consumption channels are a fundamental problem in wormhole multicast routing. This type of resource deadlocks has not been addressed in many previously proposed wormhole multicast algorithms. We also show that deadlocks on consumption channels can be avoided by using multiple classes of consumption channels and restricting the use of consumption channels by multicast messages. We provide upper bounds for the number of consumption channels required to avoid deadlocks. In addition, we present a new multicast routing algorithm, column-path, which is based on the well-known dimension-order routing used in many multicomputers and multiprocessors. Therefore, this algorithm could be implemented in existing multicomputers with simple changes to the hardware. Using simulations, we compare the performance of the proposed column-path algorithm with the previously proposed Hamiltonian-path-based multipath and an e-cube-based multicast routing algorithms. Our results show that for multicast traffic, the column-path routing offers higher throughputs, while the multipath algorithm offers lower message latencies. Another result of our study is that the commonly implemented simplistic scheme of sending one copy of a multicast message to each of its destinations exhibits good performance provided the number of destinations is small  相似文献   

6.
Efficient routing of messages is a key to the performance of multicomputers. Multicast communication refers to the delivery of the same message from a source node to an arbitrary number of destination nodes. While multicast communication is highly demanded in many applications, most of the existing multicomputers do not directly support this service; rather it is indirectly supported by multiple one-to-one or broadcast communications, which result in more network traffic and a waste of system resources. The authors study routing evaluation criteria for multicast communication under different switching technologies. Multicast communication in multicomputers is formulated as a graph theoretical problem. Depending on the evaluation criteria and switching technologies, they study three optimal multicast communication problems, which are equivalent to the finding of the following three subgraphs: optimal multicast path, optimal multicast cycle, and minimal Steiner tree, where the interconnection of a multicomputer defines a host graph. They show that all these optimization problems are NP-complete for the popular 2D-mesh and hypercube host graphs. Heuristic multicast algorithms for these routing problems are proposed  相似文献   

7.
Multistage interconnection networks are a popular class of interconnection architecture for constructing scalable parallel computers (SPCs). The focus of this paper is on the multistage network system which supports wormhole routed turnaround routing. Existing machines characterized by such a system model include the IBM SP-1 and SP-2, TMC CM-5, and Meiko CS-2. Efficient collective communication among processor nodes is critical to the performance of SPCs. A system-level multicast service, in which the same message is delivered from a source node to an arbitrary number of destination nodes, is fundamental in supporting collective communication primitives including the application-level broadcast, reduction, and barrier synchronization. This paper addresses how to efficiently implement multicast services in wormhole-routed multistage networks, in the absence of hardware multicast support, by exploiting the properties of the turnaround switching technology. An optimal multicast algorithm is proposed. The results of implementations on a 64-node SP-1 show that the proposed algorithm significantly outperforms the application-level broadcast primitives provided by currently existing collective communication libraries including the public domain MPI  相似文献   

8.
This paper addresses the problem of fault-tolerant multicast routing in wormholerouted multicomputers.A new pseudo-cycle-based routing method is presented for constructing deadlock-free multicast routing algorithms.With at most two virtual channels this technique can be applied to any connected networks with arbitrary topologies.Simulation results show that this technique results in negligible performance degradation even in the presence of a large number of faulty nodes.  相似文献   

9.
This paper presents a framework to design fully-adaptive, deadlock-free wormhole algorithms for a variety of network topologies. The main theoretical contributions are: (a) design of new wormhole algorithms using store-and-forward algorithms, (b) a sufficient condition for deadlock free routing by the wormhole algorithms so designed, and (c) a sufficient condition for deadlock free routing by these wormhole algorithms with centralized flit buffers shared among multiple channels. To illustrate the theory, several wormhole algorithms based on store-and-forward hop schemes are designed. The hop-based wormhole algorithms can be applied to a variety of networks including torus, mesh, de Brujin, and a class of Cayley networks, with the best known bounds on virtual channels for minimal routing on the last two classes of networks. An analysis of the resource requirements and performances of a proposed algorithm, called negative-hop algorithm, with some of the previously proposed algorithms for torus and mesh networks is presented  相似文献   

10.
In wormhole meshes, a reliable routing is supposed to be deadlock-free and fault-tolerant. Many routing algorithms are able to tolerate a large number of faults enclosed by rectangular blocks or special convex, none of them, however, is capable of handling two convex fault regions with distance two by using only two virtual networks. In this paper, a fault-tolerant wormhole routing algorithm is presented to tolerate the disjointed convex faulty regions with distance two or no less, which do not contain any nonfaulty nodes and do not prohibit any routing as long as nodes outside faulty regions are connected in the mesh network. The processors' overlapping along the boundaries of different fault regions is allowed. The proposed algorithm, which routes the messages by X-Y routing algorithm in fault-free region, can tolerate convex fault-connected regions with only two virtual channels per physical channel, and is deadlock- and livelock-free. The proposed algorithm can be easily extended to adaptive routing.  相似文献   

11.
This paper addresses the problem of one-to-many, or multicast, communication in wormhole-routed,n-dimensional torus networks. The proposed methods are designed for systems that support intermediate reception, which permits multidestination messages to be pipelined through several nodes, depositing a copy at each node. A key issue in the design of such systems is the routing function, which must support both unicast and multicast traffic while preventing deadlock among messages. An efficient, deadlock-free routing function is developed and used as a basis for a family of multicast algorithms. TheS-torusmulticast algorithm uses a single multidestination message to perform an arbitrary multicast operation. TheM-torusalgorithm is a generalized multiphase multicast algorithm, in which a combination of multidestination messages is used to perform a multicast in one or more communication steps. Two specific instances of the M-torus algorithm, theMd-torusandMu-torusmulticast algorithms, are presented. These algorithms produce contention-free multicast operations and are deadlock-free under all combinations of network traffic. A simulation study compares the performance of the different multicast algorithms, and implementation issues are discussed. The results of this research are applicable to the design of architectures for both wormhole-routed massively parallel computers and high-speed local area networks with wormhole-routed switch fabrics.  相似文献   

12.
Both adaptive unicast routing and efficient multicast communication have been shown to be important to the performance of distributed-memory multiprocessors, or multicomputers. In this paper, we propose a uniform adaptive routing strategy for wormhole-routed hypercube networks that accommodates both unicast and multicast communication. Based on a node labeling method, the resultant routing algorithms are shown to be deadlock-free without requiring virtual channels. We present an optimal ordering algorithm that minimizes the traffic generated under the proposed paradigm. A greedy algorithm with less time complexity is also proposed.  相似文献   

13.
Most machines of the last generation of distributed memory parallel computers possess specific routers which are used to exchange messages between nonneighboring nodes in the network. Among the several technologies, wormhole routing is usually preferred because it allows low channel-setup time and reduces the dependency between latency and internode distance. However, wormhole routing is very susceptible to deadlock because messages are allowed to hold many resources while requesting others. Therefore, designing deadlock-free routing algorithms using few hardware facilities is a major problem for wormhole-routed networks. In this paper, we describe a general theoretical framework for the study of deadlock-free routing functions. We give a general definition of what can be a routing function. This definition captures many specific definitions of the literature (e.g., vertex dependent, input-dependent, source-dependent, path-dependent etc.). Using our definition, we give a necessary and sufficient condition which characterizes deadlock-free routing functions. Our theory embraces, at a high level, most of the theories related to deadlock avoidance in wormhole-routed networks previously derived in the literature. In particular, it applies not only to one-to-one routing, but also to one-to-many routing. The latter paradigm is used to solve the multicast problem with the path-based or tree-based facility  相似文献   

14.
M.G.  A.A.  M.A.  K.   《Journal of Systems Architecture》2008,54(10):919-928
A torus network has become increasingly important to multicomputer design because of its many features including scalability, low bandwidth and fixed degree of nodes. A multicast communication is a significant operation in multicomputer systems and can be used to support several other collective communication operations. This paper presents an efficient algorithm, TTPM, to find a deadlock-free multicast wormhole routing in two-dimensional torus parallel machines. The introduced algorithm is designed such that messages can be sent to any number of destinations within two start-up communication phases; hence the name Torus Two Phase Multicast (TTPM) algorithm. An efficient routing function is developed and used as a basis for the introduced algorithm. Also, TTPM allows some intermediate nodes that are not in the destination set to perform multicast functions. This feature allows flexibility in multicast path selection and therefore improves the performance. Performance results of a simulation study on torus networks are discussed to compare TTPM algorithm with a previous algorithm.  相似文献   

15.
We address the problem of broadcasting on two-dimensional mesh architectures with an arbitrary (non-power-of-two) number of nodes in each dimension. It is assumed that such mesh architectures employ cut-through or wormhole routing. The primary focus is on avoiding network conflicts in the various proposed algorithms. We give algorithms for performing a conflict-free minimum-spanning tree broadcast, a pipelined algorithm that is similar to Ho and Johnsson's EDST algorithm for hypercubes, and a novelscatter–collectapproach that is a natural choice for communication libraries due to its simplicity. Results obtained on the Intel Paragon system are included.  相似文献   

16.
江松  郑世荣 《计算机学报》1997,20(3):223-229
如何获得死锁而且通信性能良好的选路算法始终是人们十分关心的问题。本文提出了纯分流点的概念并证明了选路算法无死锁的充要条件,从理论上解决了死锁关系问题,为死锁的判定,消除和设计无死锁的算法提供了有力的依据。  相似文献   

17.
This paper presents a general methodology for generating deadlock-free routing algorithms for irregular networks. Constructing a spanning tree on the given network, assigning directions to the network channels, creating deadlock-free zones, and specifying a logical sequence of the produced deadlock-free zones are the four fundamental steps that the proposed methodology takes to generate deadlock-free and connected routing algorithms. By applying the proposed methodology with two known labeling methods we have generated six irregular routing algorithms: three of them are novel routing algorithms and three of them (the Up/Down, Left/Right, and L-turn routing algorithms) have already been proposed in the literature. Extensive simulation experiments have been performed considering various network topologies, different network sizes (considering different network nodes and network channels), various message lengths, a variety of spanning tree roots, and a wide range of message (traffic) generation rates. Simulation results show that the six routing algorithms can be divided into three pairs. Routing members of each pair show similar behavior in terms of message latencies and saturation generation rates. However, it is worth noting that for a given topology the performance of the six routing algorithms may be totally different and it mainly depends on the network topology.  相似文献   

18.
Fault-tolerant message routing mechanism is a key to the performance of reliable multicomputers. Multicast refers to the delivery of the same message from a source node to an arbitrary number of destination nodes. This paper presents two types of partially adaptive fault tolerant multicast algorithms. The Type A algorithm can deliver messages to all destinations through shortest paths if each fault-free node has at most one faulty neighbor. The Type B algorithm can deliver messages to all destinations if the total number of faulty links and faulty nodes is less than the dimension of the hypercube. The proposed algorithms have the following important features: they are distributed, they only require local information to determine the paths, and they need very little additional message overhead. The performance of the algorithms, measured by the traffic created by the communication, is very close to that in fault-free hypercubes.  相似文献   

19.
In multicomputers that utilize circuit switching or wormhole routing, communication overhead depends largely on link contention-the variation due to distance between nodes is negligible. This has a major impact on the load balancing problem. In this case there are some nodes with an excess load (sources) and other with a deficit load (sinks). A matching of sources to sinks is required to avoid contention. The problem is made complex by the hardwired routing on currently available machines: The user can control only which nodes communicate but not how the messages are routed. Network flow models of message flow in the mesh and the hypercube have been developed to solve this problem. The crucial property of these models is the correspondence between minimum cost flows and correctly routed messages. To solve a given load balancing problem, a minimum cost flow algorithm is applied to the network. This permits the efficient determination of a maximum contention free matching of sources to sinks that, in turn, tells how much of the given imbalance can be eliminated without contention  相似文献   

20.
多播路由算法对互连网络的通信性能和多处理机系统性能的发挥起着重要作用。针对基三分层互连网络,在权衡性能、成本和实现的基础上,提出一种基于树的受限多播路由算法TRMA。该算法充分利用基三分层互连网络的层次特性和节点编码中所含的网络拓扑信息实现消息路由,算法设计简单,易于硬件实现。和其他基于树的多播路由算法相比,TRMA算法不需要源节点在发送消息前构建多播树,并将多播树的信息存放在消息中,大大降低了源节点的工作负载,提高整个系统的性能。通过仿真比较了TRMA和基于单播的多播路由算法,结果表明TRMA具有较低的网络延迟和较小的网络流量。  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号