Similar literature
 20 similar documents found
1.
《Performance Evaluation》2001,43(2-3):165-179
Many fully adaptive algorithms have been proposed in the literature over the past decade. The performance characteristics of most of these algorithms have been analysed by means of software simulation only. This paper proposes an analytical model to predict message latency in wormhole-routed k-ary n-cubes with fully adaptive routing. The analysis focuses on Duato’s fully adaptive routing algorithm [IEEE Trans. Parall. Distrib. Syst. 4 (2) (1993) 320], which is widely accepted as the most general algorithm for achieving adaptivity in wormhole-routed networks while allowing for an efficient router implementation. The proposed model is general in that it exhibits a good degree of accuracy for various network sizes and under different operating conditions.

2.
This paper identifies performance degradation in wormhole-routed k-ary n-cube networks due to the limited number of router-to-processor consumption channels at each node. Much recent research on wormhole routing has advocated the advantages of adaptive routing and virtual channel flow control schemes to deliver better network performance. This paper indicates that the advantages associated with these schemes cannot be realized with limited consumption capacity. To alleviate such performance bottlenecks, a new network interface design using multiple consumption channels is proposed. To match virtual multiplexing on network channels, we also propose that each consumption channel support multiple virtual consumption channels. The impact of message arrival rate at a node on the required number of consumption channels is studied analytically. It is shown that wormhole networks with higher routing adaptivity, dimensionality, degree of hot-spot traffic, and number of virtual lanes have to take advantage of multiple consumption channels to deliver better performance. The interplay between system topology, routing algorithm, number of virtual lanes, messaging overheads, and communication traffic is studied through simulation to derive the effective number of consumption channels required in a system. Based on the ongoing technological trend, it is shown that wormhole-routed systems can use two to four consumption channels per node to deliver better system performance.

3.
Virtual channels yield significant improvement in the performance of wormhole-routed networks as they can greatly reduce message blocking over network resources. K-ary n-cubes with deterministic routing have been widely analysed using analytical modelling tools. Most existing models, however, have either entirely ignored the effects of virtual channel multiplexing or have not considered the impact of virtual channel allocation on message latency. This paper discusses two different organisations of virtual channels in k-ary n-cubes, resulting in two deterministic routing algorithms. It then proposes an analytical model to compute message latency for the two routing algorithms. The proposed model is used in a case study to demonstrate the sensitivity of network latency to the way virtual channels are allocated to messages.

4.
Adaptive routing and virtual channels are used to increase routing adaptivity in wormhole-routed two-dimensional meshes. But increasing channel buffer utilization without considering even distribution of the traffic loads tends to cause congestion in the most adaptive routing area. To avoid such traffic congestion, the concept of a restricted area is proposed. The proposed restricted area, defined to be a part of the network where message transmission concentrates, can be located following the region of adaptivity. By properly guiding message routing inside and outside the area, we are able to achieve more balanced buffer utilization and to reduce traffic congestion accordingly. The performance of several routing algorithms with and without the restricted area is simulated and evaluated under various traffic loads and distribution patterns. The results indicate that routing algorithms with the restricted area yield consistently higher throughput and lower latency than routing algorithms without it.

5.
Analytical models of fully adaptive routing for common wormhole-routed networks (e.g., hypercubes) under the uniform traffic pattern have recently been reported in the literature. However, many studies have revealed that the performance advantages of adaptive routing over deterministic routing are more noticeable when the traffic is nonuniform due to, for example, the existence of hot spots in the network. This paper proposes a new queueing model of fully adaptive routing in the hypercube in the presence of hot spot traffic. The analysis focuses on Duato's algorithm, but can easily be applied to other fully adaptive routing algorithms. Results from simulation experiments are presented to validate the model.

6.
In this paper, we introduce a new approach to deadlock-free routing in wormhole-routed networks called the message flow model. This method may be used to develop deterministic, partially-adaptive, and fully-adaptive routing algorithms for wormhole-routed networks with arbitrary topologies. We first establish the necessary and sufficient condition for deadlock-free routing, based on the analysis of the message flow on each channel. We then use the model to develop new adaptive routing algorithms for 2D meshes.

7.
This paper addresses the problem of one-to-many, or multicast, communication in wormhole-routed, n-dimensional torus networks. The proposed methods are designed for systems that support intermediate reception, which permits multidestination messages to be pipelined through several nodes, depositing a copy at each node. A key issue in the design of such systems is the routing function, which must support both unicast and multicast traffic while preventing deadlock among messages. An efficient, deadlock-free routing function is developed and used as a basis for a family of multicast algorithms. The S-torus multicast algorithm uses a single multidestination message to perform an arbitrary multicast operation. The M-torus algorithm is a generalized multiphase multicast algorithm, in which a combination of multidestination messages is used to perform a multicast in one or more communication steps. Two specific instances of the M-torus algorithm, the Md-torus and Mu-torus multicast algorithms, are presented. These algorithms produce contention-free multicast operations and are deadlock-free under all combinations of network traffic. A simulation study compares the performance of the different multicast algorithms, and implementation issues are discussed. The results of this research are applicable to the design of architectures for both wormhole-routed massively parallel computers and high-speed local area networks with wormhole-routed switch fabrics.

8.
Several recent studies have shown that adaptive routing algorithms based on deadlock recovery have performance characteristics superior to those based on deadlock avoidance. Most of these studies, however, have relied on software simulation due to the lack of analytical modelling tools. In an effort towards filling this gap, this paper presents a new analytical model of compressionless routing in wormhole-routed hypercubes. This routing algorithm exploits the tight coupling between wormhole routers for flow control to detect and recover from potential deadlock situations. The advantages of compressionless routing include deadlock-free adaptive routing with no extra virtual channels, simple router design, and order-preserving message transmission. The proposed analytical model computes message latency by determining the message transmission time, blocking delay at each router, multiplexing delay at each network channel, and waiting time in the source before entering the network. The validity of the model is demonstrated by comparing analytical results with those obtained through simulation experiments.

9.
Many performance models for deterministic routing in multicomputer interconnection networks have been derived and analyzed under the assumption of the traditional Poisson stochastic arrival process, which is inherently unable to capture the traffic self-similarity revealed by many real-world parallel applications. In an effort towards understanding the network performance under various traffic loads and different design alternatives, this paper presents an analytical model for dimension-ordered routing in k-ary n-cubes when subjected to self-similar traffic. As the service time, blocking probability and waiting time experienced by a message vary from one dimension to another, the design of such a model for dimension-ordered routing poses greater challenges. The developed analytical model is then used to investigate the efficiency of two different ways to organize virtual channels for deterministic routing and to evaluate the impact of self-similar traffic with various Hurst parameters on network performance.

10.
《Performance Evaluation》2006,63(4-5):423-440
Several analytical models of fully adaptive routing in wormhole-routed networks have recently been reported in the literature. All these models, however, have been discussed for routing algorithms with deadlock avoidance. Recent studies have revealed that deadlocks are quite rare in the network, especially when enough routing freedom is provided. Thus, the hardware resources, e.g. virtual channels, dedicated for deadlock avoidance are not utilised most of the time. This consideration has motivated researchers to introduce fully adaptive routing algorithms with deadlock-recovery. This paper proposes a new analytical model to predict message latency in k-ary n-cubes with compressionless routing, a fully adaptive algorithm that uses deadlock-recovery. The proposed model uses results from queueing systems with impatient customers to capture the effects of the timeout mechanism used in this routing algorithm to deal with message deadlock. The validity of the model is demonstrated by comparing results predicted by the analytical model against those obtained through simulation experiments.

11.
This paper develops detailed analytical performance models for k-ary n-cube networks with single-flit or infinite buffers, wormhole routing, and the nonadaptive deadlock-free routing scheme proposed by Dally and Seitz (1987). In contrast to previous performance studies of such networks, the system is modeled as a closed queueing network that includes the effects of blocking and pipelining of messages in the network, allows for arbitrary source-destination probability distributions, and explicitly models the virtual channels used in the deadlock-free routing algorithm. The models are used to examine several performance issues for 2-D networks with shared-memory traffic. These results should prove useful for engineering high-performance systems based on low-dimensional k-ary n-cube networks.

12.
The multidestination wormhole capability allows a message to be propagated along any valid path in a wormhole-routed network conforming to the underlying base routing scheme. Multicast under the path-based routing model is highly dependent on the spatial locality of the destinations participating in the multicast. In this paper, we propose two proximity grouping schemes for efficient multicast in wormhole-routed mesh networks with multidestination capability by exploiting the spatial locality of the destination set. The first scheme, graph-based proximity grouping, groups destinations by locality to construct several disjoint sub-meshes. This is achieved by modeling the proximity grouping problem as a graph partitioning problem. The second scheme, pattern-based proximity grouping, uses pattern classification schemes to achieve the proximity grouping. Simulation results show routing performance gains over the traditional Hamiltonian-path routing scheme.

13.
Several researchers have analysed the performance of k-ary n-cubes taking into account channel bandwidth constraints imposed by implementation technology, namely the constant wiring density and pin-out constraints for VLSI and multiple-chip technology respectively. For instance, Dally [IEEE Trans. Comput. 39(6) (1990) 775], Abraham [Issues in the architecture of direct interconnection networks schemes for multiprocessors, Ph.D. thesis, University of Illinois at Urbana-Champaign, 1992], and Agrawal [IEEE Trans. Parallel Distributed Syst. 2(4) (1991) 398] have shown that low-dimensional k-ary n-cubes (known as tori) outperform their high-dimensional counterparts (known as hypercubes) under the constant wiring density constraint. However, Abraham and Agrawal have arrived at an opposite conclusion when they considered the constant pin-out constraint. Most of these analyses have assumed deterministic routing, where a message always uses the same network path between a given pair of nodes. More recent multicomputers have incorporated adaptive routing to improve performance. This paper re-examines the relative performance merits of the torus and hypercube in the context of adaptive routing. Our analysis reveals that the torus manages to exploit its wider channels under light traffic. As traffic increases, however, the hypercube can provide better performance than the torus. Our conclusion under the constant wiring density constraint is different from that of the works mentioned above because adaptive routing enables the hypercube to exploit its richer connectivity to reduce message blocking.

14.
Networks of workstations are rapidly emerging as a cost-effective alternative to parallel computers. Switch-based interconnects with irregular topology allow the wiring flexibility, scalability, and incremental expansion capability required in this environment. However, the irregularity also makes routing and deadlock avoidance on such systems quite complicated. In current proposals, many messages are routed following nonminimal paths, increasing latency and wasting resources. In this paper, we propose two general methodologies for the design of adaptive routing algorithms for networks with irregular topology. Routing algorithms designed according to these methodologies allow messages to follow minimal paths in most cases, reducing message latency and increasing network throughput. As an example of application, we propose two adaptive routing algorithms for ANI (previously known as Autonet). They can be implemented either by duplicating physical channels or by splitting each physical channel into two virtual channels. In the former case, the implementation does not require a new switch design. It only requires changing the routing tables and adding links in parallel with existing ones, taking advantage of spare switch ports. In the latter case, a new switch design is required, but the network topology is not changed. Evaluation results for several different topologies and message distributions show that the new routing algorithms are able to increase throughput for random traffic by a factor of up to 4 with respect to the original up*/down* algorithm, also reducing latency significantly. For other message distributions, throughput is increased more than seven times. We also show that most of the improvement comes from the use of minimal routing.

15.
Networks of workstations are becoming increasingly popular as a cost-effective alternative to parallel computers. Typically, these networks connect workstations using irregular topologies, providing the wiring flexibility, scalability, and incremental expansion capability required in this environment. Recently, we proposed two methodologies for the design of adaptive routing algorithms for networks with irregular topology, as well as fully adaptive routing algorithms for these networks. These algorithms increase throughput considerably with respect to previously existing ones, but require the use of at least two virtual channels. In this paper, we propose a very efficient flow control protocol to support virtual channels when link wires are very long and/or have different lengths. This flow control protocol relies on the use of channel pipelining and control flits. Control traffic is minimized by assigning physical bandwidth to a virtual channel until the corresponding message blocks or is completely transmitted. Simulation results show that this flow control protocol performs as efficiently as an ideal network with short wires and flit-by-flit multiplexing. The effect of additional virtual channels per physical channel has also been studied, revealing that the optimal number of virtual channels varies with network size. The use of virtual channel priorities is also analyzed. The proposed flow control protocol may increase short message latency, due to long messages monopolizing channels and hindering the progress of short messages. Therefore, we have analyzed the impact of limiting the number of flits (block size) that a virtual channel may forward once it gets the link. Simulation results show that limiting the maximum block size causes the overall network performance to decrease.

16.
Several analytical models of fully adaptive routing in wormhole-routed k-ary n-cubes under the uniform traffic pattern have recently been proposed in the literature. Although the uniform reference model has been widely used by researchers, it is not always true in practice as there are many applications that exhibit traffic nonuniformity. There has hardly been any study that describes an analytical model of fully adaptive routing under nonuniform traffic conditions. This paper describes the first analytical model of fully adaptive routing in k-ary n-cubes in the presence of nonuniform traffic generated by the digit-reversal permutation, which is an important communication operation found in many matrix computation problems. Results obtained through simulation experiments confirm that the model predicts message latency with a good degree of accuracy under different working conditions.

17.
The fat‐tree is one of the most common topologies among the interconnection networks of the systems currently used for high‐performance parallel computing. Among other advantages, fat‐trees allow the use of simple but very efficient routing schemes. One of them is a recently proposed deterministic routing algorithm that offers performance similar to (or better than) that of adaptive routing while reducing complexity and guaranteeing in‐order packet delivery. However, like other deterministic routing proposals, this deterministic routing algorithm cannot react when high traffic loads or hot‐spot traffic scenarios produce severe contention for the use of network resources, leading to the appearance of Head‐of‐Line (HoL) blocking, which spoils the network performance. In that sense, we describe in this paper two simple, cost‐effective strategies for dealing with the HoL‐blocking problem that may appear in fat‐trees with the aforementioned deterministic routing algorithm. From the results presented in the paper, we conclude that, in the mentioned environment, these proposals considerably reduce HoL‐blocking without significantly increasing switch complexity and the required silicon area. Copyright © 2011 John Wiley & Sons, Ltd.

18.
As networks become larger, scalability and QoS-awareness become important issues that have to be resolved. A large network can be effectively formed as a hierarchical structure, such as the inter/intra-domain routing hierarchy in the Internet and the Private Network-to-Network Interface (PNNI) standard, to resolve these critical issues. Methods of modeling and analyzing the performance of QoS-capable hierarchical networks become an open issue. Although the reduced load approximation technique has been extensively applied to flat networks, the feasibility of applying it to the hierarchical network model has seldom been investigated. Furthermore, most of the research in this area has focused on the performance evaluation with fixed routing. This work proposes an analytical model for evaluating the performance of adaptive hierarchical networks with multiple classes of traffic. We first study the reduced load approximation model for multirate loss networks, and then propose a novel performance evaluation model for networks with hierarchical routing. This model is based on a decomposition of a hierarchical route into several analytic hierarchical segments; therefore the blocking probability of the hierarchical path can be determined from the blocking probabilities of these segments. Numerical results demonstrate that the proposed model for adaptive hierarchical routing yields accurate blocking probabilities. We also investigate the convergence of the analysis model in both the originating-destination (O-D) pair and the alternative hierarchical path. Finally, the blocking probability of the adaptive hierarchical O-D pair is demonstrated to depend on the blocking of all hierarchical paths but not on the order of the hierarchical path of the same O-D pair.

19.
A theory for the design of deadlock-free adaptive routing algorithms for wormhole networks, proposed by the author (1991, 1993), supplies sufficient conditions for an adaptive routing algorithm to be deadlock-free, even when there are cyclic dependencies between channels. Also, two design methodologies were proposed. Multicast communication refers to the delivery of the same message from one source node to an arbitrary number of destination nodes. A tree-like routing scheme is not suitable for hardware-supported multicast in wormhole networks because it produces many headers for each message, drastically increasing the probability of a message being blocked. A path-based multicast routing model was proposed by Lin and Ni (1991) for multicomputers with 2D-mesh and hypercube topologies. In this model, messages are not replicated at intermediate nodes. This paper develops the theoretical background for the design of deadlock-free adaptive multicast routing algorithms. This theory is valid for wormhole networks using the path-based routing model. It is also valid when messages with a single destination and multiple destinations are mixed together. The new channel dependencies produced by messages with several destinations are studied. Also, two theorems are proposed, developing conditions to verify that an adaptive multicast routing algorithm is deadlock-free, even when there are cyclic dependencies between channels. As an example, the multicast routing algorithms of Lin and Ni are extended, so that they can take advantage of the alternative paths offered by the network.

20.
Integrated real-time dynamic routing (IRR) networks provide dynamic routing features for multiple classes-of-service on an integrated transport network. In this paper it is shown that IRR networks allow reduced network management costs since with real-time dynamic routing a number of network operations are simplified or eliminated. These simplifications include eliminating the storage of voluminous routing tables in the network switches, eliminating the calculation of routing tables in network design, simplifying the routing administration operations which require downloading new routing information to the network, and eliminating the automatic rerouting function in on-line traffic management. A new bandwidth allocation technique is described here which is based on the optimal solution of a network bandwidth allocation model for IRR networks. The model achieves significant improvement in both the average network blocking and node pair blocking distribution when the network is in a congested state such as under peak-day loads. In a paper to appear in the next Journal issue we further describe a new algorithm for the transport design of IRR networks which achieves near-optimal capacity engineering. These optimization techniques attain significant capital cost reductions and network performance improvements by properly modeling the more efficient operation of IRR networks.
