期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Deadlock free routing algorithms for irregular mesh topology NoC systems with rectangular regions

Rickard Maurizio Shashi 《Journal of Systems Architecture》2008,54(3-4):427-440

The simplicity of regular mesh topology Network on Chip (NoC) architecture leads to reductions in design time and manufacturing cost. A weakness of the regular shaped architecture is its inability to efficiently support cores of different sizes. A proposed way in literature to deal with this is to utilize the region concept, which helps to accommodate cores larger than the tile size in mesh topology NoC architectures. Region concept offers many new opportunities for NoC design, as well as provides new design issues and challenges. One of the most important among these is the design of an efficient deadlock free routing algorithm. Available adaptive routing algorithms developed for regular mesh topology cannot ensure freedom from deadlocks. In this paper, we list and discuss many new design issues which need to be handled for designing NoC systems incorporating cores larger than the tile size. We also present and compare two deadlock free routing algorithms for mesh topology NoC with regions. The idea of the first algorithm is borrowed from the area of fault tolerant networks, where a network topology is rendered irregular due to faults in routers or links, and is adapted for the new context. We compare this with an algorithm designed using a methodology for design of application specific routing algorithms for communication networks. The application specific routing algorithm tries to maximize adaptivity by using static and dynamic communication requirements of the application. Our study shows that the application specific routing algorithm not only provides much higher adaptivity, but also superior performance as compared to the other algorithm in all traffic cases. But this higher performance for the second algorithm comes at a higher area cost for implementing network routers. 相似文献

2.

一种IBA规则网络的路由算法及其网络模拟 总被引：1，自引：0，他引：1

金红曹志强许勇《计算机工程与应用》2005,41(3):162-166,173

InfiniBand协议越来越得到网络互连界的认可。它定义了一种自由的网络拓扑。目前多数场合使用的是不规则IBA网络,采用通用的up/down路由算法;但是将up/down算法直接用于IBA网络时,需要以损失网络性能为代价的路径修正才能避免网络死锁[1,2]。为了满足用户的特殊需求,保证网络的高带宽、低延迟,构造了基于4元N树的IBA规则网络拓扑,给出其单播和多播路由算法,并建立一个较为完整的IBA系统模型,用于模拟网络的可行性以及算法的正确性。相似文献

3.

Incremental design of scalable interconnection networks using basic building blocks

Yang M. Ni L.M. 《Parallel and Distributed Systems, IEEE Transactions on》2000,11(11):1126-1140

In this paper, we present an incremental design of scalable interconnection networks in multicomputer systems using basic building blocks. Both network topologies and routing algorithms are considered. We use wormhole-routed small-scale 2D meshes as basic building blocks. The minimum requirement to expand these networks is a single building block. This implies that the network does not have to maintain the regular 2D mesh topology. Some new topologies are introduced: incomplete meshes based on those adaptive routing algorithms designed from the turn model and extended incomplete meshes based on XY routing. We show that the original routing algorithm can be adopted to send a message between any source and destination without using store-and-forward and causing deadlock. The way that the network is constructed incrementally requires no or a very small amount of rewiring and keeps high bisection density and short diameter of the network. The design methods can be used to economically and incrementally build expandable and scalable parallel computers. 相似文献

4.

An effective methodology to improve the performance of the up*/down* routing algorithm

Sancho J.C. Robles A. Duato J. 《Parallel and Distributed Systems, IEEE Transactions on》2004,15(8):740-754

Networks of workstations (NOWs) are being considered as a cost-effective alternative to parallel computers. Most NOWs are arranged as a switch-based network and provide mechanisms for discovering the network topology. Hence, they provide support for both regular and irregular topologies, which makes routing and deadlock avoidance quite complicated. Current proposals use the up*/down* routing algorithm to remove cyclic dependencies between channels and avoid deadlock. However, routing is considerably restricted and most messages must follow nonminimal paths, increasing latency and wasting resources. We propose and evaluate a simple and effective methodology to compute up*/down* routing tables. The new methodology is based on computing a depth-first search (DPS) spanning tree on the network graph that decreases the number of routing restrictions with respect to the breadth-first search (BFS) spanning tree used by the traditional methodology. Additionally, we propose different heuristic rules for computing the spanning trees to improve the efficiency of up*/down* routing. Evaluation results for several different topologies show that computing the up*/down* routing tables by using the new methodology increases throughput by a factor of up to 2.48 in large networks with respect to the traditional methodology, and also reduces latency significantly. 相似文献

5.

A protocol for deadlock-free dynamic reconfiguration in high-speedlocal area networks

Casado R. Bermudez A. Duato J. Quiles F.J. Sanchez J.L. 《Parallel and Distributed Systems, IEEE Transactions on》2001,12(2):115-132

High-speed local area networks (LANs) consist of a set of switches interconnected by point-to-point links, and hosts linked to those switches through a network interface card. High-speed LANs may change their topology due to switches being turned on/off, hot expansion, link remapping, and component failures. In these cases, a distributed reconfiguration protocol analyzes the topology, computes the new routing tables, and downloads them to the corresponding switches. Unfortunately, in most cases, user traffic is stopped during the reconfiguration process to avoid deadlock. These strategies are called static reconfiguration techniques. Although network reconfigurations are not frequent, static reconfiguration such as this may take hundreds of milliseconds to execute, thus degrading system availability significantly. Several distributed real-time applications have strict communication requirements; Distributed multimedia applications have similar, although less strict, quality of service (QoS) requirements. Both stopping packet transmission and discarding packets due to the reconfiguration process prevent the system from satisfying the above requirements. Therefore, in order to support hard real-time and distributed multimedia applications over a high-speed LAN, we need to avoid stopping user traffic and discarding packets when the topology changes. In this paper, we propose a new deadlock-free distributed reconfiguration protocol that is able to asynchronously update routing tables without stopping user traffic. This protocol is valid for any topology, including regular as well as irregular topologies. It is also valid for packet switching as well as for cut-through switching techniques and does not rely on the existence of virtual channels to work. Simulation results show that the behavior of our protocol is significantly better than for other protocols based on stopping user traffic 相似文献

6.

基于动态规划的无线传感器网络的路由算法 总被引：4，自引：2，他引：4

杨文国郭田德赵彤《计算机研究与发展》2007,44(5):890-897

路由问题是无线传感器网络中的核心问题之一,其数据传送的多跳特点使得非常适合用动态规划的原理来设计传感器网络的路由算法.基于动态规划,通过节点跳数生成算法为传感器网络中的每个节点赋一个表示到Sink点跳数的节点跳数值,并分析了传感器网络的拓扑结构特点,然后给出了无线传感器网络中寻找从源到汇满足不同设计目标的最小跳数(MinH)、最小跳数最大剩余能量(MinHMaxRE)和最小跳数最小费用(MinHMinC)3种路由算法.探讨了最小跳数最小费用路由与最小费用路由之间的关系,并给出了判断最小跳数最小费用路径就是最小费用路径的一个充要条件.算法的能量消耗分析表明,所给路由算法能实现大幅度的能量节省. 相似文献

7.

Adaptive inter-layer message routing in 3D networks-on-chip

Claudia RusuAuthor Vitae Lorena AnghelAuthor Vitae 《Microprocessors and Microsystems》2011,35(7):613-631

Existing routing algorithms for 3D deal with regular mesh/torus 3D topologies. Today 3D NoCs are quite irregular, especially those with heterogeneous layers. In this paper, we present a routing algorithm targeting 3D networks-on-chip (NoCs) with incomplete sets of vertical links between adjacent layers. The routing algorithm tolerates multiple link and node failures, in the case of absence of NoC partitioning. In addition, it deals with congestion. The routing algorithm for 3D NoCs preserves the deadlock-free propriety of the chosen 2D routing algorithms. It is also scalable and supports a local reconfiguration that complements the reconfiguration of the 2D routing algorithms in case of failures of nodes or links. The algorithm incurs a small overhead in terms of exchanged messages for reconfiguration and does not introduce significant additional complexity in the routers. Theoretical analysis of the 3D routing algorithm is provided and validated by simulations for different traffic loads and failure rates. 相似文献

8.

A framework for designing deadlock-free wormhole routing algorithms

Boppana R.V. Chalasani S. 《Parallel and Distributed Systems, IEEE Transactions on》1996,7(2):169-183

This paper presents a framework to design fully-adaptive, deadlock-free wormhole algorithms for a variety of network topologies. The main theoretical contributions are: (a) design of new wormhole algorithms using store-and-forward algorithms, (b) a sufficient condition for deadlock free routing by the wormhole algorithms so designed, and (c) a sufficient condition for deadlock free routing by these wormhole algorithms with centralized flit buffers shared among multiple channels. To illustrate the theory, several wormhole algorithms based on store-and-forward hop schemes are designed. The hop-based wormhole algorithms can be applied to a variety of networks including torus, mesh, de Brujin, and a class of Cayley networks, with the best known bounds on virtual channels for minimal routing on the last two classes of networks. An analysis of the resource requirements and performances of a proposed algorithm, called negative-hop algorithm, with some of the previously proposed algorithms for torus and mesh networks is presented 相似文献

9.

On an efficient NoC multicasting scheme in support of multiple applications running on irregular sub-networks

Xiaohang WangAuthor Vitae Mei YangAuthor Vitae 《Microprocessors and Microsystems》2011,35(2):119-129

When a number of applications simultaneously running on a many-core chip multiprocessor (CMP) chip connected through network-on-chip (NoC), significant amount of on-chip traffic is one-to-many (multicast) in nature. As a matter of fact, when multiple applications are mapped onto an NoC architecture with applicable traffic isolation constraints, the corresponding sub-networks of these applications are mapped onto actually tend to be irregular. In the literature, multicasting for irregular topologies is supported through either multiple unicasting or broadcasting, which, unfortunately, results in overly high power consumption and/or long network latency. To address this problem, a simple, yet efficient hardware-based multicasting scheme is proposed in this paper. First, an irregular oriented multicast strategy is proposed. Literally, following this strategy, an irregular oriented multicast routing algorithm can be designed based on any regular mesh based multicast routing algorithm. One such algorithm, namely, Alternative Recursive Partitioning Multicasting (AL + RPM), is proposed based on RPM, which was designed for regular mesh topology originally. The basic idea of AL + RPM is to find the output directions following the basic RPM algorithm and then decide to replicate the packets to the original output directions or the alternative (AL) output directions based on the shape of the sub-network. The experiment results show that the proposed multicast AL + RPM algorithm can consume, on average, 14% and 20% less power than bLBDR (a broadcasting-based routing algorithm) and the multiple unicast scheme, respectively. In addition, AL + RPM has much lower network latency than the above two approaches. To incorporate AL + RPM into a baseline router to support multicasting, the area overhead is fairly modest, less than 5.5%. 相似文献

10.

An improved topology discovery algorithm for networks with wormhole routing and directed links

《Computer Networks》1999,31(1-2):79-88

We propose a new parallel topology discovery algorithm for irregular, mesh-connected networks with unidirectional links and wormhole routing. An algorithm of this type was developed for the ATOMIC high speed local area network to avoid the need for manually updating routing tables. Similar needs may arise in wireless networks where channels may be unidirectional because of limited transmission power, multipath, and similar effects. Like the ATOMIC topology discovery algorithm, our algorithm accumulates a map of the network at a distinguished node called the Address Consultant. However, our algorithm is much faster. In addition, our algorithm is more general, because it can correctly resolve topologies that contain multiple links between the same nodes. We implemented both algorithms in a concurrent simulation environment, and tested them on a variety of topologies. 相似文献

11.

基于图卷积神经网络的智能路由算法

唐鑫徐彦彦潘少明《计算机工程》2022,48(3):38-45

使用特定数学模型的路由转发算法难以满足用户多样化的服务质量需求,基于深度学习的智能路由方案因具有准确性、高效性、通用性等优势,成为路由决策的发展方向。然而,目前多数智能路由算法在网络拓扑动态变化时需要重新训练,造成路由更新不及时,难以应对网络拓扑动态变化。提出一种基于图卷积神经网络（GCN）的智能路由算法。线下利用提前采集的网络信息,根据路由开销标签训练GCN智能路由模型,通过该模型输出单跳路由开销。线上采集实时信息并根据模型输出的路由开销结果对网络层路由协议进行调整,计算最小路由开销的路由路径,实现自适应网络更新。算法利用GCN的图数据结构处理不规则的网络拓扑,通过图卷积算子自动提取特征解决路由网络多属性参数提取的问题,同时引入模糊C均值算法进行网络状态离散化分析,为数据集生成标签,从而有效监督GCN模型训练。实验结果表明,该算法较ECMP、DRL-TE和SmartRoute算法路由性能更好,其平均丢包率、时延和吞吐量指标均为最优,且相较于单一的流量模式具有更强的泛化能力。相似文献

12.

On basic properties of fault-tolerant multi-topology routing 总被引：1，自引：0，他引：1

Tarik 《Computer Networks》2008,52(18):3325-3341

Multi-topology routing has recently gained popularity as a simple yet efficient traffic engineering concept. Its basic purpose is to separate different classes of network traffic, which are then transported over disjoint logical topologies. Multi-topology routing is used as a basis for implementation of an IP fast reroute scheme called Multiple Routing Configurations (MRC).MRC has a range of attractive properties, but they do come at a cost. In order to guarantee recovery from any single link or node failure in the network, MRC has to maintain several logical topologies and thus an increased amount of routing information. The number of the logical topologies in MRC need not be large; even simple heuristic algorithms often yield good results in practice. However, why this is the case is not fully understood yet.In this paper, we introduce a theoretical framework for fault-tolerant multi-topology routing (FT-MTR). MRC is a practical implementation of FT-MTR in connectionless IP networks. We use FT-MTR to study how the internal topological structure of the communication network relates to two important problems. The first problem is minimizing the number of logical topologies and thus the routing state in FT-MTR. We show how to use the sets of nodes that separate the topology graph to devise an advanced heuristic for “intelligent” construction of the logical topologies. Finding the separating sets in a topology graph is computationally demanding; we present an algorithm that performs well in tested real network topologies. We evaluate the separation-set based heuristic for the logical topology construction and show that it outperforms the known MRC heuristics.The second problem is the FT-MTR load distribution after a failure. We use the separating sets to devise a novel algorithm for failure load distribution. This algorithm does not require knowledge of the traffic demand matrix, still, our tests indicate that it performs as good as, or better than, known MRC load-distribution algorithms that do require the demand matrix as input. 相似文献

13.

PDR: A protocol for dynamic network reconfiguration based on deadlock recovery scheme

《Simulation Modelling Practice and Theory》2012

Dynamic network reconfiguration is described as the process of replacing one routing function with another while the network keeps running. The main challenge is avoiding deadlock anomalies while keeping limitations on packet injection and forwarding minimal. Current approaches which have a high complexity and as a result have a limited practical applicability either require the existence of extra network resources, or they will affect the network performance during the reconfiguration process. In this paper we present a simple, fast and efficient mechanism for dynamic network reconfiguration which is based on regressive deadlock recovery instead of avoiding deadlock. The mechanism which is referred to as PDR guarantees a deadlock-free reconfiguration based on wormhole switching. In PDR, a particular approach is taken to handle both deadlocks and performance degradation. We propose the use of a packet injection restriction mechanism that prevents performance degradation near the saturation by controlling the network traffic. Further, in this approach, to accurately detect deadlocks, the deadlock detection mechanism is implemented and also improved by using only the local information, thereby considerably reducing false deadlock detections. In the rare cases when deadlocks are suspected, we propose a new technique that absorbs the deadlocked packet at the current node instead of dropping deadlocked packets and re-injects it later into the network. The main advantage of this method is its simplicity and also it does not require any additional buffers in intermediate nodes to handle deadlocks. It requires only some buffer space in the local node to temporarily hold the deadlocked packets removed from the network. Evaluating results reveal that the mechanism shows substantial performance improvements over the other methods and it works efficiently in different topologies with various routing algorithms. 相似文献

14.

Logic-Based Distributed Routing for NoCs

《Computer Architecture Letters》2008,7(1):13-16

The design of scalable and reliable interconnection networks for multicore chips (NoCs) introduces new design constraints like power consumption, area, and ultra low latencies. Although 2D meshes are usually proposed for NoCs, heterogeneous cores, manufacturing defects, hard failures, and chip virtualization may lead to irregular topologies. In this context, efficient routing becomes a challenge. Although switches can be easily configured to support most routing algorithms and topologies by using routing tables, this solution does not scale in terms of latency and area. We propose a new circuit that removes the need for using routing tables. The new mechanism, referred to as Logic-Based Distributed Routing (LBDR), enables the implementation in NoCs of many routing algorithms for most of the practical topologies we might find in the near future in a multicore chip. From an initial topology and routing algorithm, a set of three bits per switch output port is computed. By using a small logic block, LBDR mimics (demonstrated by evaluation) the behavior of routing algorithms implemented with routing tables. This result is achieved both in regular and irregular topologies. Therefore, LBDR removes the need for using routing tables for distributed routing, thus enabling flexible, fast and power-efficient routing in NoCs. 相似文献

15.

Bandwidth Efficient All-to-All Broadcast on Switched Clusters

Ahmad Faraj Pitch Patarasuk Xin Yuan 《International journal of parallel programming》2008,36(4):426-453

Clusters of workstations employ flexible topologies: regular, irregular, and hierarchical topologies have been used in such systems. The flexibility poses challenges for developing efficient collective communication algorithms since the network topology can potentially have a strong impact on the communication performance. In this paper, we consider the all-to-all broadcast operation on clusters with cut-through and store-and-forward switches. We show that near-optimal all-to-all broadcast on a cluster with any topology can be achieved by only using the links in a spanning tree of the topology when the message size is sufficiently large. The result implies that increasing network connectivity beyond the minimum tree connectivity does not improve the performance of the all-to-all broadcast operation when the most efficient topology specific algorithm is used. All-to-all broadcast algorithms that achieve near-optimal performance are developed for clusters with cut-through and clusters with store-and-forward switches. We evaluate the algorithms through experiments and simulations. The empirical results confirm our theoretical finding. 相似文献

16.

Characterization of Deadlocks in Irregular Networks

《Journal of Parallel and Distributed Computing》2002,62(1):61-84

This work characterizes how various network parameters influence message blocking and deadlocks in irregular networks. Information on blocking behavior is provided that is useful in making design trade-offs between restricting routing freedom and allowing the possibility for deadlocks to form in irregular networks. This paper also identifies ways in which a network's susceptibility to deadlock can be reduced and provides guidelines for designing irregular networks that maximize routing flexibility and resource utilization. Finally, a new empirical evaluation methodology for classifying irregular topologies and relating network behavior to various classes of network topologies is introduced. 相似文献

17.

具有原路返回特征的改进OSRM胖树路由算法研究

曹继军郑义王克非肖立权《计算机工程与科学》2014,36(6):997-1004

胖树是最重要的互连网络拓扑结构之一。针对胖树拓扑结构,已经提出了多种路由算法,其中OSRM被证明是一种最优化的路由算法,但是所有算法都忽略了网络链路故障的易诊断性。为此,提出一种对OSRM改进的新型路由算法BT-OSRM。该算法定义了节点间的大小关系并通过比较节点大小而从OSRM路由路径与其反向路径中选择路由路径。此外,还针对常用的2级和3级胖树结构,分别详细给出了BT-OSRM2和BT-OSRM3路由算法。理论分析表明,BT OSRM路由算法不但继承了OSRM路由算法无死锁、负载均衡和性能最优等优点,而且保证了任意两节点间的路由路径具有原路返回特性,从而提高了网络故障链路的易诊断性。相似文献

18.

A general theory for deadlock-free adaptive routing using a mixedset of resources

Duato J. Pinkston T.M. 《Parallel and Distributed Systems, IEEE Transactions on》2001,12(12):1219-1235

This paper presents a theoretical framework for the design of deadlock-free fully adaptive routing algorithms for a general class of network topologies and switching techniques in a single, unified theory. A general theory is proposed that allows the design of deadlock avoidance-based as well as deadlock recovery-based wormhole and virtual cut-through adaptive routing algorithms that use a homogeneous or a heterogeneous (mixed) set of resources. The theory also allows channel queues to be allocated nonatomically, utilizing resources efficiently. A general methodology for the design of fully adaptive routing algorithms applicable to arbitrary network topologies is also proposed. The proposed theory and methodology allow the design of efficient network routers that require minimal resources for handling infrequent deadlocks 相似文献

19.

High-performance routing in networks of workstations with irregulartopology

Silla F. Duato J. 《Parallel and Distributed Systems, IEEE Transactions on》2000,11(7):699-719

Networks of workstations are rapidly emerging as a cost-effective alternative to parallel computers. Switch-based interconnects with irregular topology allow the wiring flexibility, scalability, and incremental expansion capability required in this environment. However, the irregularity also makes routing and deadlock avoidance on such systems quite complicated. In current proposals, many messages are routed following nonminimal paths, increasing latency and wasting resources. In this paper, we propose two general methodologies for the design of adaptive routing algorithms for networks with irregular topology. Routing algorithms designed according to these methodologies allow messages to follow minimal paths in most cases, reducing message latency and increasing network throughput. As an example of application, we propose two adaptive routing algorithms for ANI (previously known as Autonet). They can be implemented either by duplicating physical channels or by splitting each physical channel into two virtual channels. In the former case, the implementation does not require a new switch design. It only requires changing the routing tables and adding links in parallel with existing ones, taking advantage of spare switch ports. In the latter case, a new switch design is required, but the network topology is not changed. Evaluation results for several different tapologies and message distributions show that the new routing algorithms are able to increase throughput for random traffic by a factor of up to 4 with respect to the original up*/down* algorithm, also reducing latency significantly. For other message distributions, throughput is increased more than seven times. We also show that most of the improvement comes from the use of minimal routing 相似文献

20.

Adaptive-trail routing and performance evaluation in irregularnetworks using cut-through switches

Wenjian Qiao Ni L.M. Rokicki T. 《Parallel and Distributed Systems, IEEE Transactions on》1999,10(11):1138-1158

Cut-through switching promises low latency delivery and has been used in new generation switches, especially in high speed networks demanding low communication latency. The interconnection of cut-through switches provides an excellent network platform for high speed local area networks (LANs). For cost and performance reasons. Irregular topologies should be supported in such a switch-based network. Switched irregular networks are truly incrementally scalable and have potential to be reconfigured to adapt to the dynamics of network traffic conditions. Due to the arbitrary topologies of networks, it is critical to develop an efficient deadlock-free routing algorithm. A novel deadlock-free adaptive routing algorithm called adaptive-trail routing is proposed to allow irregular interconnection of cut-through switches. The adaptive routing algorithm is based on two unidirectional adaptive trails constructed from two opposite unidirectional Eulerian trails. Some heuristics are suggested in terms of the selection of Eulerian trails, the avoidance of long routing paths, and the degree of adaptivity. Extensive simulation experiments are conducted to evaluate the performance of the proposed and two other routing algorithms under different topologies and traffic workloads 相似文献