Similar literature
 18 similar documents found (search time: 156 ms)
1.
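In high-performance computer systems built on a high-dimensional fat-tree topology, the failure of a leaf switch seriously disrupts system use. To improve availability and maintainability, a deterministic fault-tolerant routing strategy for high-dimensional fat trees is proposed, based on the idea of misrouting. The basic approach is to misroute around the failed leaf switch, jump to another leaf switch in the same dimension, and then reach the destination node through normal routing. The strategy masks a failed leaf switch without disrupting system use, and fault-tolerance experiments were carried out on a real high-dimensional fat-tree system. The results show that the strategy achieves the expected effect of quickly masking failed leaf switches and effectively improves the efficiency of system maintenance.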

2.
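Distributed dynamic fault-tolerant routing in fat trees   Total citations: 1, self-citations: 0, citations by others: 1
Very-large-scale interconnection networks for cloud computing place greater demands on network fault tolerance, which has become a key issue in interconnection networks. To guarantee high availability and high performance, this paper proposes a distributed dynamic fault-tolerant routing method for the fat-tree topology. The method achieves distributed dynamic fault tolerance by introducing a mechanism for propagating link-failure messages together with a dynamic fault-tolerant routing algorithm that uses the link-failure information. Compared with existing methods, it adds no network hardware, does not lengthen routing paths, and offers high execution efficiency and high performance. Experimental results show that, in a fat tree built from m-port switches, the method tolerates any m/2-1 failed links and, with high probability, combinations of more failed links, while maintaining high network performance.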

3.
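A fault-tolerant routing algorithm for 2D meshes is proposed that uses only two virtual channels, fewer than the four required by the Boppana algorithm and the three required by the Duan algorithm. The algorithm is based on the block fault model, in which a faulty block may be an f-ring or an f-chain. In the absence of faults, messages are routed along shortest paths; when a message is blocked by a faulty block, a detour strategy is used. Proofs of deadlock freedom are given for both non-overlapping and overlapping fault regions.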

4.
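Route generation is one of the key steps in building a source-routed fat-tree interconnection network. To address the route-generation problem for source-routed fat-tree networks, an object-oriented approach is adopted: a topology model of the fat-tree network is first established and a segmented routing method is proposed; algorithms for route generation, correctness verification, and path query are then studied; finally, the design and implementation of software for route generation, verification, and query are discussed. The software has already been applied successfully to route generation and faulty-path diagnosis in several source-routed fat-tree networks.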

5.
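As chip complexity keeps growing, designing an efficient fault-tolerant routing algorithm for networks-on-chip (NoC) faces great challenges. Because of chip area constraints, NoC routers without virtual channels, which have low area overhead, have attracted wide attention in academia. However, existing research on fault-tolerant routing for NoCs without virtual channels has focused on fault-tolerance capability alone, ignoring problems such as unbalanced load and high average packet latency caused by overly uniform routing paths. Building on the existing odd-even turn fault-tolerant routing algorithm, this paper optimizes the fault model and the fault-detour strategy and incorporates a load-balancing strategy, forming a new fault-tolerant algorithm that alleviates these problems. Simulations of the new algorithm and the reference algorithm in a 9×9 2D mesh network show that the new algorithm has a clear advantage in reducing packet latency and increasing throughput: in the best case it reduces latency by 8.92% and increases throughput by 10.46%.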

6.
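Link and node failures can leave many nodes in a network unable to communicate with each other, so fault tolerance is an important issue in NoC system design. Based on a new NoC topology, PRDT(2,1), a PRDT(2,1) fault-tolerant routing algorithm and a corresponding node-deactivation algorithm are proposed. The node-deactivation algorithm constructs rectangular fault regions by deactivating a small number of fault-free nodes. The PRDT(2,1) fault-tolerant routing algorithm uses only the minimum number of virtual channels and provides enough adaptivity to achieve deadlock-free fault-tolerant routing. As long as the fault regions do not disconnect the network, the algorithm guarantees routing connectivity. Simulations in PRDT(2,1) networks with different fault rates show that the algorithm degrades gracefully.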

7.
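Hu Zhekun, Yang Shengchun, Chen Jie. Journal of Computer Applications (《计算机应用》), 2016, 36(5): 1201-1205
To reduce routing-table size and avoid using many virtual channels (VCs), and thereby lower hardware resource usage, a region-based fault-tolerant routing (RFTR) algorithm is proposed for wormhole-switched 2D mesh networks-on-chip. According to the positions of faulty nodes and links, the algorithm divides the 2D mesh into several connected rectangular regions. Within a rectangular region, packets may be routed with a deterministic or adaptive routing algorithm; between regions, the routing path is determined by the up*/down* algorithm. In addition, using the channel dependency graph (CDG) model, it is proved that the algorithm needs only two virtual channels to avoid deadlock. In a 6×6 mesh network, RFTR reduces routing-table resource usage by 25%. Simulation results show that, with the same queue buffer resources, RFTR achieves performance comparable to or better than that of the up*/down* and segment algorithms.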
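The up*/down* rule that the abstract relies on for inter-region routing can be illustrated with a small legality check. This is a minimal sketch of the generic up*/down* rule only, not the RFTR algorithm itself; the helper names `bfs_levels` and `is_legal_updown`, the adjacency-dictionary representation, and the tie-break by node ID are assumptions made for illustration.

```python
from collections import deque

def bfs_levels(adj, root):
    """Assign each node its BFS depth from the chosen root."""
    level = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in level:
                level[v] = level[u] + 1
                queue.append(v)
    return level

def is_legal_updown(path, level):
    """A path is legal if it never takes an 'up' link after a 'down' link.

    A hop u -> v counts as 'up' when v is closer to the root than u,
    with ties broken by the smaller node ID (an illustrative convention).
    """
    gone_down = False
    for u, v in zip(path, path[1:]):
        up = (level[v], v) < (level[u], u)
        if up and gone_down:
            return False
        if not up:
            gone_down = True
    return True

# Hypothetical 4-node ring 0-1-2-3-0, rooted at node 0.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
levels = bfs_levels(ring, 0)
print(is_legal_updown([1, 0, 3], levels))  # up then down
print(is_legal_updown([1, 2, 3], levels))  # down then up
```

Forbidding down-to-up turns removes cycles from the channel dependency graph, which is why up*/down* is deadlock-free on arbitrary connected topologies.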

8.
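The fat tree is one of the most important interconnection network topologies. Many routing algorithms have been proposed for the fat-tree topology, among which OSRM has been proved to be an optimal routing algorithm, but all of them ignore how easily network link faults can be diagnosed. To address this, a new routing algorithm, BT-OSRM, is proposed as an improvement on OSRM. The algorithm defines an ordering relation between nodes and, by comparing nodes under this ordering, selects the routing path from the OSRM path and its reverse path. In addition, the BT-OSRM2 and BT-OSRM3 routing algorithms are given in detail for the commonly used 2-level and 3-level fat trees, respectively. Theoretical analysis shows that BT-OSRM not only inherits the advantages of OSRM, namely deadlock freedom, load balance, and optimal performance, but also guarantees that the routing paths between any two nodes retrace each other in opposite directions, which makes faulty network links easier to diagnose.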

9.
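A low-cost adaptive reliable routing method is proposed for irregular 3D networks-on-chip with permanent link faults. First, according to the topology of the irregular 3D NoC, a Hamiltonian path is preferentially selected for fault-tolerant routing; when no Hamiltonian path exists, a spanning-tree fault-tolerant routing algorithm is executed to bypass the faulty links. Then, a port-selection mechanism based on dynamic programming is extended to three dimensions and combined with the routing algorithm above to avoid congested regions of the network, completing the routing of packets from the source router node to the destination router node. Experimental results show that, compared with the earlier AFRA method and a spanning-tree-based reliable routing method, the proposed method delivers higher communication performance and reliability while requiring lower network overhead.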

10.
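Yi Yi, Fan Jianxi, Wang Yan, Liu Zhao, Dong Hui. Computer Science (《计算机科学》), 2021, 48(6): 253-260
BCube is a data center network with good performance. Compared with traditional tree-based data center networks, BCube shows great advantages in scalability and fault tolerance. At present, research on BCube can be reduced to research on its logical graph BC_{n,k} (a special case of the generalized hypercube), in which switches are treated as transparent devices. In practice, as network size keeps growing, vertex failures have become the norm, so studying fault-tolerant routing in the network is highly worthwhile. There is a considerable body of work on fault-tolerant routing in BC_{n,k}, but fault-tolerant routing under its 2-restricted connectivity has not yet been studied. Before presenting the fault-tolerant routing algorithm, it is first proved that the 2-restricted connectivity of BC_{n,k} is 3(k+1)(n-1)-2n, where k≥3 and n≥3. On this basis, a fault-tolerant routing algorithm with time complexity O(κ(BC_{n,k})^3) is proposed, where κ(BC_{n,k})=(k+1)(n-1) is the connectivity of BC_{n,k}. When the number of faulty vertices is less than 3(k+1)(n-1)-2n and every fault-free vertex has at least two fault-free neighbors, the algorithm finds a fault-free path between any two distinct fault-free vertices.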
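The two bounds quoted in this abstract are easy to evaluate for concrete parameters. The sketch below simply computes the stated formulas, κ(BC_{n,k}) = (k+1)(n-1) and 3(k+1)(n-1)-2n; the function names are hypothetical, and the range check only mirrors the condition k≥3, n≥3 under which the abstract states the second result.

```python
def bcube_connectivity(n: int, k: int) -> int:
    """Connectivity of BC_{n,k} as given in the abstract: (k+1)(n-1)."""
    return (k + 1) * (n - 1)

def bcube_2_restricted_connectivity(n: int, k: int) -> int:
    """2-restricted connectivity from the abstract: 3(k+1)(n-1) - 2n."""
    if n < 3 or k < 3:
        # The abstract states the result only for k >= 3 and n >= 3.
        raise ValueError("result is stated only for n >= 3 and k >= 3")
    return 3 * (k + 1) * (n - 1) - 2 * n

n, k = 4, 3
print(bcube_connectivity(n, k))               # 12
print(bcube_2_restricted_connectivity(n, k))  # 28
```

For n = 4 and k = 3, ordinary connectivity allows for up to 11 faulty vertices, whereas the 2-restricted condition (every fault-free vertex keeps at least two fault-free neighbors) raises the tolerable count to 27.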

11.
Fault tolerance in the interconnection network of large clusters of PCs is an issue of growing importance, since their increasing size also increases the failure probability. The fat-tree topology is usually used in these machines since it has become very popular among high-speed interconnect manufacturers. This paper proposes a new distributed fault-tolerant routing methodology for fat trees. Unlike previous proposals, it does not require additional network hardware, and its memory requirements, switch hardware, and routing delay scale with the network size. Indeed, it nullifies only the strictly necessary paths, allowing adaptive routing through the healthy paths. The methodology is based on enhancing the interval routing scheme with exclusion intervals. Exclusion intervals are associated with each switch output port and represent the nodes that are unreachable from this port after a fault. We propose a methodology to identify the links where the exclusion intervals must be updated after a fault, the values to write on them, and a very efficient mechanism to distribute the required information through the network without stopping the system activity. Our methodology can tolerate a high number of network failures with a low degradation in performance. Moreover, it can achieve zero packet loss during the updating period.
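The idea of an exclusion interval attached to an output port can be sketched in a few lines. This is an illustrative model only: the `Port` class, its field names, and the use of plain inclusive integer ranges are assumptions, and the abstract does not specify how the switches actually store or update these intervals.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class Port:
    # Inclusive range of destination IDs normally reachable through this port.
    reachable: Tuple[int, int]
    # Inclusive ranges of destinations that became unreachable after a fault.
    exclusions: List[Tuple[int, int]] = field(default_factory=list)

    def can_reach(self, dest: int) -> bool:
        lo, hi = self.reachable
        if not lo <= dest <= hi:
            return False
        return not any(a <= dest <= b for a, b in self.exclusions)

def usable_ports(ports: List[Port], dest: int) -> List[int]:
    """Indices of ports an adaptive router may still choose for dest."""
    return [i for i, p in enumerate(ports) if p.can_reach(dest)]

# Two hypothetical up-ports; a fault makes nodes 4..7 unreachable via port 1.
ports = [Port((0, 15)), Port((0, 15), [(4, 7)])]
print(usable_ports(ports, 5))  # only port 0 remains
print(usable_ports(ports, 9))  # both ports remain
```

Because only the affected destinations are excluded, the remaining ports stay available for adaptive routing, which matches the abstract's claim that only the strictly necessary paths are nullified.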

12.
High-speed interconnection networks are essential elements for different high-performance parallel-computing systems. One of the most common interconnection network topologies is the fat-tree, whose advantages have turned it into the favorite topology of many interconnect designers. One of these advantages is the possibility of using simple but efficient routing algorithms, like the recently proposed deterministic routing algorithm referred to as DET, which offers similar (or better) performance than Adaptive Routing while reducing complexity and guaranteeing in-order packet delivery. However, like other deterministic routing proposals, DET cannot react when packets intensely contend for network resources, leading to the appearance of Head-of-Line (HoL) blocking, which degrades network performance. In this paper, we describe and evaluate a simple queue scheme that efficiently reduces HoL-blocking in fat-trees using the DET routing algorithm, without significantly increasing switch complexity and required silicon area. Additionally, we propose an implementation of OBQA in a feasible switch architecture.

13.
Existing solutions for fault-tolerant routing in interconnection networks either work for only one given regular topology, or require slow and costly network reconfigurations that do not allow full and continuous network access. In this paper, we present FRoots, a routing method for fault tolerance in topology-flexible network technologies. Our method is based on redundant paths, and can handle single dynamic faults without sending control messages other than those that are needed to inform the source nodes of the failing component. Used in a mode with local rerouting, the source nodes need not be informed and no control messages are necessary for the network to stay connected despite a single fault. In fault-free networks under nonuniform traffic, our routing method performs comparably to, or even better than, topology-specific routing algorithms in regular networks like meshes and tori. FRoots does not require any features in the switches or end nodes other than a flexible routing table and a modest number of virtual channels. For that reason, it can be directly applied to several present-day technologies like InfiniBand and Advanced Switching.

14.
This paper presents fault-tolerant protocols for fast packet switch networks with convergence routing. The objective is to provide fast reconfiguration and continuous host-to-host communication after a link or a node (switch) failure. Convergence routing can be viewed as a variant of deflection routing, which combines, in a dynamic fashion, the on-line routing decision with the traffic load inside the network. Unlike other deflection techniques, convergence routing operates with global sense of direction and guarantees that packets will reach or converge to their destinations. Global sense of direction is achieved by embedding virtual rings to obtain a linear ordering of the nodes. We consider virtual ring embeddings over (i) a single spanning tree, and (ii) two edge-disjoint spanning trees. Thus, the fault-tolerant solution is based on spanning trees and designed for a switch-based (i.e., arbitrary topology) architecture called MetaNet. In this work, the original MetaNet convergence routing scheme has been modified so that the packet header need not be recomputed after a failure and/or a reconfiguration. This is achieved by having, at the network interface, a translator that maps the unique destination address to a virtual address. It is argued that virtual rings embedded over two edge-disjoint spanning trees increase the fault tolerance for both node and link faults and provide continuous host-to-host communication.

15.
In wormhole meshes, a reliable routing algorithm is supposed to be deadlock-free and fault-tolerant. Many routing algorithms are able to tolerate a large number of faults enclosed by rectangular blocks or special convex regions; none of them, however, is capable of handling two convex fault regions at distance two by using only two virtual networks. In this paper, a fault-tolerant wormhole routing algorithm is presented that tolerates disjoint convex faulty regions at distance two or more, which do not contain any nonfaulty nodes and do not prohibit any routing as long as the nodes outside the faulty regions are connected in the mesh network. Overlapping of processors along the boundaries of different fault regions is allowed. The proposed algorithm, which routes messages by the X-Y routing algorithm in fault-free regions, can tolerate convex fault-connected regions with only two virtual channels per physical channel, and is deadlock- and livelock-free. The proposed algorithm can be easily extended to adaptive routing.
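The X-Y routing used in fault-free regions is the standard dimension-order scheme: a message first travels along the X dimension until its column matches the destination, then along Y. The sketch below shows only this fault-free baseline, not the paper's fault-tolerant extension; the function name `xy_route` and the `(x, y)` tuple coordinates are assumptions for illustration.

```python
def xy_route(src, dst):
    """Return the nodes visited by dimension-order (X-Y) routing in a 2D mesh."""
    x, y = src
    path = [(x, y)]
    # Phase 1: correct the X coordinate one hop at a time.
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    # Phase 2: correct the Y coordinate one hop at a time.
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path

print(xy_route((0, 0), (2, 1)))  # [(0, 0), (1, 0), (2, 0), (2, 1)]
```

Since a message never turns from the Y dimension back into X, no cyclic channel dependency can form in a fault-free mesh, which is why X-Y routing is deadlock-free without virtual channels.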

16.
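In large-scale parallel systems, the design of the system-level interconnection network is critical. InfiniBand, a high-performance switched network, is widely used in massively parallel processing systems. Compared with the fat-tree topology commonly used in InfiniBand networks today, mesh/torus topologies offer better performance and scalability. Nevertheless, research has found that building an InfiniBand interconnection network with a traditional mesh/torus topology raises many problems. The shortcomings of the traditional network topology are analyzed, and a multi-link mesh/torus interconnection network based on InfiniBand is proposed. By making full use of multiple links between switches, this improved topology obtains higher bandwidth than a traditional mesh/torus network. An efficient routing algorithm matched to the topology is also given. Finally, the proposed algorithm is evaluated by network simulation, and the experimental results show that it offers better performance and scalability than other routing algorithms.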

17.
This paper presents the new Flexible Hypercube architecture. The Flexible Hypercube is a fault-tolerant network topology that can be constructed for an arbitrary number of nodes and is incrementally expandable. This topology maintains the strong features of the Hypercube while overcoming deficiencies in expandability. It is shown to have strong node connectivity, a small diameter, and to be tolerant of faults. The Flexible Hypercube is a suitable architecture for the design of both tightly coupled parallel systems and distributed systems. Efficient fault-free and fault-tolerant algorithms for message routing and broadcasting are presented for the architecture. The fault-free algorithm guarantees successful routing in O(log N) time, where N is the number of nodes in the system, and the fault-tolerant algorithm guarantees routing to functioning nodes if a route exists. The fault-free and fault-tolerant broadcasting algorithms have time complexity O(log N), and the fault-tolerant algorithm guarantees success if no two faulty nodes are adjacent and no functioning node is adjacent to two faults.
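The O(log N) routing bound is easiest to see on the ordinary binary hypercube, where each hop corrects one differing address bit. The sketch below is the classic e-cube routing for a standard hypercube with N = 2^d nodes, offered as background only; it is not the Flexible Hypercube algorithm from the paper, and the function name `ecube_route` is an assumption.

```python
def ecube_route(src: int, dst: int):
    """Route in a standard binary hypercube by fixing differing bits, lowest first."""
    path = [src]
    cur = src
    diff = src ^ dst  # bits in which the two addresses differ
    bit = 0
    while diff:
        if diff & 1:
            cur ^= 1 << bit  # traverse the link along dimension `bit`
            path.append(cur)
        diff >>= 1
        bit += 1
    return path

print(ecube_route(0b000, 0b101))  # [0, 1, 5]
```

The number of hops equals the Hamming distance between the two addresses, which is at most d = log2(N).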

18.
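Fault-tolerant wormhole routing in mesh networks   Total citations: 1, self-citations: 1, citations by others: 0
Fault tolerance is an important issue in interconnection network design. This paper proposes a new fault-tolerant routing algorithm and applies it to mesh networks that use wormhole switching. Because it imposes few routing restrictions, the algorithm is highly adaptive and maintains routing connectivity and deadlock freedom in mesh networks with a variety of fault regions. Because it uses a minimal number of virtual channels, it needs very few buffer resources and is well suited to building low-cost fault-tolerant interconnection networks. Because decisions about detouring around faulty nodes are made from local fault information, routing decisions are fast and the algorithm is easy to implement in an interconnection network. Finally, network simulation experiments show that the algorithm degrades gracefully.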
