Similar literature
 20 similar documents found
1.
The minimal routing problem in mesh-connected multicomputers with faulty blocks is studied. Two-dimensional meshes are used to illustrate the approach. A sufficient condition for minimal routing in 2D meshes with faulty blocks is proposed. Unlike many traditional models that assume all nodes know the global fault distribution, our approach is based on the concept of an extended safety level, which is a special form of limited fault information. The extended safety level information is captured by a vector associated with each node. When the safety level of a node reaches a certain level (or meets certain conditions), a minimal path exists from this node to any nonfaulty node in 2D meshes. Specifically, we study the existence of minimal paths at a given source node, limited distribution of fault information, and minimal routing itself. We propose three fault-tolerant minimal routing algorithms that are adaptive, allowing all messages to use any minimal path. We also provide some general ideas for extending our approach to other low-dimensional mesh-connected multicomputers such as 2D tori and 3D meshes. Our approach is the first attempt to address adaptive and minimal routing in 2D meshes with faulty blocks using limited fault information.

2.
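The mesh is one of the most popular network topologies in parallel and distributed processing. How to design optimal fault-tolerant routing algorithms in the presence of faults has long been an active research problem. This paper studies the minimal routing problem in 2D meshes under the faulty block model and gives a necessary and sufficient condition for the existence of a minimal path. Based on the concept of a region of minimal paths (RMP), an adaptive minimal fault-tolerant routing algorithm is proposed. If an RMP exists between the source and destination nodes, adaptive minimal fault-tolerant routing is performed inside the RMP; otherwise, multi-phase minimal fault-tolerant routing is used. The main idea is to keep the routing algorithm on shortest paths as far as possible when faults are present. Because each node needs only local information, the algorithm is distributed.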
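
The existence question in the abstract above can be illustrated with a small check. This is only a sketch under stated assumptions: it uses global knowledge of the faulty nodes and plain dynamic programming, not the paper's RMP construction or its local-information condition, and the names `minimal_path_exists` and `faulty` are hypothetical.

```python
from typing import Set, Tuple

Node = Tuple[int, int]


def minimal_path_exists(src: Node, dst: Node, faulty: Set[Node]) -> bool:
    """Return True if a fault-free minimal (Manhattan) path joins src and dst.

    Assumption: `faulty` is the complete set of unusable nodes, known globally.
    """
    if src in faulty or dst in faulty:
        return False
    (sx, sy), (dx, dy) = src, dst
    # Every hop of a minimal path moves one step toward the destination.
    step_x = 1 if dx >= sx else -1
    step_y = 1 if dy >= sy else -1
    reachable = set()
    # Sweep the bounding rectangle in an order that visits predecessors first.
    for x in range(sx, dx + step_x, step_x):
        for y in range(sy, dy + step_y, step_y):
            if (x, y) in faulty:
                continue
            if (x, y) == src:
                reachable.add((x, y))
            elif (x - step_x, y) in reachable or (x, y - step_y) in reachable:
                reachable.add((x, y))
    return dst in reachable
```

For example, `minimal_path_exists((0, 0), (2, 2), {(1, 1)})` is true because a minimal path can pass on either side of the single faulty node, while a faulty column `{(1, 0), (1, 1), (1, 2)}` cuts every minimal path between the same two nodes.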

3.
Fault-Tolerant Tree-Based Multicasting in Mesh Multicomputers   Cited by: 1 (self-citations: 1, citations by others: 0)
We propose a fault-tolerant tree-based multicast algorithm for 2-dimensional (2-D) meshes based on the concept of the extended safety level, which is a vector associated with each node to capture fault information in the neighborhood. In this approach each destination is reached through a minimum number of hops. In order to minimize the total number of traffic steps, three heuristic strategies are proposed. This approach can be easily implemented by pipelined circuit switching (PCS). A simulation study is conducted to measure the total number of traffic steps under different strategies. Our approach is the first attempt to address the fault-tolerant tree-based multicast problem in 2-D meshes based on limited global information with a simple model and succinct information.

4.
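Guaranteeing routing between network nodes when nodes fail is an important problem. The nodes of an undirected double-loop network are mapped onto a Cartesian coordinate system according to shortest-path access, forming the optimal routing construction graph CG(N;±r,±s). Based on this graph, and depending on whether the source and destination nodes lie on the coordinate axes and on the number of faulty nodes around them, the concepts of a faulty-node closed region and an escape region are introduced. When a fault escape region exists, optimal routing between the source and destination nodes is still possible. When a faulty-node closed region appears and optimal routing is impossible, equivalent nodes are added to form the extended routing construction graph ECG(N;±r,±s), in which a fault-tolerant route is sought. Algorithms for the optimal routing construction graph, the extended routing construction graph, and fault-tolerant routing are given and simulated in software.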

5.
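How to design minimal fault-tolerant routing in a network containing faulty nodes is an active problem in network fault-tolerance research. Taking a 2D torus network with rectangular faulty blocks as an example, the extended safety level is applied to the torus, and a necessary and sufficient condition for the existence of a minimal path between any pair of nodes is given. Combined with the extended safety level concept, a method for building the region of minimal paths is given, and experiments verify its feasibility. The study provides a theoretical basis for finding minimal fault-tolerant paths in torus networks with faulty nodes.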

6.
Jiang Zhen, Wu Jie. The Journal of Supercomputing, 2003, 25(3): 255-275
In this paper, a fault-tolerant broadcast scheme in 2-D meshes with randomly generated faults is provided. This approach is based on earlier work on time-step optimal broadcasting in square-shape fault-free 2-D meshes with optimal total communication distance (TCD). An extension to any rectangular-shape fault-free 2-D mesh is first given. The fault block model is used, in which all faulty nodes in the system are contained in a set of disjoint blocks. The boundary lines of blocks divide the whole mesh into a set of fault-free polygons, and a sequence of rectangular fault-free regions is derived from these polygons. The broadcast process is carried out at two levels: inter-region and intra-region. In the inter-region-level broadcast, the broadcast message is sent from a given source to a special node (called eye [1]) in each rectangular fault-free region. In the intra-region-level broadcast, the extended optimal fault-free broadcast is applied. Some analytical results are given, including an upper bound of TCD.

7.
In wormhole meshes, a reliable routing is supposed to be deadlock-free and fault-tolerant. Many routing algorithms are able to tolerate a large number of faults enclosed by rectangular blocks or special convex regions. None of them, however, is capable of handling two convex fault regions with distance two by using only two virtual networks. In this paper, a fault-tolerant wormhole routing algorithm is presented to tolerate disjoint convex faulty regions at distance two or more, which do not contain any nonfaulty nodes and do not prohibit any routing as long as nodes outside faulty regions are connected in the mesh network. The processors' overlapping along the boundaries of different fault regions is allowed. The proposed algorithm, which routes messages by the X-Y routing algorithm in fault-free regions, can tolerate convex fault-connected regions with only two virtual channels per physical channel, and is deadlock- and livelock-free. The proposed algorithm can be easily extended to adaptive routing.
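
The abstract above relies on X-Y routing in fault-free regions. A minimal sketch of that standard dimension-order scheme follows; it covers only the fault-free case, not the paper's fault-tolerant extension or its virtual channels, and the function name `xy_route` is hypothetical.

```python
from typing import List, Tuple

Node = Tuple[int, int]


def xy_route(src: Node, dst: Node) -> List[Node]:
    """Dimension-order (X-Y) route in a fault-free 2D mesh.

    The message first corrects its X coordinate, then its Y coordinate.
    """
    x, y = src
    path = [(x, y)]
    while x != dst[0]:  # phase 1: travel along the X dimension
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:  # phase 2: travel along the Y dimension
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path
```

Because the X dimension is always finished before the Y dimension starts, the path is unique for each source and destination pair, which is why a single fault on it forces the kind of detour handling the paper describes.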

8.
A new, rectilinear-monotone polygonally shaped fault block model, called Minimal-Connected-Component (MCC), was proposed in [D. Wang, A rectilinear-monotone polygonal fault block model for fault-tolerant minimal routing in mesh, IEEE Trans. Comput. 52 (3) (2003) 310–320] for minimal adaptive routing in mesh-connected multiprocessor systems. This model refines the widely used rectangular model by including fewer non-faulty nodes in fault blocks. The positions of source/destination nodes relative to faulty nodes are taken into consideration when constructing fault blocks. An adaptive routing algorithm was given in Wang (2003) that constructs a minimal “Manhattan” route avoiding all fault blocks, should such routes exist. However, if there are no minimal routes, we still need to find a route, preferably as short as possible. In this paper, we propose a heuristic algorithm that takes a greedy approach and can compute a nearly shortest route without much overhead. The significance of this algorithm lies in the fact that routing is a frequently performed task, and messages need to get to their destinations as soon as possible. Therefore one would prefer to have a fast answer about which route to take (and then take it), rather than spend too much time working out an absolutely shortest route.

9.
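To address fault-tolerant routing in torus-based multiprocessor systems, the concept of flag bits is introduced and a flag-bit-based fault-tolerant routing algorithm is given. The flag bits stored at each node of the torus network record fault information about the system and are used to determine whether an optimal path exists between a message's source and destination nodes. The flag bits can be assigned through information exchange between neighboring nodes.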

10.
A fault-tolerant message routing mechanism is key to the performance of reliable multicomputers. Multicast refers to the delivery of the same message from a source node to an arbitrary number of destination nodes. This paper presents two types of partially adaptive fault-tolerant multicast algorithms. The Type A algorithm can deliver messages to all destinations through shortest paths if each fault-free node has at most one faulty neighbor. The Type B algorithm can deliver messages to all destinations if the total number of faulty links and faulty nodes is less than the dimension of the hypercube. The proposed algorithms have the following important features: they are distributed, they only require local information to determine the paths, and they need very little additional message overhead. The performance of the algorithms, measured by the traffic created by the communication, is very close to that in fault-free hypercubes.

11.
Adaptive routing algorithms outperform deterministic routing algorithms   Cited by: 1 (self-citations: 0, citations by others: 1)
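Adaptive routing is an extremely important topic in the study of fault tolerance in parallel computer systems: when network nodes fail, the algorithm routes over alternative paths. Under a model in which each node fails independently with a given probability, this paper studies the performance of adaptive and deterministic routing algorithms on mesh networks. The proposed technique makes it possible to derive the success probability of a routing algorithm rigorously, and thus to analyze and compare algorithms. The results show that adaptive routing has a clear advantage: on the one hand, deterministic routing relies on global fault information to be efficient; on the other hand, adaptive routing is more robust to node failures and to network size, and therefore has a higher success probability.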
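
The abstract does not reproduce its derivation, so the following is only an illustrative stand-in for the kind of comparison it describes. Assumptions that are not from the paper: a small square 2D mesh (default 3x3), corner-to-corner traffic, fault-free source and destination, "deterministic" modeled as X-Y routing, and "adaptive" modeled optimistically as an ideal router that succeeds whenever any fault-free minimal path exists. All function names are hypothetical.

```python
from itertools import product
from typing import Set, Tuple

Node = Tuple[int, int]


def monotone_path_exists(size: int, faulty: Set[Node]) -> bool:
    """True if a fault-free minimal path joins (0, 0) and (size-1, size-1)."""
    reach = set()
    for x in range(size):
        for y in range(size):
            if (x, y) in faulty:
                continue
            if (x, y) == (0, 0) or (x - 1, y) in reach or (x, y - 1) in reach:
                reach.add((x, y))
    return (size - 1, size - 1) in reach


def success_probabilities(p: float, size: int = 3) -> Tuple[float, float]:
    """Exact success probabilities (deterministic X-Y, ideal adaptive).

    Every node except source and destination fails independently with
    probability p; all fault configurations are enumerated.
    """
    src, dst = (0, 0), (size - 1, size - 1)
    inner = [(x, y) for x in range(size) for y in range(size)
             if (x, y) not in (src, dst)]
    # Nodes visited by X-Y routing: along row y = 0, then column x = size-1.
    xy_path = {(x, 0) for x in range(size)} | {(size - 1, y) for y in range(size)}
    det = adapt = 0.0
    for states in product([False, True], repeat=len(inner)):
        faulty = {node for node, bad in zip(inner, states) if bad}
        weight = p ** len(faulty) * (1 - p) ** (len(inner) - len(faulty))
        if not (xy_path & faulty):
            det += weight
        if monotone_path_exists(size, faulty):
            adapt += weight
    return det, adapt
```

In this toy setting the deterministic value reduces to (1 - p) raised to the number of intermediate nodes on the X-Y path, while the adaptive value is never smaller, because the X-Y path is itself one of the minimal paths the idealized adaptive router may use.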

12.
L. Verdoscia, R. Vaccaro. Computing, 1999, 63(2): 171-184
This paper presents an easy and straightforward routing algorithm for WK-recursive topologies. The algorithm, based on adaptive routing, takes advantage of the geometric properties of such topologies. Once a source node S and destination node D have been determined for a message communication, they characterize, at some level l, two virtual nodes and that respectively contain S but not D and D but not S. Such virtual nodes characterize other (where is the node degree for a fixed topology) virtual nodes of the same level that contain neither S nor D. Consequently, it is possible to locate triangles whose vertices are these virtual nodes with the property of sharing the same path, called the self-routing path, directly connecting to . When the self-routing path is unavailable to transmit a message from S to D because of deadlock, fault, or congestion conditions, the routing strategy can follow what we call the triangle rule to deliver it. The proposed communication scheme has two advantages: 1) it is the same for all three conditions; 2) each node of a WK-recursive network, to transmit messages, does not require any information about their presence or location. Furthermore, this routing algorithm is able to tolerate up to faulty links. Received: July 19, 1998; revised May 17, 1999

13.
Tian Shaohuai (田绍槐), Lu Yingping (陆应平), Zhang Dafang (张大方). Journal of Software (软件学报), 2007, 18(7): 1818-1830
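In network reliability research, designing good fault-tolerant routing strategies and recording as much of the system's optimal-path information as possible has always been important work. Fault-tolerant routing algorithms for hypercube systems fall into backtracking and non-backtracking algorithms. In general, the strength of backtracking algorithms is their fault tolerance: as long as a path exists between the source and destination nodes of a message, the algorithm can find a path that delivers it. Their weakness is that in many cases the message does not travel along the shortest path that actually exists. Depth-first search (DFS) is the representative example. Non-backtracking algorithms have attracted more attention in recent years. They record fault information about neighboring nodes and use it as heuristic information, so that messages travel along the actual shortest path as far as possible. Their common weakness is that they can only compute routes whose Hamming distance does not exceed n. In the connected graph of an n-dimensional hypercube system with a large number of faults, the shortest path between many node pairs is longer than n, so these algorithms tolerate faults poorly. An example is given in which these algorithms lose 60% of the routing information. In addition, because the hypercube structure is strict, true hypercube systems are rare in practice, yet many network systems can be converted into hypercube systems with large numbers of faulty nodes and faulty edges. Studying fault-tolerant routing algorithms that cope with such systems is therefore of real practical value. The paper (1) defines generalized hypercube systems; (2) introduces the node path vector (NPV) and its computation rules for hypercube systems; (3) proposes a relay-node technique that reduces the computational complexity of the NPV to O(n); and (4) proposes an NPV-based optimal fault-tolerant routing algorithm (OFTRS) for generalized hypercube systems, which is distributed and based on neighbor-node information. Because the NPV records all optimal and suboptimal path information in the hypercube system, it misses no optimal or suboptimal path even when there are many faults, and thus achieves efficient fault-tolerant routing. In this respect it is superior to other algorithms.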
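
The claim that faults can push the shortest fault-free path beyond both the Hamming distance and n can be checked on a tiny case. The sketch below uses a plain breadth-first search with global fault knowledge; it is not the paper's NPV or OFTRS method, and the function names are hypothetical.

```python
from collections import deque
from typing import Optional, Set


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which node labels a and b differ."""
    return bin(a ^ b).count("1")


def shortest_fault_free_length(n: int, src: int, dst: int,
                               faulty: Set[int]) -> Optional[int]:
    """Hop count of a shortest fault-free path in an n-cube, or None.

    Nodes are labeled 0 .. 2**n - 1; neighbors differ in exactly one bit.
    Assumption: `faulty` lists every faulty node (global knowledge).
    """
    if src in faulty or dst in faulty:
        return None
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            return dist[u]
        for k in range(n):
            v = u ^ (1 << k)  # flip bit k to reach the neighbor
            if v not in faulty and v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return None  # src and dst are disconnected by the faults
```

In a 3-cube with faulty nodes 001 and 010, nodes 000 and 011 are at Hamming distance 2, but `shortest_fault_free_length(3, 0b000, 0b011, {0b001, 0b010})` returns 4, which exceeds n = 3. A router limited to routes of length at most n would report no route here even though one exists.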

14.
Wormhole routing is an advanced switching technique used in new-generation multicomputers. Since such a machine may suffer serious performance degradation under heavy or uneven traffic load, an adaptive routing method is particularly called for. In minimal fully adaptive routing, the paths that may be used between any source and destination pair are exactly all the shortest paths. We propose in this paper a minimal fully adaptive routing algorithm for the n-dimensional hypercube with (n+1)/2 virtual channels per physical channel.

15.
A study of two routing schemes in 3D-Mesh networks   Cited by: 3 (self-citations: 1, citations by others: 2)
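Routing algorithms are an extremely important topic in the study of fault tolerance in parallel computer systems. This paper studies the performance of adaptive and deterministic routing algorithms on 3D-Mesh networks. Under a model in which each node fails independently with a given probability, the proposed method makes it possible to derive the success probability of a routing algorithm rigorously, and thus to analyze and compare algorithms. The results show that adaptive routing has a clear advantage. On the one hand, adaptive routing is efficient because it is based on local information; on the other hand, it is more robust to node failures and to network size, and therefore has a higher success probability.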

16.
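Fault-tolerant routing based on extended safety vectors in hypercube multiprocessor systems   Cited by: 16 (self-citations: 3, citations by others: 16)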
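For hypercube multiprocessor systems with link faults, the safety vector concept proposed by Wu Jie is modified to give the extended safety vector, and a fault-tolerant routing algorithm based on extended safety vectors is presented. Compared with the routing algorithm based on safety vectors, the one based on extended safety vectors is far better at finding optimal paths: even when the number of faults is large, it still delivers the great majority of messages whose source and destination nodes are joined by an optimal path along such a path. The extended safety vector of each node in the hypercube can be assigned through n-1 rounds of information exchange between neighboring nodes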

17.
Message routing achieves the internode communication in parallel computers. A reliable routing is supposed to be deadlock-free and fault-tolerant. While many routing algorithms are able to tolerate a large number of faults enclosed by rectangular faulty blocks, there is no existing algorithm that is capable of handling irregular faulty patterns for wormhole networks. In this paper, a two-staged adaptive and deadlock-free routing algorithm called “Routing for Irregular Faulty Patterns” (RIFP) is proposed. It can tolerate irregular faulty patterns by transmitting messages from sources or to destinations within faulty blocks via multiple “intermediate nodes.” A method employed by RIFP is first introduced to generate intermediate nodes using the local failure information. With its aid, two communicating nodes can always exchange their data or intermediate results if there is at least one path between them. RIFP needs two virtual channels per physical link in meshes.

18.
Consideration is given to fault-tolerant systems that are built from modules called fault-tolerant basic blocks (FTBBs), where each module contains some primary nodes and some spare nodes. Full spare utilization is achieved when each spare within an FTBB can replace any other primary or spare node in that FTBB. This, however, may be prohibitively expensive for larger FTBBs. Therefore, it is shown that for a given hardware overhead more reliable systems can be designed using bigger FTBBs without full spare utilization than using smaller FTBBs with full spare utilization. Sufficient conditions for maximizing the reliability of a spare allocation strategy in an FTBB for a given hardware overhead are presented. The proposed spare allocation strategy is applied to two fault-tolerant reconfiguration schemes for binary hypercubes. One scheme uses hardware switches to replace a faulty node, and the other scheme uses fault-tolerant routing to bypass faulty nodes in the system and deliver messages to the destination node.

19.
Xiang Dong (向东), Chen Ai (陈爱), Sun Jiaguang (孙家广). Chinese Journal of Computers (计算机学报), 2004, 27(5): 611-618
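Even when a system contains only a few faulty nodes, a mesh/torus network as a whole may be unsafe. This paper uses extended local safety information to guide fault-tolerant routing in 3D mesh/torus networks. Extended local safety information classifies fault-free nodes within each plane, so faulty blocks are also formed within individual planes rather than over the whole system. Many nodes that are unsafe with respect to the whole system become safe within a 2D plane. Whether the system is safe or unsafe, extended local safety information can guide fault-tolerant routing effectively. Unlike previous methods, the authors' method does not disable any fault-free node. All faulty blocks are formed within planes rather than over the whole system, and within a plane any fault-free node contained in a faulty block can still serve as a source or destination, which greatly improves the computing power and performance of the system. Simulation results show that the method substantially outperforms existing methods.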

20.
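To address ferry scheduling and cooperation in traditional message ferry routing, a cross-region multi-ferry routing protocol is designed. The network is divided into several horizontal and vertical regions, and each region has one ferry that polls its nodes. Data is delivered either by a single ferry or through cooperation between one horizontal-region ferry and one vertical-region ferry. The expected delay of the proposed protocol is analyzed theoretically, and the protocol is improved in terms of both delay and fault tolerance. Simulation results show that cross-region ferry routing balances network load and end-to-end delay while providing fault tolerance for a single ferry, making it a reasonable and effective multi-ferry scheduling method.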
