1.

Simulation of Complete Binary Tree Structures in a Faulty Flexible Hypercube

Huan-Chao Keh Jen-Chih Lin 《Journal of scientific computing》1999,14(3):245-258

The Flexible Hypercube generalizes the binary hypercube network in that the number of nodes can be arbitrary rather than strictly a power of 2, while it retains the connectivity and diameter properties of the corresponding hypercube. Although the embedding of complete binary trees in faulty hypercubes has received considerable attention, to our knowledge no paper has demonstrated how to embed a complete binary tree in a faulty Flexible Hypercube. This investigation therefore presents a novel algorithm for performing the embedding when the Flexible Hypercube contains faulty nodes. Of particular concern are the network structures of the Flexible Hypercube that balance the load both before and after faults start to degrade its performance. Furthermore, to obtain a replacement node for each faulty node, 2-expansion is permitted, so that up to (n – 2) faults can be tolerated with congestion 1, dilation 4 and load 1, where (n – 1) is the dimension of the Flexible Hypercube. The results presented demonstrate that the embedding methods are optimized.

2.

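Double-Loop Petersen Graph Interconnection Network and Routing Algorithms (total citations: 5; self-citations: 0; citations by others: 5)

王雷 林亚平 夏巍 《Journal of Software (软件学报)》2006,17(5):1115-1123

Because the Petersen graph has a short diameter and is regular, it performs well in parallel and distributed computing. Based on the double-loop structure, a double-loop Petersen graph interconnection network, DLCPG(k), is constructed, and unicast, broadcast, and fault-tolerant routing algorithms are designed for it. It is proved that DLCPG(k) not only offers good scalability, a short network diameter, and a simple topology, but that, for interconnection networks of 10k nodes, it also has a smaller diameter and better groupability than the two-dimensional Torus and the RP(k) interconnection network. It is further proved that the communication efficiency of its unicast and broadcast routing algorithms is clearly higher than that of the corresponding algorithms on RP(k). Simulation experiments show that the new fault-tolerant routing algorithm also has good fault-tolerance performance.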
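
The two properties the abstract attributes to the Petersen graph (regularity and a short diameter) can be checked directly. The following is a minimal Python sketch of the plain Petersen graph only, not of DLCPG(k), whose construction the abstract does not give; the node numbering and the helper names `petersen_graph` and `diameter` are illustrative assumptions.

```python
from collections import deque

def petersen_graph():
    """Return the Petersen graph as an adjacency dict on nodes 0..9.

    Assumed numbering: 0-4 form the outer cycle, 5-9 the inner pentagram.
    """
    adj = {v: set() for v in range(10)}
    for i in range(5):
        edges = [
            (i, (i + 1) % 5),          # outer 5-cycle
            (i, i + 5),                # spoke to the inner node
            (5 + i, 5 + (i + 2) % 5),  # inner pentagram
        ]
        for u, v in edges:
            adj[u].add(v)
            adj[v].add(u)
    return adj

def diameter(adj):
    """Largest BFS distance over all node pairs (graph assumed connected)."""
    best = 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        best = max(best, max(dist.values()))
    return best

graph = petersen_graph()
degrees = {len(neighbors) for neighbors in graph.values()}
print(degrees, diameter(graph))
```

Every node has degree 3 and any two nodes are at most two hops apart, which is what makes the graph attractive as a building block for larger networks.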

3.

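VTFTR: A Deadlock-Free Fault-Tolerant Routing Algorithm for High-Dimensional Fat Trees

刘博阳 胡舒凯 施得君 卢宏生 《Computer Engineering (计算机工程)》2022,48(12):38

With the rapid growth in the scale of high-performance computing systems in recent years, the reliability of high-performance interconnection networks has become an increasingly important issue. The high-dimensional fat tree is a network topology that combines the advantages of the fat tree and the multi-dimensional torus, and its good scalability and network performance give it broad application prospects in the exascale era. However, research on fault-tolerant routing algorithms for high-dimensional fat trees is currently limited, and the reliability problem urgently needs to be solved. To improve the fault tolerance of the high-dimensional fat-tree topology in high-performance interconnection networks, and to further improve the operating efficiency of the corresponding supercomputing systems, a fault-tolerant routing algorithm, VTFTR, is proposed for leaf-switch faults in high-dimensional fat trees. The algorithm combines the turn model with virtual-channel switching: by strictly controlling the turns that packets take on fault-free and fault-tolerant paths, it achieves deadlock-free fault tolerance using a small number of fault-tolerant virtual channels and extra hops. Experimental results show that, with a single fault, the fault-tolerant paths of VTFTR are 2 to 4 hops shorter than those of the compared algorithms. In a network of 4,096 nodes with 10 faulty leaf switches, under different distributions of the faulty leaf switches, the algorithm keeps all fault-free nodes connected at the cost of a 1.4% to 2.0% drop in throughput.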

4.

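Fault-Tolerant Routing Based on Safety Path Vectors in Hypercube Systems (total citations: 10; self-citations: 1; citations by others: 10)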

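王雷 林亚平 陈治平 文学 《Journal of Software (软件学报)》2004,15(5):783-790

Multiprocessor systems with an n-dimensional hypercube structure perform well in parallel and distributed processing. As such systems grow, the probability of link and node faults also increases, so designing more fault-tolerant routing algorithms is important for n-dimensional hypercube multiprocessor systems. For systems with link faults, the concept of safety path vectors (SPVs), which record optimal paths, is proposed, together with a method for building SPVs and an SPV-based fault-tolerant routing algorithm. SPVs can be assigned through n-1 rounds of information exchange between neighboring nodes, and the storage overhead at each node is only n bits. SPVs are therefore an extension of safety vectors (SVs) and extended safety vectors (ESVs), with a better ability to record optimal paths than either. Moreover, compared with fault-tolerant routing algorithms based on optimal path matrices (OPMs) and extended optimal path matrices (EOPMs), SPVs reduce the storage overhead exponentially and can record optimal-path information that OPMs and EOPMs cannot. Theoretical analysis and simulation experiments verify these properties of SPVs.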

5.

FRoots: A Fault Tolerant and Topology-Flexible Routing Technique

Theiss I. Lysne O. 《Parallel and Distributed Systems, IEEE Transactions on》2006,17(10):1136-1150

Existing solutions for fault-tolerant routing in interconnection networks either work for only one given regular topology, or require slow and costly network reconfigurations that do not allow full and continuous network access. In this paper, we present FRoots, a routing method for fault tolerance in topology-flexible network technologies. Our method is based on redundant paths, and can handle single dynamic faults without sending control messages other than those that are needed to inform the source nodes of the failing component. Used in a mode with local rerouting, the source nodes need not be informed and no control messages are necessary for the network to stay connected despite a single fault. In fault-free networks under nonuniform traffic, our routing method performs comparably to, or even better than, topology-specific routing algorithms in regular networks like meshes and tori. FRoots does not require any features in the switches or end nodes other than a flexible routing table and a modest number of virtual channels. For that reason, it can be directly applied to several present-day technologies like InfiniBand and Advanced Switching.

6.

Fault-Tolerant Convergence Routing

Bülent Yener Inderpal Bhandari Yoram Ofek Moti Yung 《Journal of Parallel and Distributed Computing》1997,42(2):989

This paper presents fault-tolerant protocols for fast packet switch networks with convergence routing. The objective is to provide fast reconfiguration and continuous host-to-host communication after a link or a node (switch) failure. Convergence routing can be viewed as a variant of deflection routing, which combines, in a dynamic fashion, the on-line routing decision with the traffic load inside the network. Unlike other deflection techniques, convergence routing operates with a global sense of direction and guarantees that packets will reach or converge to their destinations. Global sense of direction is achieved by embedding virtual rings to obtain a linear ordering of the nodes. We consider virtual ring embeddings over (i) a single spanning tree, and (ii) two edge-disjoint spanning trees. Thus, the fault-tolerant solution is based on spanning trees and designed for a switch-based (i.e., arbitrary topology) architecture called MetaNet. In this work, the original MetaNet convergence routing scheme has been modified so that the packet header need not be recomputed after a failure and/or a reconfiguration. This is achieved by having, at the network interface, a translator that maps the unique destination address to a virtual address. It is argued that virtual rings embedded over two edge-disjoint spanning trees increase the fault tolerance for both node and link faults and provide continuous host-to-host communication.
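
One common way to obtain a ring and a linear node ordering from a single spanning tree is to walk around the tree (an Euler tour). The Python sketch below illustrates that general idea under stated assumptions; it is not the MetaNet procedure itself, and the example network and the helper names `spanning_tree`, `virtual_ring` and `linear_order` are hypothetical.

```python
from collections import deque

def spanning_tree(adj, root):
    """BFS spanning tree of a connected graph, as a parent -> children map."""
    children = {v: [] for v in adj}
    seen = {root}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in sorted(adj[u]):
            if v not in seen:
                seen.add(v)
                children[u].append(v)
                queue.append(v)
    return children

def virtual_ring(children, root):
    """Closed walk around the tree; every tree edge is traversed twice."""
    walk = [root]

    def visit(u):
        for c in children[u]:
            walk.append(c)
            visit(c)
            walk.append(u)

    visit(root)
    return walk

def linear_order(walk):
    """Order the nodes by their first appearance on the ring."""
    order = []
    for v in walk:
        if v not in order:
            order.append(v)
    return order

# Hypothetical 5-switch network with an arbitrary topology.
network = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 4], 3: [1], 4: [2]}
tree = spanning_tree(network, 0)
ring = virtual_ring(tree, 0)
order = linear_order(ring)
print(ring, order)
```

Because the walk only uses tree edges, any connected topology yields a ring, which is consistent with the abstract's emphasis on arbitrary switch-based networks.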

7.

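An Adaptive Minimal Fault-Tolerant Routing Algorithm Based on Regions of Minimal Paths in 2-D Meshes

陈文斌 杨小帆 苏伟 唐荣旺 曾智 《Computer Science (计算机科学)》2006,33(7):292-294

The mesh is one of the most popular network topologies in parallel and distributed processing. How to design optimal fault-tolerant routing algorithms in the presence of faults has long been a hot research topic. This paper studies minimal routing in 2-D meshes under the fault block model and gives a necessary and sufficient condition for the existence of a minimal path. Based on the concept of the region of minimal paths (RMP), an adaptive minimal fault-tolerant routing algorithm is proposed. If a region of minimal paths exists between the source and destination nodes, adaptive minimal fault-tolerant routing is performed within that region; otherwise, multi-phase minimal fault-tolerant routing is used. The main idea is to ensure, as far as possible, that the routing algorithm follows a shortest path when faults are present. Because only local information at each node is required, the algorithm is distributed.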

8.

An optimal fault-tolerant routing algorithm for weighted bidirectional double-loop networks

Dharmasena H.P. Yan X. 《Parallel and Distributed Systems, IEEE Transactions on》2005,16(9):841-852

Double-loop networks are widely used in computer networks. In this paper, we present an optimal message routing algorithm and an optimal fault-tolerant message routing algorithm for weighted bidirectional double-loop networks. The algorithms presented are novel, and they do not use routing tables. After a precalculation of O(log N) steps to determine network parameters, the algorithms can route messages using constant time at each node along the route. The algorithm presented can route messages in the presence of up to three faulty nodes or links. The fault-tolerant routing algorithm guarantees an optimal route in the presence of one node failure.

9.

Fault-Tolerant Wormhole Routing with 2 Virtual Channels in Meshes

Ji-Peng Zhou 《Journal of Computer Science and Technology (计算机科学技术学报)》2005,20(6):822-830

In wormhole meshes, reliable routing is supposed to be deadlock-free and fault-tolerant. Many routing algorithms are able to tolerate a large number of faults enclosed by rectangular blocks or special convex regions. None of them, however, is capable of handling two convex fault regions at distance two by using only two virtual networks. In this paper, a fault-tolerant wormhole routing algorithm is presented to tolerate disjoint convex faulty regions at distance two or more, which do not contain any nonfaulty nodes and do not prohibit any routing as long as the nodes outside the faulty regions are connected in the mesh network. Overlapping of processors along the boundaries of different fault regions is allowed. The proposed algorithm, which routes messages by the X-Y routing algorithm in fault-free regions, can tolerate convex fault-connected regions with only two virtual channels per physical channel, and is deadlock- and livelock-free. The proposed algorithm can be easily extended to adaptive routing.
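
The abstract relies on X-Y routing in the fault-free parts of the mesh. As a reminder of what that baseline does, here is a minimal Python sketch of plain dimension-order routing; it does not include the paper's fault handling or virtual channels, and the coordinate convention and the function name `xy_route` are illustrative assumptions.

```python
def xy_route(src, dst):
    """Dimension-order (X-Y) route in a fault-free 2-D mesh.

    The X offset is corrected first, then the Y offset.
    Nodes are (x, y) tuples; the returned path includes both endpoints.
    """
    x, y = src
    path = [(x, y)]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path

route = xy_route((0, 2), (3, 0))
print(route)
```

The route makes at most one turn, from the X direction to the Y direction, which is why plain X-Y routing needs no extra virtual channels for deadlock freedom in a fault-free mesh.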

10.

Design and Probabilistic Analysis of a Fault-Tolerant Routing Algorithm in Torus Networks

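李银 梁家荣 徐霜 肖杰 《Computer Engineering and Applications (计算机工程与应用)》2009,45(14):103-106

Based on the concept of the k-Torus subnetwork, a simple fault-tolerant routing algorithm for Torus networks is proposed. Assuming that node faults are mutually independent, the probability that the algorithm routes successfully is computed. For Torus networks with several hundred thousand nodes or more, the probability that the proposed algorithm constructs a path can reach 99%, and the algorithm is linear.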

11.

Fault-tolerant routing over shortest node-disjoint paths in hypercubes

Lokendra Singh Umrao Ravi Shankar Singh 《International Journal of Parallel, Emergent and Distributed Systems》2016,31(3):294-304

This paper presents a routing algorithm that finds n disjoint shortest paths from the source node s to the target node d in the n-dimensional hypercube. Fault-tolerant routing over all shortest node-disjoint paths has been investigated to overcome failures encountered during routing in hypercube networks. In this paper, we propose an efficient approach to fault-tolerant routing on hypercube networks. The approach is based on the concept of all shortest node-disjoint paths and finds a fault-free shortest path among the several paths provided. The proposed algorithm is a simple uniform distributed algorithm that can tolerate a large number of process failures while delivering all n messages over optimal-length disjoint paths. However, no distributed algorithm uses acknowledgement messages (acks) for fault tolerance. To deal with faults, acknowledgement messages are therefore included in the proposed algorithm for routing messages over node-disjoint paths in a hypercube network.
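
A well-known way to build node-disjoint shortest paths in a hypercube is to flip the differing address bits in cyclically rotated orders. The Python sketch below shows that textbook construction, which yields as many shortest paths as the Hamming distance between s and d; it is not the paper's algorithm, it omits the longer detour paths and the acknowledgement mechanism, and the function names are hypothetical.

```python
def disjoint_shortest_paths(s, d, n):
    """Shortest s-to-d paths in the n-cube that share no intermediate node.

    Path r flips the differing bit positions in the order rotated by r,
    so there are exactly Hamming(s, d) paths of length Hamming(s, d).
    """
    dims = [i for i in range(n) if ((s ^ d) >> i) & 1]
    paths = []
    for r in range(len(dims)):
        order = dims[r:] + dims[:r]
        node, path = s, [s]
        for i in order:
            node ^= 1 << i
            path.append(node)
        paths.append(path)
    return paths

def first_fault_free(paths, faulty):
    """Return the first path that avoids every faulty node, or None."""
    for path in paths:
        if not any(v in faulty for v in path):
            return path
    return None

paths = disjoint_shortest_paths(0b0000, 0b0111, 4)
chosen = first_fault_free(paths, faulty={0b0001})
print(paths, chosen)
```

Because the paths share only their endpoints, a single faulty intermediate node can block at most one of them, so another shortest path remains available.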

12.

Two-Pass Rearrangeability in Faulty Benes Networks

Nabanita Das Jayasree Dattagupta 《Journal of Parallel and Distributed Computing》1996,35(2):191

Existing fault-tolerant routing schemes for Benes networks either consider only control-line stuck-at faults, or handle switch faults by graceful-degradation routing schemes that reconfigure the network into a smaller system with minimal loss. Now, even in the presence of a single switch fault in an N×N Benes network B(n) (n = log₂N), no N×N permutation can be realized in a single pass. In this paper, we attempt to characterize the switch fault sets in B(n) in the presence of which the network is always capable of realizing any arbitrary N×N permutation P in two passes, such that any source–destination path is set up in a single pass and no recirculation is needed, but the whole set of N source–destination paths of P is partitioned into two subsets that are realized in two successive passes. We propose an algorithm that detects whether the switch fault set present in a B(n) belongs to this class; if so, we present another algorithm that computes the fault-tolerant routing to realize any arbitrary permutation P in two passes. This scheme enables us to make B(n) fault-tolerant in the presence of a restricted class of multiple switch faults, without any recirculation through intermediate nodes or any reconfiguration of the system.

13.

Study of General Incomplete Star Interconnection Networks

史云涛 侯紫峰 et al. 《Journal of Computer Science and Technology (计算机科学技术学报)》2002,17(3):356-361

The star networks, which were originally proposed by Akers and Harel, have suffered from a rigorous restriction on the number of nodes. The general incomplete star networks (GISN) are proposed in this paper to relieve this restriction. An efficient labeling scheme for GISN is given, and routing and broadcasting algorithms are also presented for GISN. The communication diameter of GISN is shown to be bounded by 4n-7. The proposed single-node broadcasting algorithm is optimal with respect to time complexity O(n log₂ n).

14.

Optimal Broadcasting in Faulty Trees

《Journal of Parallel and Distributed Computing》2000,60(5):566-584

We consider broadcasting a message from one node of a tree to all other nodes. In the presence of up to k link failures the tree becomes disconnected, and only nodes in the connected component C containing the source can be informed. The maximum ratio between the time used by a broadcasting scheme B to inform C and the optimal time to inform C, taken over all components C yielded by configurations of at most k faults, is the k-vulnerability of B. This is the maximum slowdown incurred by B due to the lack of a priori knowledge of fault location, for at most k faults. This measure of fault tolerance is similar to the competitive factor of on-line algorithms: in both cases, the performance of an algorithm lacking some crucial information is compared to the performance of an “off-line” algorithm, one that is given this information as input. It is also the first known tool to measure and compare fault tolerance of broadcasting schemes in trees. We seek broadcasting schemes with low vulnerability, working for tree networks. It turns out that schemes that give the best broadcasting time in a fault-free environment may have very high vulnerability, i.e., poor fault tolerance, for some trees. The main result of this paper is an algorithm that, given an arbitrary tree T and an integer k, computes a broadcasting scheme B with lowest possible k-vulnerability among all schemes working for T. Our algorithm has running time O(kn²+n² log n), where n is the size of the tree. We also give an algorithm to find a “universally fault-tolerant” broadcasting scheme in a tree T: one that approximates the lowest possible k-vulnerability, for all k simultaneously.

15.

One-to-one communication in twisted cubes under restricted connectivity

Jianxi FAN Kenli LI Shukui ZHANG Wujun ZHOU Baolei CHENG 《Frontiers of Computer Science in China》2010,4(4):489-499

The dimensions of twisted cubes are limited to odd integers. In this paper, we first extend the dimensions of twisted cubes to all positive integers. Then, we introduce the concept of the restricted faulty set into twisted cubes. We further prove that, under the condition that each node of the n-dimensional twisted cube TQ_n has at least one fault-free neighbor, its restricted connectivity is 2n − 2, which is almost twice that of TQ_n under the condition of arbitrary faulty nodes and the same as that of the n-dimensional hypercube. Moreover, we provide an O(N log N) fault-free unicast algorithm and simulation results for the expected length of the fault-free path obtained by our algorithm, where N denotes the number of nodes of TQ_n. Finally, we propose a polynomial algorithm to check whether the faulty node set satisfies the condition that each node of the n-dimensional twisted cube TQ_n has at least one fault-free neighbor.

16.

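An NPV-Based Optimal Fault-Tolerant Routing Algorithm for Generalized Hypercubes

田绍槐 陆应平 张大方 《Journal of Software (软件学报)》2007,18(7):1818-1830

In network reliability research, designing good fault-tolerant routing strategies that record as much information as possible about the optimal paths in a system has always been important work. Fault-tolerant routing algorithms for hypercube systems are divided into backtracking and non-backtracking algorithms. In general, backtracking algorithms have strong fault tolerance: as long as a path exists between the source and destination nodes of a message, the algorithm can find a path that delivers it. Their drawback is that in many cases the message does not travel along the shortest path that actually exists. The representative example is the depth-first search (DFS) algorithm. Non-backtracking algorithms have attracted more attention in recent years. They record the fault information of adjacent nodes to give the routing algorithm heuristic information, so that messages follow the shortest existing path as far as possible. The common drawback of these algorithms is that they can only compute routes whose Hamming distance does not exceed n. In a connected n-dimensional hypercube system with a large number of faults, the shortest paths between many node pairs are longer than n, so these algorithms have poor fault tolerance. An example is given showing that these algorithms lose 60% of the routing information. In addition, because the hypercube structure is strict, true hypercube systems are rare in practice. In fact, many network systems can be converted into hypercube systems with a large number of faulty nodes and faulty edges, so studying fault-tolerant routing algorithms that can cope with such systems is of real practical value. This work (1) defines generalized hypercube systems; (2) proposes the concept of node path vectors (NPVs) in hypercube systems, together with rules for computing them; (3) proposes a relay-node technique that reduces the computational complexity of obtaining NPVs to O(n); and (4) proposes an NPV-based optimal fault-tolerant routing algorithm for generalized hypercube systems (OFTRS), which is distributed and based on neighbor-node information. Because NPVs record all optimal and suboptimal path information in a hypercube system, no optimal or suboptimal path information is missed even when there are many faults, which yields efficient fault-tolerant routing. In this respect it is superior to other algorithms.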
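
The abstract's point that faults can push the shortest surviving path beyond both the Hamming distance and n can be seen in a very small case. The Python sketch below uses a plain centralized breadth-first search as a reference, not the NPV technique or OFTRS; the function names and the chosen fault set are illustrative assumptions.

```python
from collections import deque

def hamming(a, b):
    """Number of bit positions in which two node addresses differ."""
    return bin(a ^ b).count("1")

def shortest_fault_free_hops(n, src, dst, faulty):
    """BFS over the n-cube that skips faulty nodes.

    Returns the hop count of a shortest surviving path, or None if the
    faults disconnect src from dst.
    """
    if src in faulty or dst in faulty:
        return None
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            return dist[u]
        for i in range(n):
            v = u ^ (1 << i)  # neighbor across dimension i
            if v not in faulty and v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return None

# Hypothetical 3-cube in which both minimal routes from 000 to 011 are blocked.
hops = shortest_fault_free_hops(3, 0b000, 0b011, faulty={0b001, 0b010})
print(hamming(0b000, 0b011), hops)
```

Here the Hamming distance is 2, but the only surviving route is 000, 100, 101, 111, 011, which takes 4 hops and so exceeds n = 3.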

17.

Deterministic broadcasting in ad hoc radio networks

Bogdan S. Chlebus Leszek Gasieniec Alan Gibbons Andrzej Pelc Wojciech Rytter 《Distributed Computing》2002,15(1):27-38

We consider the problem of distributed deterministic broadcasting in radio networks of unknown topology and size. The network is synchronous. If a node u can be reached from two nodes which send messages in the same round, none of the messages is received by u. Such messages block each other and node u either hears the noise of interference of messages, enabling it to detect a collision, or does not hear anything at all, depending on the model. We assume that nodes know neither the topology nor the size of the network, nor even their immediate neighborhood. The initial knowledge of every node is limited to its own label. Such networks are called ad hoc multi-hop networks. We study the time of deterministic broadcasting under this scenario. For the model without collision detection, we develop a linear-time broadcasting algorithm for symmetric graphs, which is optimal, and an algorithm for arbitrary n-node graphs, working in time . Next we show that broadcasting with acknowledgement is not possible in this model at all. For the model with collision detection, we develop efficient algorithms for broadcasting and for acknowledged broadcasting in strongly connected graphs. Received: January 2000 / Accepted: June 2001
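
The reception rule stated in the abstract, under which simultaneous transmissions from two neighbors block each other, can be written down as a one-round simulation. The Python sketch below models only that rule, not the paper's broadcasting algorithms; the function name `radio_round`, the example network, and the assumption that a transmitting node does not listen in the same round are illustrative.

```python
def radio_round(in_neighbors, transmitters):
    """Return the set of nodes that receive a message in one synchronous round.

    in_neighbors[u] lists the nodes from which u can be reached.
    A node receives only if exactly one of them transmits.
    """
    received = set()
    for u, sources in in_neighbors.items():
        if u in transmitters:
            continue  # assumption: a transmitting node does not listen
        senders = [v for v in sources if v in transmitters]
        if len(senders) == 1:
            received.add(u)
    return received

# Hypothetical symmetric 4-node path network 0 - 1 - 2 - 3.
path_network = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
single = radio_round(path_network, {0})
collided = radio_round(path_network, {0, 2})
print(single, collided)
```

When nodes 0 and 2 transmit together, node 1 hears nothing useful while node 3 still receives, which is why a deterministic schedule has to separate competing transmitters.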

18.

A fast pessimistic diagnosis algorithm for generalized hypercube multicomputer systems

Dyi-Rong Duh Chien-Hong Chen Keh-Ning Chang 《The Journal of supercomputing》2012,61(3):605-618

The reliability of processors is an important issue in designing a massively parallel processing system for which fault-tolerant computing is crucial. To achieve high system reliability and availability, a faulty processor (node), once found, should be replaced by a fault-free processor. Within a multiprocessor system, the technique of identifying faulty nodes by constructing tests on the nodes and interpreting the test outcomes is known as system-level diagnosis. The topological structure of a multicomputer system can be modeled by a graph whose vertices and edges correspond to the nodes and links of the system, respectively. This work presents a system-level diagnosis algorithm for the generalized hypercube, an attractive variant of the hypercube. The proposed algorithm is based on the PMC model and can isolate all faulty nodes to within a set that contains at most one fault-free node. If the total number of nodes to be diagnosed in a generalized hypercube is N, the proposed algorithm runs in O(N log N) time; it is superior to Yang's algorithm, proposed in 2004, in that it can diagnose not only a hypercube but also a generalized hypercube.

19.

Fault-Tolerant Broadcasting in 2-D Wormhole-Routed Meshes

Jiang Zhen Wu Jie 《The Journal of supercomputing》2003,25(3):255-275

In this paper, a fault-tolerant broadcast scheme in 2-D meshes with randomly generated faults is provided. This approach is based on an early work on time-step optimal broadcasting in square-shape fault-free 2-D meshes with optimal total communication distance (TCD). An extension to any rectangular-shape fault-free 2-D meshes is first given. The fault block model is used in which all faulty nodes in the system are contained in a set of disjoint blocks. The boundary lines of blocks divide the whole mesh into a set of fault-free polygons and a sequence of rectangular fault-free regions is derived from these polygons. The broadcast process is carried out at two levels: inter-region and intra-region. In the inter-region-level broadcast, the broadcast message is sent from a given source to a special node (called eye [1]) in each rectangular fault-free region. In the intra-region-level broadcast, the extended optimal fault-free broadcast is applied. Some analytical results are given including an upper bound of TCD.

20.

Discovering Network Topology in the Presence of Byzantine Faults

Nesterenko Mikhail Tixeuil Sebastien 《Parallel and Distributed Systems, IEEE Transactions on》2009,20(12):1777-1789

We pose and study the problem of Byzantine-robust topology discovery in an arbitrary asynchronous network. The problem is an abstraction of fault-tolerant routing. We formally state the weak and strong versions of the problem. The weak version requires that either each node discovers the topology of the network or at least one node detects the presence of a faulty node. The strong version requires that each node discovers the topology regardless of faults. We focus on noncryptographic solutions to these problems. We explore their bounds. We prove that the weak topology discovery problem is solvable only if the connectivity of the network exceeds the number of faults in the system. Similarly, we show that the strong version of the problem is solvable only if the network connectivity is more than twice the number of faults. We present solutions to both versions of the problem. The presented algorithms match the established graph connectivity bounds. The algorithms do not require the individual nodes to know either the diameter or the size of the network. The message complexity of both programs is low polynomial with respect to the network size. We describe how our solutions can be extended to add the property of termination, handle topology changes, and perform neighborhood discovery.