1.
With the growing popularity of large-scale parallel computers, Multiprocessor Systems-on-Chip (MP-SoCs), multicomputers, cluster computers and peer-to-peer communication networks, fault-tolerant routing has become an important issue in developing these systems. Fault-tolerant routing algorithms in such systems aim to provide continuous operation in the presence of one or more failures by allowing graceful degradation of the system. The Software-Based fault-tolerant routing scheme has been suggested as an efficient routing algorithm that preserves both communication performance and fault tolerance in parallel computer systems. To study network performance, a number of analytical models for fault-free routing algorithms have been proposed in the literature. However, no comparable analytical model of fault-tolerant routing in the presence of faulty components has been reported. This paper presents a new analytical modeling approach for determining the effects of failures in wormhole-switched 2-D tori using the fault-tolerant Software-Based scheme. More specifically, we describe a general model to derive mathematical expressions for the performance of routing algorithms confronting convex (|-shaped, □-shaped) or concave (U-shaped, +-shaped, T-shaped, H-shaped) faulty regions. The model is validated through comprehensive simulation experiments for different types of failures.
M. Ould-Khaoua
2.
Torus networks have become increasingly important to multicomputer design because of their many features, including scalability, low bandwidth and fixed node degree. Multicast communication is a significant operation in multicomputer systems and can be used to support several other collective communication operations. This paper presents an efficient algorithm, TTPM, for deadlock-free multicast wormhole routing in two-dimensional torus parallel machines. The algorithm is designed so that messages can be sent to any number of destinations within two start-up communication phases; hence the name Torus Two Phase Multicast (TTPM). An efficient routing function is developed and used as the basis for the algorithm. TTPM also allows some intermediate nodes that are not in the destination set to perform multicast functions. This feature allows flexibility in multicast path selection and therefore improves performance. Simulation results on torus networks are discussed to compare TTPM with a previous algorithm.
3.
An effective routing algorithm in incomplete hypercubes (cited by 1 in total; 0 self-citations, 1 by others)
An incomplete hypercube is interesting and practical because it relaxes the restriction on system size while retaining the salient properties of complete hypercubes. The performance of incomplete hypercubes can be improved considerably by reducing communication time, which can be achieved by forwarding messages through two parallel paths between a pair of nodes. This paper presents a simple and effective two-parallel-paths routing algorithm for incomplete hypercubes which takes advantage of the flexibility provided by incomplete hypercubes, and yet prevents traffic congestion and deadlock. Simulation results indicate that the mean latency for sending large messages is reduced and that the reduction grows as the system load grows. This significant reduction in latency could translate to a respectable performance improvement. The algorithm can also tolerate one fault in the system by sending duplicate copies of messages through two parallel paths, with little increase in the mean latency under light traffic load.
4.
The ability to tolerate faults is critical in multicomputers employing large numbers of processors. This paper describes a class of fault-tolerant routing algorithms for n-dimensional meshes that can tolerate large numbers of faults without using virtual channels. We show that these routing algorithms prevent livelock and deadlock while remaining highly adaptive.
5.
Message routing is a fundamental function of a network, and fault-tolerance is an important tool to ensure the quality of service of a network. Assume that the network contains at most one faulty element and the algorithm does not know the faulty element in advance. We present an optimal fault-tolerant message routing algorithm for double-loop networks. We show that sending at most two messages with different routing strategies can ensure that one of the messages will be sent through a shortest path that avoids the faulty element. At each vertex, for any destination, the algorithm needs only constant time and space to determine the next vertex to which the message is to be sent.
6.
Claudia Rusu, Lorena Anghel 《Microprocessors and Microsystems》2011,35(7):613-631
Existing 3D routing algorithms deal with regular mesh/torus 3D topologies. Today's 3D NoCs are quite irregular, especially those with heterogeneous layers. In this paper, we present a routing algorithm targeting 3D networks-on-chip (NoCs) with incomplete sets of vertical links between adjacent layers. The routing algorithm tolerates multiple link and node failures, provided the NoC is not partitioned. In addition, it deals with congestion. The routing algorithm for 3D NoCs preserves the deadlock-free property of the chosen 2D routing algorithms. It is also scalable and supports a local reconfiguration that complements the reconfiguration of the 2D routing algorithms in case of failures of nodes or links. The algorithm incurs a small overhead in terms of messages exchanged for reconfiguration and does not introduce significant additional complexity in the routers. A theoretical analysis of the 3D routing algorithm is provided and validated by simulations for different traffic loads and failure rates.
7.
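The lifetime of wireless sensor networks (WSNs) is constrained by battery energy; charging sensor nodes by means of wireless energy transfer can solve the energy problem of WSNs. Taking three-dimensional WSNs as the object of study, this paper proves that the shortest three-dimensional Hamiltonian cycle is the optimal path for the wireless charging device to traverse the nodes of the network, proposes a continuous time-varying model of the network, and simplifies it to a discrete (T+1)-phase linear programming model of polynomial complexity. Simulation results show that solving the discrete (T+1)-phase linear programming model allows the WSN to operate continuously.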
8.
Chris Jackson, Simon J. Hollis 《Microprocessors and Microsystems》2011,35(2):139-151
We address routing in Networks-On-Chip (NoC) architectures that use irregular mesh topologies with Long-Range Links (LRLs). These topologies create difficult conditions for routing algorithms, as standard algorithms assume a static, regular link structure and exploit the uniformity of regular meshes to avoid deadlock and maintain routability. We present a novel routing algorithm that can cope with these irregular topologies and adapt to run-time LRL insertion and topology reconfiguration. To accommodate dynamic topology reconfiguration, we use a new technique that decomposes routing relations into two stages: the calculation of output ports on the current minimal path, and the application of routing restrictions designed to prevent deadlock. In addition, we present a selection function that uses local topology data to adaptively select optimal paths. The routing algorithm is shown to be deadlock-free, after which an analysis of all possible routing decisions in the region of an LRL is carried out. We show that the routing algorithm minimises the cost of sub-optimally placed LRLs and show the hop savings available. When applied to LRLs of less than seven hops, the overall traffic hop count and associated routing energy cost are reduced. In a simulated 8 × 8 network, the total input buffer usage across the network was reduced by 6.5%.
9.
A parallel evolutionary algorithm for the vehicle routing problem with heterogeneous fleet (cited by 2 in total; 0 self-citations, 2 by others)
Luiz S. Ochi, Dalessandro S. Vianna, Lúcia M. A. Drummond, André O. Victor 《Future Generation Computer Systems》1998,14(5-6):285-292
Genetic algorithms are now widely used to tackle NP-complete and NP-hard problems. In this paper, we present a new hybrid metaheuristic which uses parallel genetic algorithms and scatter search coupled with a decomposition-into-petals procedure for solving a class of vehicle routing and scheduling problems. The parallel genetic algorithm presented is based on the island model, and its performance is evaluated for a heterogeneous fleet problem, which is considered much harder to solve than the homogeneous vehicle routing problem.
10.
The torus routing chip (cited by 8 in total; 0 self-citations, 8 by others)
The torus routing chip (TRC) is a self-timed chip that performs deadlock-free cut-through routing in k-ary n-cube multiprocessor interconnection networks using a new method of deadlock avoidance called virtual channels. A prototype TRC with byte-wide self-timed communication channels achieved on first silicon a throughput of 64 Mbits/s in each dimension, about an order of magnitude better performance than the communication networks used by machines such as the Caltech Cosmic Cube or Intel iPSC. The latency of the cut-through routing of only 150 ns per routing step largely eliminates message locality considerations in the concurrent programs for such machines. The design and testing of the TRC as a self-timed chip was no more difficult than it would have been for a synchronous chip.
Bill Dally received his B.S. degree in Electrical Engineering from the Virginia Polytechnic Institute in 1980 and his M.S. degree in Electrical Engineering from Stanford University in 1981. From 1980 to 1982 he worked at Bell Telephone Laboratories, where he contributed to the design of the BELLMAC-32 microprocessor. From 1982 to 1983 he worked as a consultant in the area of digital systems design. Since 1983 he has been a graduate student in Computer Science at Caltech, and is expected to complete his Ph.D. studies in the spring of 1986. His current research interests include computer architecture, computer-aided design, VLSI design, and concurrent systems.
Chuck Seitz earned B.S., M.S., and Ph.D. degrees from M.I.T. Before joining the Computer Science faculty at Caltech in 1977, he worked as a member of the technical staff of the Evans & Sutherland Computer Corporation from 1969 to 1971, as an Assistant Professor of Computer Science at the University of Utah from 1970 to 1972, and as a consultant to Burroughs Corporation from 1971 to 1978. He is currently a Professor of Computer Science at Caltech, where his research and teaching activities are in the areas of VLSI architecture and design, concurrent computation, and self-timed systems.
The research described in this paper was sponsored in part by the Defense Advanced Research Projects Agency, ARPA Order number 3771, and monitored by the Office of Naval Research under contract number N 00014-79-C-0597, in part by Intel Corporation, and in part by an AT&T Ph.D. fellowship.
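As a rough illustration of routing in the k-ary n-cube networks that the TRC abstract refers to, the sketch below applies plain dimension-order routing with wraparound links. It is not the TRC's hardware logic: the function names `ring_step` and `dimension_order_route` are hypothetical, and the sketch only computes a path. It does not model cut-through switching or the virtual channels that the abstract credits with avoiding deadlock.

```python
from typing import List, Tuple

Node = Tuple[int, ...]  # one coordinate per dimension, each in range(k)


def ring_step(src: int, dst: int, k: int) -> int:
    """Return +1, -1, or 0: the direction of the shorter way around a k-node ring."""
    if src == dst:
        return 0
    forward = (dst - src) % k  # hops needed when moving in the +1 direction
    return 1 if forward <= k - forward else -1


def dimension_order_route(src: Node, dst: Node, k: int) -> List[Node]:
    """Correct one dimension at a time, lowest dimension first (assumed ordering)."""
    path = [src]
    cur = list(src)
    for dim in range(len(src)):
        step = ring_step(cur[dim], dst[dim], k)
        while cur[dim] != dst[dim]:
            cur[dim] = (cur[dim] + step) % k  # wraparound link of the torus
            path.append(tuple(cur))
    return path


# In a 4-ary 2-cube, going from column 0 to column 3 takes the wraparound link.
print(dimension_order_route((0, 0), (3, 1), 4))  # [(0, 0), (3, 0), (3, 1)]
```

Fixing the order in which dimensions are corrected keeps the route deterministic, but the wraparound links in each ring can still form cyclic channel dependencies, which is the problem the virtual channels named in the abstract are meant to address.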
11.
In this paper, we present a scalable three-dimensional hybrid parallel Delaunay image-to-mesh conversion algorithm (PDR.PODM) for distributed shared memory architectures. PDR.PODM is able to explore parallelism early in the mesh generation process thanks to the aggressive speculative approach employed by the Parallel Optimistic Delaunay Mesh generation algorithm (PODM). In addition, it decreases the communication overhead and improves data locality by making use of a data partitioning scheme offered by the Parallel Delaunay Refinement algorithm (PDR). PDR.PODM supports fully functional volume grading by creating elements of varying size. Small elements are created near the boundary or inside the critical regions in order to capture the fine features, while big elements are created in the rest of the mesh. We tested PDR.PODM on Blacklight, a distributed shared memory (DSM) machine at the Pittsburgh Supercomputing Center. For uniform mesh generation, we observed a weak scaling speedup of 163.8 and above for up to 256 cores, as opposed to PODM, whose weak scaling speedup is only 44.7 on 256 cores. PDR.PODM scales well on uniform refinement cases running on DSM supercomputers. The end result is that PDR.PODM can generate 18 million elements per second, as opposed to 14 million per second in our earlier work. The varying-size version sharply reduces the number of elements compared to the uniform version and thus reduces the time to generate the mesh while keeping the same fidelity.
12.
13.
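A greedy geographic routing protocol for three-dimensional wireless sensor networks (cited by 1 in total; 0 self-citations, 1 by others)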
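Existing routing protocols for three-dimensional wireless sensor networks all convert the three-dimensional problem to a two-dimensional plane and therefore do not fully exploit the characteristics of three-dimensional space. To address this, a greedy geographic routing (GGR) protocol for three-dimensional wireless sensor networks based on geographic information is proposed. Under normal conditions the protocol uses a greedy forwarding algorithm; for the routing void problem that readily arises in geography-based greedy algorithms, a three-dimensional rolling-ball boundary traversal algorithm is given. Experimental results show that the protocol achieves a high routing success rate and an optimized hop count, with its advantage most pronounced when node density is low.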
14.
Nadia Nedjah, Luneque Silva Junior, Luiza de Macedo Mourelle 《Expert systems with applications》2013,40(16):6661-6673
Networks-on-Chip (NoC) is an interesting option in the design of communication infrastructures for embedded systems. It provides a scalable structure and balanced communication between the cores. Parallel applications that take advantage of NoC architectures are usually communication-intensive. Thus, a large number of data packets is transmitted simultaneously through the network. To avoid congestion delays that increase the execution time of the implemented applications, an efficient routing strategy must be designed carefully. In this paper, the ant colony optimization paradigm is explored to find and optimize routes in a mesh-based NoC. The proposed routing algorithms are simple yet efficient. The routing optimization is driven by the minimization of total latency during packet transmission between the tasks that compose the application. The presented performance evaluation is threefold: first, the impact of well-known synthetic traffic patterns is assessed; second, randomly generated applications are mapped onto the NoC infrastructure and some synthetic communication traffic that follows known patterns is used to simulate real situations; third, sixteen real-world applications of the E3S and one specific application for digital image processing are mapped and their execution time evaluated. In all cases, the obtained results are compared to those obtained with known general-purpose algorithms for deadlock-free routing. The comparison confirms the effectiveness and superiority of the ant colony inspired routing.
15.
16.
Youngjoo Cho 《Information Sciences》2006,176(21):3132-3142
Double-loop [J. Bermond, F. Comellas, D. Hsu, Distributed Loop Computer Networks: A Survey, J. Parallel and Distributed Computing, Academic Press, 24 (1995) 2-10] and 2-circulant networks (2-CN) [J. Park, Cycle Embedding of Faulty Recursive Circulants, J. of Korea Info. Sci. Soc. 31 (2) (2004) 86-94] are widely used in the design and implementation of local area networks and parallel processing architectures. In this paper, we investigate the routing of a message on circulant networks, which is key to the performance of these networks. We would like to transmit 2k packets from a source node to a destination node simultaneously along paths on $G(n; \pm s_1, \pm s_2, \ldots, \pm s_k)$, where the ith packet traverses the ith path ($1 \le i \le 2k$). In order for all packets to arrive at the destination node quickly and securely, the ith path must be node-disjoint from all other paths. To construct these paths, we employ the Hamiltonian circuit latin square (HCLS), a special class of (n × n) matrices, and present an $O(n^2)$ parallel routing algorithm on circulant networks.
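To make the circulant notation concrete, the short sketch below lists the neighbors of a node under the standard definition in which node v is adjacent to (v ± s_j) mod n for every jump s_j. The function name `circulant_neighbors` is hypothetical, and the sketch covers only the topology, not the HCLS-based path construction described in the abstract.

```python
from typing import List, Sequence


def circulant_neighbors(n: int, jumps: Sequence[int], v: int) -> List[int]:
    """Neighbors of node v in the circulant graph with n nodes and the given jumps."""
    forward = {(v + s) % n for s in jumps}   # +s_j links
    backward = {(v - s) % n for s in jumps}  # -s_j links
    return sorted(forward | backward)


# A double-loop style example with n = 8 and jumps 1 and 3.
print(circulant_neighbors(8, (1, 3), 0))  # [1, 3, 5, 7]
```

With k distinct jumps that do not coincide modulo n, each node has 2k neighbors, which matches the 2k simultaneous packets mentioned in the abstract.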
17.
Hypercube interconnection networks have been receiving considerable attention in the supercomputing environment. However, the number of processors must be exactly $2^r$ for an r-cube complete hypercube. This restriction severely limits its applicability. In this paper, we address three variant hypercube topologies with more flexibility in system sizes, the labelled hypercubes $I_m^r$, $I_M^r$, and $I_A^r$. Incomplete hypercube $I_m^r$ consists of an r-cube and an m-cube complete hypercube; $I_M^r$ is composed of $2^r$ and $\sum_{m \in M} 2^m$ nodes; $I_A^r$ comes from an r-cube complete hypercube which operates in a degraded manner and allows the missing nodes to be arbitrarily distributed. Specifically, we focus on parallel-paths routing algorithms for these three classes of incomplete hypercubes. Parallel paths between any two given nodes are paths that have the same source and destination nodes but different intermediate nodes. Parallel communication is important because it allows the full bandwidth of the multiprocessor to be used for data transfer between any two nodes, and these redundant paths can increase system fault-tolerance and communication reliability. These parallel routing algorithms can serve as a criterion for designing multiprocessor systems.
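The notion of parallel paths can be illustrated on a complete hypercube, where a well-known construction corrects the differing address bits in cyclically rotated orders. The sketch below assumes a complete hypercube with integer node labels; the function name `parallel_paths` is hypothetical, and the incomplete-hypercube algorithms of the abstract, which must cope with missing nodes, are not reproduced here.

```python
from typing import List


def parallel_paths(src: int, dst: int) -> List[List[int]]:
    """Return shortest paths from src to dst that share no intermediate node.

    Each path flips the differing bits in a different cyclic rotation of the
    same dimension order, so the sets of bits flipped so far never coincide.
    """
    diff = src ^ dst
    dims = [d for d in range(diff.bit_length()) if (diff >> d) & 1]
    paths = []
    for shift in range(len(dims)):
        order = dims[shift:] + dims[:shift]  # rotated dimension order
        node, path = src, [src]
        for d in order:
            node ^= 1 << d  # cross the link in dimension d
            path.append(node)
        paths.append(path)
    return paths


for p in parallel_paths(0b000, 0b111):
    print(p)
# [0, 1, 3, 7]
# [0, 2, 6, 7]
# [0, 4, 5, 7]
```

When the two nodes differ in h address bits, this yields h shortest paths; it does not produce the longer detour paths through the remaining dimensions.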
18.
Jau-Der Shih 《Information Processing Letters》2003,88(6):271-278
In this paper, we present a fault-tolerant routing algorithm for torus networks that uses only 4 virtual channels. The proposed algorithm is based on the solid fault model, which includes rectangular faults and many practical nonconvex faults. Previous work needs at least 6 virtual channels to achieve the same fault-tolerant ability.
19.
Geometric routing using virtual locations is an elegant way of solving network routing problems. In its simplest form, greedy routing, a message is simply forwarded to a neighbor that is closer to the destination. One main drawback of this approach is that the coordinates of the virtual locations require Ω(n log n) bits to represent, which makes the scheme infeasible in some applications. The essence of geometric routing is the following: when an origin vertex u wants to send a message to a destination vertex w, it forwards the message to a neighbor t, based solely on the location information of u, w and all neighbors of u. In the greedy routing scheme, the decision is based on decreasing distance. For this idea to work, however, the decision need not be based on decreasing distance. As long as the decision is made locally, the scheme will work fine. In this paper, we introduce a version of greedy routing which we call the generalized greedy routing algorithm. Instead of relying on decreasing distance, a generalized greedy routing algorithm uses other criteria to determine routing paths, based solely on local information. We present simple generalized greedy routing algorithms based on st-coordinates (consisting of two integers between 0 and n−1), which are derived from an st-orientation of a 2-connected plane graph. We also generalize this result to arbitrary trees. Both algorithms are natural and simple to implement.
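The basic distance-based greedy rule described above can be sketched as follows. This is only the classical scheme the abstract starts from, not the generalized st-coordinate algorithm of the paper; the function name `greedy_route` and the dictionary-based graph representation are assumptions made for illustration.

```python
import math
from typing import Dict, Hashable, List, Optional, Sequence

Coords = Dict[Hashable, Sequence[float]]
Graph = Dict[Hashable, List[Hashable]]


def greedy_route(coords: Coords, adj: Graph, src: Hashable, dst: Hashable) -> Optional[List[Hashable]]:
    """Forward to the neighbor closest to dst; give up if no neighbor is closer."""
    path = [src]
    cur = src
    while cur != dst:
        here = math.dist(coords[cur], coords[dst])
        # The decision uses only the positions of cur, dst, and cur's neighbors.
        best = min(adj[cur], key=lambda v: math.dist(coords[v], coords[dst]), default=None)
        if best is None or math.dist(coords[best], coords[dst]) >= here:
            return None  # local minimum: greedy forwarding is stuck
        cur = best
        path.append(cur)
    return path


coords = {"a": (0, 0), "b": (1, 0), "c": (2, 0)}
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
print(greedy_route(coords, adj, "a", "c"))  # ['a', 'b', 'c']
```

Because the distance to the destination strictly decreases at every hop, the loop always terminates. The `None` result marks the case where plain greedy forwarding fails, which depends on how the coordinates were assigned to the vertices.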