首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 0 毫秒
1.
This paper presents a new approach to implement fast barrier synchronization in wormhole k-ary n-cubes. The novelty lies in using multidestination messages instead of the traditional single destination messages. Two different multidestination worm types, gather and broadcasting, are introduced to implement the report and wake-up phases of barrier synchronization, respectively. Algorithms for complete and arbitrary set barrier synchronization are presented using these new worms. It is shown that complete barrier synchronization in a k-ary n-cube system with e-cube routing can be implemented with 2n communication start-ups as compared to 2nlog2k start-ups needed with unicast-based message passing. This leads to an asymptotic improvement by a factor of log2k. Simulation results for different system and architectural parameters indicate that the new framework can reduce barrier synchronization cost considerably compared to the unicast-based scheme. For arbitrary set barrier, an interesting trend is observed where the synchronization cost keeps on reducing beyond a certain number of participating nodes. The framework demonstrates potential for supporting fast barrier synchronization in large wormhole-routed systems.  相似文献   

2.
The evaluation of advanced routing features must be based on both of costs and benefits. To date, adaptive routers have generally been evaluated on the basis of the achieved network throughput (channel utilization), ignoring the effects of implementation complexity. In this paper, we describe a parameterized cost model for router performance, characterized by two numbers: router delay and flow control time. Grounding the cost model in a 0.8 micron gate array technology, we use it to compare a number of proposed routing algorithms. From these design studies, several insights into the implementation complexity of adaptive routers are clear. First, header update and selection is expensive in adaptive routers, suggesting that absolute addressing should be reconsidered. Second, virtual channels are expensive in terms of latency and cycle time, so decisions to include them to support adaptivity or even virtual lanes should not be taken lightly. Third, requirements of larger crossbars and more complex arbitration cause some increase in the complexity of adaptive routers, but the rate of increase is small. Last, the complexity of adaptive routers significantly increases their setup delay and flow control cycle times, implying that claims of performance advantages in channel utilization and low load latency must be carefully balanced against losses in achievable implementation speed  相似文献   

3.
《Performance Evaluation》2006,63(4-5):423-440
Several analytical models of fully adaptive routing in wormhole-routed networks have recently been reported in the literature. All these models, however, have been discussed for routing algorithms with deadlock avoidance. Recent studies have revealed that deadlocks are quite rare in the network, especially when enough routing freedom is provided. Thus, the hardware resources, e.g. virtual channels, dedicated for deadlock avoidance are not utilised most of the time. This consideration has motivated researchers to introduce fully adaptive routing algorithms with deadlock-recovery. This paper proposes a new analytical model to predict message latency in k-ary n-cubes with compressionless routing, a fully adaptive algorithm that uses deadlock-recovery. The proposed model uses results from queueing systems with impatient customers to capture the effects of the timeout mechanism used in this routing algorithm to deal with message deadlock. The validity of the model is demonstrated by comparing results predicted by the analytical model against those obtained through simulation experiments.  相似文献   

4.
《Performance Evaluation》2001,43(2-3):165-179
Many fully adaptive algorithms have been proposed in the literature over the past decade. The performance characteristics of most of these algorithms have been analysed by means of software simulation only. This paper proposes an analytical model to predict message latency in wormhole-routed k-ary n-cubes with fully adaptive routing. The analysis focuses Duato’s fully adaptive routing algorithm [IEEE Trans. Parall. Distrib. Syst. 4 (2) (1993) 320], which is widely accepted as the most general algorithm for achieving adaptivity in wormhole-routed networks while allowing for an efficient router implementation. The proposed model is general in that it exhibits a good degree of accuracy for various network sizes and under different operating conditions.  相似文献   

5.
A spate of deadlock avoidance-based and deadlock recovery-based routing algorithms have been proposed in recent years without full understanding of the likelihood and characteristics of actual deadlocks in interconnection networks. This work models the interrelationships between routing freedom, message blocking, correlated resource dependencies, and deadlock formation. It is empirically shown that increasing routing freedom, as achieved by allowing unrestricted routing over multiple physical and virtual channels, reduces the probability of deadlocks and the likelihood of other types of correlated message blocking that can degrade performance. Moreover, when true fully adaptive routing is used in k-ary n-cube networks with two or more virtual channels (wormhole OF virtual cut-through switched), it is empirically shown that deadlocks are virtually eliminated in networks with n⩾2. These results indicate that deadlocks are very infrequent when the network and routing algorithm inherently provide sufficient routing freedom, thus increasing the viability of deadlock recovery routing strategies  相似文献   

6.
This paper describes a discrete-event simulator designed for the analysis of communication switching techniques adopted in distributed memory multicomputers with asynchronous direct links. The motivation comes from the observation that, even if a wide set of routing strategies have been proposed, an extensive comparison analysis among routing strategies working under different workload characteristics is still lacking in the literature. For these purposes, we have designed and implemented a modular simulator, namely IntNetSim, that solves the models of a wide combination of message passing techniques, routing policies, link conflict resolution strategies, and traffic conditions. For each class of instances, a large variety of simulation runs permits us to select the best performance technique that reduces message transmission time and/or retards saturation of the network under definite workload conditions.  相似文献   

7.
Zhang  Yujie  Fan  Weibei  Han  Zhijie  Song  Yunfei  Wang  Ruchuan 《The Journal of supercomputing》2021,77(11):13090-13114
The Journal of Supercomputing - The 3-ary n-cube network is widely used in large-scale multi-processor parallel computers. It is an important issue to design high-performance communication...  相似文献   

8.
Flow control is considered for M(⩾2) transmitting stations sending packets to a single receiver over a slotted time-multiplexed link. The optimal allocation problem is generalized to the case of nonidentical holding costs at the M transmitters. Qualitative properties of optimal discounted and time-average policies that reduce the computational complexity of the M-dimensional optimal flow control algorithm are derived. For M=2, a simple relationship between optimal allocations for states x and x +ei (i=1,2) that leads to significant computational savings in the optimal algorithm is established  相似文献   

9.
Fault-tolerant systems aim at providing continuous operation in the presence of faults. Multicomputers rely on an interconnection network between processors to support the message-passing mechanism. Therefore, the reliability of the interconnection network is very important for the reliability of the whole system. This paper analyzes the effective redundancy available in a wormhole network by combining connectivity and deadlock freedom. Redundancy is defined at the channel level. We propose a sufficient condition for channel redundancy, also computing the set of redundant channels. The redundancy level of the network is also defined, proposing a theorem that supplies its value. This theory is developed on top of our necessary and sufficient condition for deadlock-free adaptive routing. The new theory also considers the failure of physical channels when virtual channels are used. Finally, we propose a methodology for the design of fault-tolerant routing algorithms, showing its application to n-dimensional meshes  相似文献   

10.
Determining consistent global checkpoints is common to many distributed problems such as fault-tolerance, distributed debugging, properties detection, etc. Uncoordinated and coordinated checkpointing algorithms have been traditionally used for such determinations. This paper addresses a third technique, namely adaptive checkpointing, that has recently emerged. This technique assumes processes take local checkpoints independently and requires them to take additional local checkpoints in order that all local checkpoints be members of some consistent global checkpoint. We first study the characteristics of such adaptive algorithms. Then, a general adaptive checkpointing algorithm is designed from a condition, first stated by Netzer and Xu, that answers the following question: ‘does a given local checkpoint belong to a consistent global checkpoint’' (such a local checkpoint is not useless). The resulting algorithm has the nice property to reduce the number of additional local checkpoints taken to ensure the property ‘no local checkpoint is useless’. Futhermore, it provides each local checkpoint with a consistent global checkpoint to which it belongs. Compared to uncoordinated and coordinated checkpointing algorithms, this algorithm combines the advantages of both without inheriting their drawbacks.  相似文献   

11.
A survey of wormhole routing techniques in direct networks   总被引:10,自引:0,他引:10  
Ni  L.M. McKinley  P.K. 《Computer》1993,26(2):62-76
Several research contributions and commercial ventures related to wormhole routing, a switching technique used in direct networks, are discussed. The properties of direct networks are reviewed, and the operation and characteristics of wormhole routing are discussed in detail. By its nature, wormhole routing is particularly susceptible to deadlock situations, in which two or more packets may block one another indefinitely. Several approaches to deadlock-free. routing, along with a technique that allows multiple virtual channels to share the same physical channel, are described. In addition, several open issues related to wormhole routing are discussed  相似文献   

12.
Communication overhead is the key obstacle to reaching hardware performance limits. The majority is associated with software overhead, a significant portion of which is attributed to message copying. To reduce this copying overhead, we have devised techniques that do not require to copy a received message in order for it to be bound to its final destination. Rather, a late-binding mechanism, which involves address translation and a dedicated cache, facilitates fast access to received messages by the consuming process/thread.We have introduced two policies namely Direct to Cache Transfer (DTCT) and lazy DTCT that determine whether a message after it is bound needs to be transferred into the data cache. We have studied the proposed methods in simulation and have shown their effectiveness in reducing access times to message payloads by the consuming process.  相似文献   

13.
We propose extensions to the message passing interface (MPI) that generalize the MPI communicator concept to allow multiple communication endpoints per process, dynamic creation of endpoints, and the transfer of endpoints between processes. The generalized communicator construct can be used to express a wide range of interesting communication structures, including collective communication operations involving multiple threads per process, communications between dynamically created threads or processes, and object-oriented applications in which communications are directed to specific objects. Furthermore, this enriched functionality can be provided in a manner that preserves backward compatibility with MPI. We describe the proposed extensions, illustrate their use with examples, and describe a prototype implementation in the popular MPI implementation MPICH  相似文献   

14.
15.
Multicast communication services, in which the same message is delivered from a source node to an arbitrary number of destination nodes, are being provided in new-generation multicomputers. Broadcast is a special case of multicast in which a message is delivered to all nodes in the network. The nCUBE-2, a wormhole-routed hypercube multicomputer, provides hardware support for broadcast and a restricted form of multicast in which the destinations form a subcube. However, the broadcast routing algorithm adopted in the nCUBE-2 is not deadlock-free. In this paper, four multicast wormhole routing strategies for 2-D mesh multicomputers are proposed and studied. All of the algorithms are shown to be deadlock-free. These are the first deadlock-free multicast wormhole routing algorithms ever proposed. A simulation study has been conducted that compares the performance of these multicast algorithms under dynamic network traffic conditions in a 2-D mesh. The results indicate that a dual-path routing algorithm offers performance advantages over tree-based, multipath, and fixed-path algorithms  相似文献   

16.
Distributed memory multiprocessor (DMMP) systems have gained much attention because their performance can be easily scaled up by increasing the number of processor-memory modules. The k-ary n-cube is the most popular interconnection network topology currently used in DMMPs. Wormhole routing is one of the most promising switching technology and has been used in many new generation multicomputers. Wormhole routing makes the communication latency insensitive to the network diameter and reduces the size of the channel buffer of each router. The concept of virtual channels and virtual networks are widely invented for deadlock-free design. A fully adaptive wormhole routing method for k-ary n-cubes has been proposed by Linder in 1991 [10]. Unfortunately, the need of 2n − 1 virtual networks makes it unreasonable. In this paper, we propose a virtual network system to support an adaptive, minimal and deadlock free routing in k-ary n-cubes. It uses only four virtual networks but can get a higher degree of adaptability and higher traffic capacity. Simulation results are presented to verify the performance.  相似文献   

17.
The execution of a concurrent computation by a network of processors requires a routing algorithm that is deadlock free. Many routing algorithms proposed for processor networks have the potential of deadlock due to the cyclic topology of the network. In this paper we first formalize the concept of message routing. Next, we show a method by which a deadlock-free routing algorithm can be constructed out of a given routing algorithm. Finally the method is illustrated by constructing deadlock-free routing algorithms for cartesian product processor networks. Peter A.J. Hilbers received the B.S. and M.S. degrees in mathematics, and the Ph.D. degree in computer science from Groningen University, Groningen, The Netherlands, in 1979, 1983, and 1989, respectively. From 1988 to 1989 he was an Assistant Professor with the Department of Computing Science at Groningen University. Currently he is a research engineer in the Department of Mathematics and Systems Engineering at the Koninklijke/Shell-Laboratorium, Amsterdam (Shell Research B.V.). His research intersts are program derivation and correctness, concurrency, and processor networks. Johan J. Lukkien received the B.S. and M.S. degrees in mathematics from Groningen University, Groningen, The Netherlands in 1982 and 1986 respectively. Currently he is a Ph.D. student at the Department of Computing Science, Groningen University. His research area is the construction and verification of concurrent programs.  相似文献   

18.
A new theory of deadlock-free adaptive routing in wormhole networks   总被引:4,自引:0,他引:4  
The theoretical background for the design of deadlock-free adaptive routing algorithms for wormhole networks is developed. The author proposes some basic definitions and two theorems. These create the conditions to verify that an adaptive algorithm is deadlock-free, even when there are cycles in the channel dependency graph. Two design methodologies are also proposed. The first supplies algorithms with a high degree of freedom, without increasing the number of physical channels. The second methodology is intended for the design of fault-tolerant algorithms. Some examples are given to show the application of the methodologies. Simulations show the performance improvement that can be achieved by designing the routing algorithms with the new theory  相似文献   

19.
《Parallel Computing》1997,23(13):1937-1962
A fault-tolerant wormhole routing algorithm on mesh-connected processors is proposed. The proposed algorithm is based on the solid fault model and allows the fault polygons to be overlapped. The algorithm utilizes the position of fault region relative to the current channel to route a message around overlapped fault polygons. A node deactivating algorithm to convert a non-solid fault region into a solid fault region is also proposed. The proposed routing algorithm uses four virtual channels and is deadlock- and livelock-free.  相似文献   

20.
Timeout in a message passing system is studied with emphasis on the notion of predicate transfer. Under certain conditions, a predicate describing the state of the sender can be transferred to the receiver when a timeout occurs. These conditions relate to the way the message passing system is implemented. This relationship is discussed and a technique for formally describing the semantics of message passing with timeout is presented.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号