1.

Cost‐effective queue schemes for reducing head‐of‐line blocking in fat‐trees

J. Escudero‐Sahuquillo P. J. Garcia F. J. Quiles J. Flich J. Duato 《Concurrency and Computation》2011,23(17):2235-2248

The fat‐tree is one of the most common topologies among the interconnection networks of the systems currently used for high‐performance parallel computing. Among other advantages, fat‐trees allow the use of simple but very efficient routing schemes. One of them is a recently proposed deterministic routing algorithm that offers performance similar to (or better than) adaptive routing while reducing complexity and guaranteeing in‐order packet delivery. However, like other deterministic routing proposals, this algorithm cannot react when high traffic loads or hot‐spot traffic scenarios produce severe contention for network resources, leading to Head‐of‐Line (HoL) blocking, which degrades network performance. In this paper, we describe two simple, cost‐effective strategies for dealing with the HoL‐blocking problem that may appear in fat‐trees with the aforementioned deterministic routing algorithm. From the results presented in the paper, we conclude that, in this environment, these proposals considerably reduce HoL‐blocking without significantly increasing switch complexity or the required silicon area. Copyright © 2011 John Wiley & Sons, Ltd.

2.

Beyond Fat-tree: Unidirectional Load-Balanced Multistage Interconnection Network

Gomez Requena Crispin Gilabert Villamon Francisco Gomez Maria Lopez Pedro Duato Jose 《Computer Architecture Letters》2008,7(2):49-52

The fat-tree is one of the topologies most widely used by interconnection network manufacturers. Recently, it has been demonstrated that a deterministic routing algorithm that optimally balances the network traffic can not only achieve almost the same performance as an adaptive routing algorithm but can also outperform it. On the other hand, fat-trees require a high number of switches with a non-negligible wiring complexity. In this paper, we propose replacing the fat-tree by a unidirectional multistage interconnection network (UMIN) that uses a traffic-balancing deterministic routing algorithm. As a consequence, switch hardware is reduced almost by half, which in turn decreases the power consumption, the arbitration complexity, the switch size itself, and the network cost. Preliminary evaluation results show that the UMIN with the load-balancing scheme obtains lower latency than the fat-tree for low and medium traffic loads. Furthermore, in networks with a high number of stages or with high-radix switches, it obtains the same, or even higher, throughput than the fat-tree.

3.

A Comment on "Beyond Fat-tree: Unidirectional Load-Balanced Multistage Interconnection Network"

Antelo E. 《Computer Architecture Letters》2009,8(1):33-34

A recent work proposed to simplify fat-trees with adaptive routing by means of a load-balancing deterministic routing algorithm. The resultant network has performance figures comparable to the more complex adaptive routing fat-trees when packets need to be delivered in order. In a second work by the same authors published in IEEE CAL, they propose to simplify the fat-tree to a unidirectional multistage interconnection network (UMIN), using the same load-balancing deterministic routing algorithm. They show that comparable performance figures are achieved with much lower network complexity. In this comment, we show that the proposed load-balancing deterministic routing is in fact the routing scheme used by the butterfly network. Moreover, we show that the properties of the simplified UMIN network proposed by them are intrinsic to the standard butterfly and other existing UMINs.

4.

Triple-based hierarchical interconnection network and the design of its routing algorithm

乔保军 石峰 计卫星 《Computer Engineering and Design》2007,28(18):4390-4393

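With the aim of lowering node degree, reducing the number of network links, and shortening network diameter, a new interconnection network structure, the triple-based hierarchical interconnection network, is proposed. Its static metrics are studied in depth and compared with those of the 2-D mesh. A distributed deterministic routing algorithm, DDRA, is proposed for this network; it delivers each message along a fixed path between the two nodes. The algorithm takes full advantage of the hierarchical nature of the network, needs no routing tables, is simple to implement, routes efficiently, and is easy to realize in hardware.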

5.

Routing algorithm based on Base-mn-Cube

唐玉华 徐炜遐 《Computer Engineering & Science》1998,20(3):32-35

Base-nm-Cube is a new type of MPP interconnection network, with advantages such as short average distance and ease of implementation.

6.

Routing algorithm for a 16-node NoC based on PRDT

段新明 杨愚鲁 杨梅 《Computer Engineering》2007,33(9):12-14,18

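Network topology plays an important role in the performance and power consumption of network-on-chip (NoC) systems. PRDT(2,1) has a low network diameter and average distance, a constant node degree, and good scalability, which make it well suited to NoC. To improve routing performance in small-scale PRDT networks, this paper proposes a binary routing algorithm. When the network size is no larger than 16, the algorithm achieves deadlock-free routing without virtual channels; by adding a small number of virtual channels, it can be turned into a fully adaptive routing algorithm. Simulations comparing the proposed algorithm with the original vector routing algorithm show that the binary algorithm performs better at lower hardware cost and is fully applicable to small-scale PRDT-based NoCs.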

7.

Routing performance enhancement in hierarchical torus network by link-selection algorithm

《Journal of Parallel and Distributed Computing》2005,65(11):1453-1461

A hierarchical torus network (HTN) is a 2D-torus network of multiple basic modules, in which the basic modules are 3D-torus networks that are hierarchically interconnected for higher-level networks. The static network performance of the HTN and its dynamic communication performance using the popular dimension-order routing algorithm have already been evaluated and shown to be superior to the performance of other conventional and hierarchical interconnection networks. In this paper, we propose a link-selection algorithm for efficient use of physical links of the HTN, while keeping the link-selection algorithm as simple as the dimension-order routing algorithm. We also prove that the proposed algorithm for the HTN is deadlock-free using three virtual channels. We evaluate the dynamic communication performance of an HTN using dimension-order routing and link-selection algorithms under various traffic patterns. We find that the dynamic communication performance of an HTN using the link-selection algorithm is better than when the dimension-order routing algorithm is used.
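As an illustration of the dimension-order baseline mentioned in this abstract, the sketch below routes a packet on a plain torus by fully correcting one dimension before moving to the next. It is not the HTN's hierarchical routing or the paper's link-selection algorithm; the function names, the coordinate-tuple node format, and the tie-breaking rule (prefer the positive direction when both ways are equally long) are assumptions made for the example.

```python
def torus_step(cur, dst, size):
    """Return -1, 0 or +1: the direction of the shorter way around a ring of `size` nodes."""
    if cur == dst:
        return 0
    forward = (dst - cur) % size          # hops needed when moving in the + direction
    # Assumed tie-break: go in the + direction when both ways are equally long.
    return 1 if forward <= size - forward else -1


def dimension_order_route(src, dst, dims):
    """List the nodes visited from src to dst, correcting dimension 0 first, then 1, and so on."""
    path = [tuple(src)]
    cur = list(src)
    for d, size in enumerate(dims):
        while cur[d] != dst[d]:
            cur[d] = (cur[d] + torus_step(cur[d], dst[d], size)) % size
            path.append(tuple(cur))
    return path


# On a 4x4 torus, the wrap-around link makes 0 -> 3 a single hop in dimension 0.
print(dimension_order_route((0, 0), (3, 1), dims=(4, 4)))
# prints [(0, 0), (3, 0), (3, 1)]
```

Because the order of dimensions is fixed, every source-destination pair always uses the same path, which is what makes the scheme deterministic.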

8.

HDRA: a distributed routing algorithm based on historical routing information

乔保军 毋琳 计卫星 《Computer Applications and Software》2006,23(6):98-99,137

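Direct interconnection networks have become the mainstream network architecture for building large-scale parallel systems, and the routing algorithm plays an important role in the communication performance of the interconnection network and in the performance of the parallel system. For static interconnection networks, a new distributed routing algorithm based on routing-table lookup, HDRA, is proposed. The algorithm makes effective use of historical routing information to speed up route finding and improve network transmission performance; it is also simple in design and easy to implement in hardware.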

9.

An integrated solution for QoS provision and congestion management in high-performance interconnection networks using deterministic source-based routing

Juan A. Villar Pedro J. García Francisco J. Alfaro José L. Sánchez Francisco J. Quiles 《The Journal of supercomputing》2013,66(1):284-304

A key element in any system based on several interconnected computing and/or storage nodes is the interconnection network. Currently, one of the main concerns of high-speed interconnection network designers is how to improve network performance while using the minimum number of network resources. In this paper, we describe an efficient switch architecture suitable for any interconnect technology implementing deterministic source-based routing. This switch architecture uses the same network resources to provide two features that improve network performance: congestion management and QoS support. We also present results comparing the effectiveness of this architecture with that of other proposals typically used to provide these features in this context. These results have been obtained for synthetic traffic and for traces from parallel benchmarks and video frames. From the results, we can conclude that in any traffic scenario, our proposal is as effective as the previous ones, while requiring fewer resources and thus being much more cost-effective.

10.

Explanation of Performance Degradation in Turn Model

Slavko Gajin Zoran Jovanović 《The Journal of supercomputing》2006,37(3):271-295

The Turn model routing algorithms for mesh interconnection networks achieve partial adaptivity without any virtual channels. However, the routing performance measured by simulations is worse than with the simple deterministic routing algorithm. Authors have explained these results simply by uneven dynamic load through the network. However, this phenomenon has not been studied further. This paper investigates performance degradation with the Turn model and the drawbacks of partially adaptive routing in comparison with deterministic routing, and it introduces some new concepts. Our simulations deal with individual channels, and results are presented by 3D graphs rather than by the commonly used averages. An additional parameter, channel occupation, which is consistent with the queuing theory commonly used in many proposed analytical models, is introduced. We also propose a new structure, the Channel Directions Dependency Graph (CDDG). It provides a new approach to analysis, helps in understanding dynamic routing behaviour, and can be generalized to other routing algorithms.
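To make the contrast between partially adaptive and deterministic routing concrete, the sketch below compares west-first routing, one well-known instance of the Turn model, with deterministic XY routing on a 2D mesh. It only lists the minimal output directions each scheme permits at a node; it is not the simulation setup or the CDDG from the paper. The function names and the convention that y grows towards "N" are assumptions made for the example.

```python
def west_first_options(cur, dst):
    """Minimal output directions allowed by west-first routing at node `cur`."""
    (cx, cy), (dx, dy) = cur, dst
    if dx < cx:
        # Turns into the west direction are prohibited, so all westward hops come first.
        return ["W"]
    options = []
    if dx > cx:
        options.append("E")
    if dy > cy:
        options.append("N")   # assumed convention: larger y is north
    if dy < cy:
        options.append("S")
    return options


def xy_options(cur, dst):
    """Deterministic XY routing: finish the x dimension, then the y dimension."""
    (cx, cy), (dx, dy) = cur, dst
    if dx != cx:
        return ["E" if dx > cx else "W"]
    if dy != cy:
        return ["N" if dy > cy else "S"]
    return []


# Destination to the north-east: west-first may choose, XY may not.
print(west_first_options((1, 1), (3, 3)))   # prints ['E', 'N']
print(xy_options((1, 1), (3, 3)))           # prints ['E']
# Destination to the west: west-first has no choice at all.
print(west_first_options((2, 2), (0, 3)))   # prints ['W']
```

The asymmetry visible here (choices for eastbound traffic, none for westbound traffic) is one way to see how a partially adaptive scheme can load channels unevenly.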

11.

Deadlock-free routing algorithms in the E-2D torus network structure

顾华玺 邱智亮 涂小行 刘亚社 《Journal of Chinese Computer Systems》2005,26(7):1140-1144

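This paper studies the E-2D torus network, a new topology for the core switching network of terabit routers. The network is simple, symmetric, and scalable. Two routing algorithms suited to this topology are proposed: NPN (No Positive to Negative) and IDO (Improved Dimension Order). Both the partially adaptive NPN and the deterministic IDO are deadlock-free, livelock-free, minimal (shortest-path) routing algorithms, and proofs of deadlock and livelock freedom are given. Finally, the routing algorithms are simulated on an 8×8 E-2D torus network; the results show that the E-2D torus is a promising network topology and that both routing algorithms perform well.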

12.

A performance model for analysis of heterogeneous multi-cluster systems (cited by: 1; self-citations: 0; citations by others: 1)

Bahman Javadi Mohammad K. Akbari Jemal H. Abawajy 《Parallel Computing》2006,32(11-12):831

This paper addresses the problem of performance modeling for large-scale heterogeneous distributed systems with emphasis on multi-cluster computing systems. Since the overall performance of distributed systems often depends on the effectiveness of their communication network, the study of the interconnection networks for these systems is very important. Performance modeling is required to avoid poorly chosen components and architectures, as well as to avoid discovering a serious shortfall during system testing just prior to deployment. However, the multiplicity of components and the associated complexity make performance analysis of distributed computing systems a challenging task. To this end, we present an analytical performance model for the interconnection networks of heterogeneous multi-cluster systems. The analysis is based on a parametric family of fat-trees, the m-port n-tree, and a deterministic routing algorithm, which is proposed in this paper. The model is validated through comprehensive simulation, which demonstrates that the proposed model exhibits a good degree of accuracy for various system organizations and under different working conditions.

13.

A Reliable and Fault-Tolerant Interconnection Network (cited by: 2; self-citations: 0; citations by others: 2)

Deng Yaping Chen Tinghuai 《Journal of Computer Science and Technology》1990,5(2):117-126

An interconnection network with multistage redundant paths is introduced for use in high-performance multiprocessor systems. The routing algorithm of the proposed network is simple and dynamically reroutable. Analyses of the fault tolerance and performance of the network are given. It is shown that the probability of acceptance and the performance-to-cost ratio of the network are better than those of the F and Gamma networks. Another advantage of the proposed network is its smaller number of interstage links compared with the F network.

14.

Analysis of wormhole routing in duplex k-ary n-mesh (cited by: 4; self-citations: 1; citations by others: 3)

肖晓强 胡华平 金士尧 《Chinese Journal of Computers》2000,23(1):83-89

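The interconnection networks of modern multiprocessor systems mostly use wormhole-routing flow control. For duplex k-ary n-mesh computer interconnection networks under wormhole flow control and a deterministic routing algorithm, this paper uses a backward-derivation method to build an analytical model of the average message latency, and also builds a simulation model. The theoretical analysis agrees closely with the simulation results, indicating that the analytical model is reasonably accurate.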

15.

A novel 3D NoC architecture based on De Bruijn graph (cited by: 1; self-citations: 0; citations by others: 1)

Yiou Chen Jianhao Hu Xiang Ling Tingting Huang 《Computers & Electrical Engineering》2012,38(3):801-810

Networks on Chip (NoC) and 3-Dimensional Integrated Circuits (3D IC) have been proposed as solutions to the ever-growing communication problem in System on Chip (SoC). Most contemporary 3D architectures are based on the Mesh topology, which fails to achieve small latency and low power consumption because of its inherently large network diameter. Moreover, conventional XY routing lacks fault tolerance. In this paper, we propose a new 3D NoC architecture, which adopts the De Bruijn graph as the topology in the physical horizontal planes, leveraging its advantages of small latency, simple routing, low power, and great scalability. We employ an enhanced pillar structure for vertical interconnection. We design two shifting-based routing algorithms to meet separate performance requirements in latency and computing complexity. We also use fault-tolerant routing to guarantee reliable data transmission. Our simulation results show that the proposed 3D NoC architecture achieves better network performance and power efficiency than the 3D Mesh and XNoTs topologies.

16.

Routing Properties of a Recursive Interconnection Network

《Journal of Parallel and Distributed Computing》2001,61(6):838-849

In this paper, we consider a highly recursive interconnection network known as the fully connected cubic network (FCCN). By exploiting its recursive properties, we thoroughly analyze the performance of a simple routing algorithm for the FCCN. We show that at least 80% of the routes obtained from this simple algorithm are shortest paths, and this percentage increases further with the network size. Subsequently, we obtain the network diameter and average internodal distance, taking into account the communication locality that is exhibited in many parallel computations. The presence of the communication locality significantly reduces the average internodal distance.

17.

VTFTR: a deadlock-free fault-tolerant routing algorithm for high-dimensional fat-trees

刘博阳 胡舒凯 施得君 卢宏生 《Computer Engineering》2022,48(12):38

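With the sharp growth in the scale of high-performance computing systems in recent years, the reliability of high-performance interconnection networks has become an increasingly important issue. The high-dimensional fat-tree is a network topology that combines the advantages of the fat-tree and the multi-dimensional torus, and its good scalability and network performance give it broad application prospects in the exascale era. However, research on fault-tolerant routing algorithms for high-dimensional fat-trees is still limited, and the reliability problem urgently needs to be solved. To improve the fault tolerance of the high-dimensional fat-tree topology in high-performance interconnection networks, and to further improve the operating efficiency of the corresponding supercomputing systems, a fault-tolerant routing algorithm, VTFTR, is proposed for leaf-switch faults in high-dimensional fat-trees. Combining the turn model with virtual-channel switching, the algorithm strictly controls the turns that packets make on fault-free and fault-tolerant paths, and achieves deadlock-free fault tolerance in high-dimensional fat-trees using a small number of fault-tolerant virtual channels and extra hops. Experimental results show that, with a single fault, the fault-tolerant paths of VTFTR are 2 to 4 hops shorter than those of the compared algorithm. In a network of 4096 nodes with 10 faulty leaf switches, under different distributions of the faulty leaf switches, the algorithm keeps all fault-free nodes in the network connected at the cost of a 1.4% to 2.0% drop in throughput.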

18.

XOR-based HoL-blocking reduction routing mechanisms for direct networks

《Parallel Computing》2017

Routing is a key design parameter in the interconnection network of large parallel computers. Routing algorithms are classified into two different categories depending on the number of routing options available for each source–destination pair: deterministic (there is one path available) and adaptive (there are several). Adaptive routing has two opposing effects on network performance. On one hand, it provides routing flexibility that may help avoid a congested network area, thus improving network performance. On the other hand, it may also increase the Head-of-Line blocking effect because more destination nodes share the port queues. Usually, adaptive routing uses virtual channels to provide routing flexibility and to guarantee deadlock freedom. Deterministic routing is simpler, which implies lower routing delay, and it introduces less Head-of-Line blocking. In this paper, we propose an adaptive, HoL-blocking-reducing routing algorithm for direct topologies that tries to combine the good properties of both worlds: it provides routing flexibility but also reduces the Head-of-Line blocking effect. To do that, this paper proposes several functions which use the XOR operation to efficiently distribute the packets among virtual channels based on their destination node. The resulting routing mechanisms have different properties depending on whether they favour routing flexibility or Head-of-Line blocking reduction.
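The abstract does not give the XOR functions themselves, so the following is only a hypothetical sketch of the general idea: pick a virtual channel from the destination identifier alone, by XOR-folding its bits, so that packets for the same destination always share a queue while different destinations are spread across channels. The function name `xor_fold_vc` and the power-of-two restriction on the number of virtual channels are assumptions made for the example, not the paper's definitions.

```python
from collections import Counter


def xor_fold_vc(dest, num_vcs):
    """Hypothetical mapping: XOR together the log2(num_vcs)-bit groups of `dest`."""
    if dest < 0:
        raise ValueError("dest must be non-negative")
    if num_vcs <= 0 or num_vcs & (num_vcs - 1):
        raise ValueError("num_vcs must be a power of two")
    mask = num_vcs - 1
    width = mask.bit_length()     # number of bits needed to name a virtual channel
    if width == 0:
        return 0                  # a single virtual channel: nothing to choose
    vc = 0
    while dest:
        vc ^= dest & mask         # fold the lowest group into the result
        dest >>= width            # move on to the next group of bits
    return vc


# 16 destinations over 4 virtual channels: every channel serves exactly 4 of them.
print(Counter(xor_fold_vc(d, 4) for d in range(16)))
```

Since the channel depends only on the destination, a packet waiting behind another one in the same queue is more likely to be heading to the same place, which is the intuition behind using such a mapping against Head-of-Line blocking.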

19.

An efficient routing strategy for a new large-capacity router switching network

杨君刚 邱智亮 刘增基 李红卫 《Computer Science》2007,34(3):35-37

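Routing algorithms have a great impact on the performance of switching networks. Based on the characteristics of the XD (Cross-Direct) network, a new switching-network topology for large-capacity routers, this paper proposes a class of routing algorithms, the diagonal vector mapping (DVM) method, which achieves the same network performance as the simple dimension-order-based routing algorithms used in traditional direct networks. The class comprises two algorithms; the paper describes both in detail, analyzes their performance, and gives the application scenarios of each.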

20.

LCFAA: a low-cost fully adaptive routing algorithm

刘燕 孙利民 YANG Xiao-Dong 《Journal of Computer Research and Development》1999,36(3):331-336

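In massively parallel processor (MPP) systems, the routing algorithm plays an important role in the communication performance of the interconnection network and in system performance. Adaptive routing algorithms offer good flexibility, high channel utilization, and strong fault tolerance, but they are difficult to implement and have therefore been realized in only a few MPP systems so far. This paper proposes LCFAA, a low-cost, deadlock-free, fully adaptive minimal wormhole routing algorithm for the mesh topology. The algorithm needs few virtual channels and is low-cost and highly adaptive. The paper proves that the algorithm is deadlock-free, livelock-free, and fully adaptive, and verifies by simulation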