1.

欧阳一鸣 胡春雷 梁华国 谢涛 《计算机工程》2012,38(13):237-239,243

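To address communication between faulty routers and IP cores in a network-on-chip, a dual-port resource network interface with low hardware overhead is designed. Building on the conventional 2D-mesh structure, a few links are added so that each IP core connects to two routers, and a corresponding fault-tolerant routing algorithm is designed for this architecture. Experimental results show that the scheme has low hardware overhead and strong fault tolerance.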
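
The dual-port idea in entry 1 can be illustrated with a minimal Python sketch. The abstract does not say which second router each IP core attaches to, so the east-neighbour rule and the names `attached_routers` and `select_router` below are hypothetical, chosen only to make the failover visible.

```python
# Hypothetical sketch: an IP core with two network-interface ports falls back
# to its second router when the first one is faulty. The choice of the second
# router is an assumption, not taken from the paper.

def attached_routers(core, width):
    """Return the (primary, secondary) routers for the IP core at `core`.

    Assumption: the primary is the core's own router; the secondary is the
    east neighbour, or the west neighbour for cores in the last column.
    Requires width >= 2.
    """
    x, y = core
    secondary_x = x + 1 if x + 1 < width else x - 1
    return (x, y), (secondary_x, y)


def select_router(core, width, faulty):
    """Return the first non-faulty attached router, or None if both failed."""
    for router in attached_routers(core, width):
        if router not in faulty:
            return router
    return None
```

With a 4-column mesh, `select_router((1, 1), 4, {(1, 1)})` falls back to the secondary router, and the core is cut off only when both attached routers are faulty.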

2.

Deadlock free routing algorithms for irregular mesh topology NoC systems with rectangular regions

Rickard Maurizio Shashi 《Journal of Systems Architecture》2008,54(3-4):427-440

The simplicity of the regular mesh topology Network on Chip (NoC) architecture leads to reductions in design time and manufacturing cost. A weakness of the regular shaped architecture is its inability to efficiently support cores of different sizes. One approach proposed in the literature is the region concept, which helps to accommodate cores larger than the tile size in mesh topology NoC architectures. The region concept offers many new opportunities for NoC design, but also raises new design issues and challenges. One of the most important among these is the design of an efficient deadlock free routing algorithm. Available adaptive routing algorithms developed for regular mesh topology cannot ensure freedom from deadlocks. In this paper, we list and discuss many new design issues which need to be handled for designing NoC systems incorporating cores larger than the tile size. We also present and compare two deadlock free routing algorithms for mesh topology NoC with regions. The idea of the first algorithm is borrowed from the area of fault tolerant networks, where a network topology is rendered irregular due to faults in routers or links, and is adapted for the new context. We compare this with an algorithm designed using a methodology for design of application specific routing algorithms for communication networks. The application specific routing algorithm tries to maximize adaptivity by using static and dynamic communication requirements of the application. Our study shows that the application specific routing algorithm not only provides much higher adaptivity, but also superior performance as compared to the other algorithm in all traffic cases. However, this higher performance of the second algorithm comes at a higher area cost for implementing network routers.

3.

A fault tolerant NoC architecture using quad-spare mesh topology and dynamic reconfiguration

《Journal of Systems Architecture》2013,59(7):482-491

Network-on-Chip (NoC) is widely used as a communication scheme in modern many-core systems. To guarantee the reliability of communication, effective fault tolerant techniques are critical for an NoC. In this paper, a novel fault tolerant architecture employing redundant routers is proposed to maintain the functionality of a network in the presence of failures. This architecture consists of a mesh of 2 × 2 router blocks with a spare router placed in the center of each block. This spare router provides a viable alternative when a router fails in a block. The proposed fault-tolerant architecture is therefore referred to as a quad-spare mesh. The quad-spare mesh can be dynamically reconfigured by changing control signals without altering the underlying topology. This dynamic reconfiguration and its corresponding routing algorithm are demonstrated in detail. Since the topology after reconfiguration is consistent with the original error-free 2D mesh, the proposed design is transparent to operating systems and application software. Experimental results show that the proposed design achieves significant improvements in reliability compared with those reported in the literature. Comparing the error-free system with a single router failure case, the throughput decreases by only 5.19% and latency increases by 2.40%, with about 45.9% hardware redundancy.
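
As a rough illustration of the quad-spare idea in entry 3, the sketch below maps each failed router to the spare at the center of its 2 × 2 block. It assumes that one spare can stand in for at most one failed router per block, which the abstract implies but does not state outright; the function name `remap_failures` and the block-index encoding are hypothetical.

```python
# Hypothetical sketch of quad-spare remapping: routers are grouped into
# 2 x 2 blocks, and each block owns one spare router at its center.
# Assumption: a spare can replace at most one failed router in its block.

def remap_failures(failed_routers):
    """Map each failed router (x, y) to the index (bx, by) of its block's spare.

    Raises ValueError if two failures fall in the same block, since the
    single spare of that block cannot replace both.
    """
    used_blocks = set()
    mapping = {}
    for x, y in sorted(failed_routers):
        block = (x // 2, y // 2)
        if block in used_blocks:
            raise ValueError(f"block {block} has more than one failed router")
        used_blocks.add(block)
        mapping[(x, y)] = block
    return mapping
```

Because the spare takes over the failed router's logical position, the mesh seen by software stays a regular 2D mesh, which is the transparency property the abstract describes.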

4.

Experimental evaluation and comparison of two recent Network-on-Chip routers for FPGAs

《Microprocessors and Microsystems》2017

Rapid growth in the number of Intellectual Property (IP) cores in System-on-Chip (SoC) designs has created the need for an effective and scalable interconnect scheme for system components: the Network-on-Chip (NoC). The router is a key component in an NoC design and affects the overall area utilization, so it is crucial to evaluate the area efficiency of NoC routers. In this paper, we evaluate and compare two recent NoC routers for Field-Programmable Gate Arrays (FPGAs). The first one is generated using the automated NoC synthesis tool CONfigurable NEtwork Creation Tool (CONNECT). The second one is an NoC router manually designed in VHDL and synthesized with the Altera Quartus II CAD tool. Three NoC topologies, namely ring, mesh and torus, are used for evaluating the two routers based on the area utilization metric. The routers are evaluated by varying the node sizes from 4 to 16 for each topology. For smaller NoC topologies, the CONNECT router uses less area, but as the NoC size increases the manual router design provides up to 85% reduction in area utilization. The results presented in this paper will be useful to designers interested in NoC implementation on FPGAs.

5.

Application-Specific Network-on-Chip synthesis with flexible router Placement

《Journal of Systems Architecture》2013,59(7):361-371

Network-on-Chip (NoC) has been proposed as a possible solution to the communication problem in nanoscale System-on-Chip (SoC) design. NoC architectures with optimized application-specific topologies have been found to be superior to the regular architectures in designing Multi-Processor System-on-Chip (MPSoC) solutions. The application specific NoC design problem takes as input the chip floorplan, library of NoC components, and communication requirements between the tasks of the application. It outputs the positions of the routers in the floorplan such that all communication requirements of the application are satisfied. This paper presents an Integer Linear Programming formulation of the problem, followed by a heuristic technique based on Particle Swarm Optimization (PSO) for finding the router positions from the set of available positions within the chip floorplan. The goal is to minimize the communication cost between cores, satisfying both the link length and router port constraints. The results have been shown on realistic benchmarks. Comparisons have been carried out with regular mesh and custom architectures having routers positioned at (i) the corners of the cores, (ii) the centers of the cores, and (iii) the intersections of the cores. Significant reductions in communication cost have been observed over all the cases. For smaller benchmarks, the optimum results obtained via ILP match exactly those reported by the PSO. Many of the existing router placement policies fail even for these small benchmarks when restrictions are imposed on permissible link length. This establishes the merit of the PSO formulation. Link and router energy consumption of the synthesized NoC have been compared with regular mesh based architectures. The results show significant reduction in communication cost, area overhead, link energy and router energy in the synthesized NoC over regular mesh topology as well.

6.

Tolerating transient illegal turn faults in NoCs

《Microprocessors and Microsystems》2016

Network-on-Chip (NoC) is becoming a competitive solution to connect hundreds of processing elements in modern computing platforms. Under the trend of shrinking feature sizes, circuits are likely to suffer from faults which lead to degraded performance and erroneous behaviour. Compared to permanent faults, transient faults happen even more frequently and seriously while they are hidden within complex on-chip behaviours. One serious consequence of transient faults is that packets take illegal turns after the control logic in on-chip routers is damaged, which may lead to deadlock and eventually crash the entire system. To avoid this situation, in this paper, we propose a comprehensive scheme called ODT including an improved router architecture, an illegal-turn-resilient routing algorithm, online fault-detect units and a fault classification method. By applying ODT, more turns are supported at the routing level and deadlock situations can be significantly reduced. Experimental results indicate up to a 22% increase in the number of surviving packets in the network when 4% of routing computation units fail. The extra area overhead and power consumption of the ODT method are around 9.22% and 9.63%, respectively.

7.

Design of an NoC Simulator Based on the 2D Mesh Topology

乐建亮《现代计算机》2010,(3):139-144

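Designing a network-on-chip simulator involves many aspects, including the NoC topology, router architecture, routing algorithm, and performance analysis. From the perspective of NoC simulator design, this paper studies and discusses the topology, router architecture, and packet format adopted by the simulator, describes the topology simulation, IP core simulation, and routing simulation, and implements an NoC simulator system in the object-oriented language C++.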
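
A routing model is one of the pieces a 2D mesh simulator such as the one in entry 7 needs. The abstract does not name the routing algorithm it simulates, so the sketch below uses XY dimension-order routing, a common baseline for 2D meshes, purely as an assumed example. The function name `xy_route` is hypothetical, and the sketch is in Python rather than the C++ used by the paper.

```python
# Sketch of XY dimension-order routing on a 2D mesh (an assumed baseline,
# not necessarily the algorithm used by the simulator in the paper).

def xy_route(src, dst):
    """Return the list of routers visited from `src` to `dst`, inclusive.

    The packet first travels along the X dimension until its column matches
    the destination, then along the Y dimension.
    """
    x, y = src
    path = [src]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path
```

For example, `xy_route((0, 0), (2, 1))` visits `(0, 0)`, `(1, 0)`, `(2, 0)`, `(2, 1)`; the hop count always equals the Manhattan distance between source and destination.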

8.

Making-a-stop: A new bufferless routing algorithm for on-chip network (cited by: 1; self-citations: 0; citations by others: 1)

Jing Lin Xiaola Lin Liang Tang 《Journal of Parallel and Distributed Computing》2012

In the deep submicron regime, the power and area consumed by router buffers in network-on-chip (NoC) have become a primary concern. By eliminating buffers, bufferless routing is emerging as a promising solution to provide power and area efficiency for NoC. In this paper, we present a new bufferless routing algorithm that can be coupled with any topology. The proposed routing algorithm is based on the concept of making-a-stop (MaS), aiming at deadlock and livelock freedom in wormhole-switched NoC. Performance evaluation is carried out by using a flit-level, cycle-accurate network simulator under synthetic traffic scenarios. Simulation results indicate that the proposed routing algorithm yields an improvement over the recent bufferless routing algorithm in average latency, power consumption, and area overhead by up to 10%, 9%, and 80%, respectively.

9.

A Reconfigurable Fault-Tolerant Routing Algorithm for 2D Mesh Networks-on-Chip

石泽文 曾晓洋 虞志益 《小型微型计算机系统》2012,33(1):178-182

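The reconfigurable fault-tolerant routing algorithm for 2D mesh networks-on-chip can reconfigure network nodes when the topology of certain chip regions becomes irregular because of manufacturing defects, aging, or similar causes, thereby ensuring normal communication among healthy nodes. Simulation on a SystemC-based platform shows that the algorithm achieves better network performance than conventional algorithms. The algorithm is deadlock-free, and its reconfiguration mechanism is also described in detail. It also scales well: as the system grows, the hardware overhead of each router remains constant while its fault tolerance is enhanced.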

10.

An Adaptive Unicast Routing Algorithm for Partially Connected 3D NoCs

孙美东 刘勤让 刘冬培 燕昺昊 《计算机应用》2018,38(5):1470-1475

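To address network congestion caused by the through-silicon via (TSV) table storing only TSV address information in partially connected three-dimensional network-on-chip (3D NoC) architectures, a record table structure is proposed. The table stores not only the addresses of the four TSVs nearest to the router but also the occupancy and fault information of the corresponding routers' input buffers. On this basis, an adaptive unicast routing algorithm with the shortest transmission path is proposed. First, the coordinates of the current node and the destination node are computed to determine how the packet is to be transmitted; second, the transmission path is checked for faults while the port buffer occupancy is obtained; finally, the best output port is determined and the packet is forwarded to the neighboring router. Experimental results at two network sizes show that, compared with the Elevator-First algorithm, the proposed algorithm has a clear advantage in average latency and throughput, and that at a network fault rate of 50% the packet loss rates under the Random and Shuffle traffic models are 25.5% and 29.5%, respectively.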
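
The record table in entry 10 can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's algorithm: the class `TsvEntry`, the function `choose_tsv`, and the cost rule (fewest in-layer hops first, then lowest buffer occupancy) are all hypothetical.

```python
# Hypothetical sketch of TSV selection from a record table that stores, for
# each nearby TSV, its address, buffer occupancy, and fault status.
from dataclasses import dataclass


@dataclass(frozen=True)
class TsvEntry:
    x: int
    y: int
    occupancy: int  # occupied input-buffer slots at the TSV's router
    faulty: bool


def choose_tsv(current, dest, table):
    """Pick the TSV a packet should use to leave the current layer.

    `current` and `dest` are (x, y, z) coordinates. Returns None when both
    nodes are on the same layer (no vertical hop needed). Assumed cost rule:
    minimize in-layer hops via the TSV, breaking ties by lower occupancy.
    """
    cx, cy, cz = current
    dx, dy, dz = dest
    if cz == dz:
        return None
    healthy = [t for t in table if not t.faulty]
    if not healthy:
        raise LookupError("no healthy TSV in the record table")

    def cost(t):
        hops = abs(cx - t.x) + abs(cy - t.y) + abs(t.x - dx) + abs(t.y - dy)
        return (hops, t.occupancy)

    return min(healthy, key=cost)
```

Skipping faulty entries and breaking ties on occupancy corresponds to the second and third steps in the abstract, where the path is checked for faults and buffer occupancy informs the final port choice.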

11.

A High-Throughput Distributed Shared-Buffer NoC Router (cited by: 1; self-citations: 0; citations by others: 1)

Soteriou V. Ramanujam R.S. Lin B. Li-Shiuan Peh 《Computer Architecture Letters》2009,8(1):21-24

Microarchitectural configurations of buffers in routers have a significant impact on the overall performance of an on-chip network (NoC). This buffering can be at the inputs or the outputs of a router, corresponding to an input-buffered router (IBR) or an output-buffered router (OBR). OBRs are attractive because they have higher throughput and lower queuing delays under high loads than IBRs. However, a direct implementation of OBRs requires a router speedup equal to the number of ports, making such a design prohibitive given the aggressive clocking and power budgets of most NoC applications. In this letter, we propose a new router design that aims to emulate an OBR practically based on a distributed shared-buffer (DSB) router architecture. We introduce innovations to address the unique constraints of NoCs, including efficient pipelining and novel flow control. Our DSB design can achieve significantly higher bandwidth at saturation, with an improvement of up to 20% when compared to a state-of-the-art pipelined IBR with the same amount of buffering, and our proposed microarchitecture can achieve up to 94% of the ideal saturation throughput.

12.

CbRouter: A Bidirectional-Link Network-on-Chip Router Using Crossbar Bypass

方磊 董德尊 吴际 夏军 王克非 《计算机工程与科学》2015,37(2):199-206

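On-chip interconnection networks provide efficient communication support for multi-core architectures. Current networks-on-chip usually use unidirectional links, whose resource utilization is low. To allocate, and in turn use, link bandwidth efficiently, a new bidirectional-link scheduling algorithm is proposed and a bidirectional-link router supporting it is designed. This new router architecture provides a bypass datapath for fast data transfer without affecting the router's original datapath. Experimental results show that applying this bidirectional-link router improves the saturation throughput and average link utilization of a mesh network by up to 83.3% and 24.53%, respectively.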

13.

Building Scalable Routers with Direct Networks

乐祖晖 赵有健 吴建平 《软件学报》2007,18(10):2538-2550

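The rapid growth of the Internet shows up directly as rapid growth in user traffic, which requires routers to provide greater capacity. A traditional router consists of line cards and a centralized switching fabric. A centralized switching fabric supports only a limited number of ports, and as the port count grows the scheduling algorithm becomes increasingly complex, so the switching fabric is becoming the performance bottleneck of the whole router. A centralized switching fabric is also the router's single point of failure and cannot provide satisfactory fault tolerance. Direct networks offer good scalability and fault tolerance. Among them, the 3-D Torus topology has been successfully applied to scalable router design. In practice, however, the 3-D Torus is constrained by bisection bandwidth, which limits how far it can scale. This paper introduces a new direct network structure called the honeycomb structure. Simple modifications are made to the honeycomb structure, and the modified topology exhibits good topological properties. Based on this structure, two classes of shortest-path routing algorithms are proposed. Of these, the load-balanced shortest-path routing algorithm makes good use of the path diversity of direct networks and shows low packet delay and high throughput for both uniform random and Tornado traffic. The influence of queue length and of the single-node scheduling algorithm on the routing algorithms is also discussed. The honeycomb structure offers a new option for scalable router design.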

14.

LDBR: Low-deflection bufferless router for cost-sensitive network-on-chip design

《Microprocessors and Microsystems》2014,38(7):669-680

In network-on-chip (NoC) designs, the bufferless router is more energy-efficient than the conventional router with buffers. However, in the bufferless network, deflections cause great performance loss. In this paper, three deflection models are first constructed for analyzing the causes of deflections. Then, we propose a low-deflection bufferless router (LDBR), in which a multi-channel network interface and a novel deflection routing scheme based on the turn model are designed to reduce deflections during packet transmission. Finally, LDBR is evaluated against the latest bufferless routers using synthetic and real-world traffic patterns. The experimental results show that the deflection rate of the LDBR network is reduced by 41% compared to other bufferless networks, and LDBR also shows superiority in cost and power consumption across all workloads.

15.

An Efficient Network-on-Chip Router for Dataflow Architecture

Xiao-Wei Shen Xiao-Chun Ye Xu Tan Da Wang Lunkai Zhang Wen-Ming Li Zhi-Min Zhang Dong-Rui Fan Ning-Hui Sun 《计算机科学技术学报》2017,32(1):11-25

Dataflow architecture has shown its advantages in many high-performance computing cases. In dataflow computing, a large amount of data is frequently transferred among processing elements through the network-on-chip (NoC). Thus the router design has a significant impact on the performance of dataflow architecture. Common routers are designed for control-flow multi-core architecture and we find they are not suitable for dataflow architecture. In this work, we analyze and extract the features of data transfers in NoCs of dataflow architecture: multiple destinations, high injection rate, and performance sensitive to delay. Based on the three features, we propose a novel and efficient NoC router for dataflow architecture. The proposed router supports multi-destination; thus it can transfer data with multiple destinations in a single transfer. Moreover, the router adopts output buffer to maximize throughput and adopts non-flit packets to minimize transfer delay. Experimental results show that the proposed router can improve the performance of dataflow architecture by 3.6x over a state-of-the-art router.

16.

On the design of hypermesh interconnection networks for multicomputers

《Journal of Systems Architecture》2000,46(9):779-792

Topology, routing algorithm, and router structure are among the most important factors that greatly influence network performance. This paper assesses the interaction of these elements on two related but distinct types of multicomputer networks, the binary n-cube (or cube) and the hypermesh. The analysis will show that the topological properties of the hypermesh confer an important advantage over the cube that makes the former a promising option for use in high-performance multicomputers. The hypermesh can use simple routing algorithms and cheap routers with little performance penalty. The cube, on the other hand, is constrained to the use of a specific routing algorithm and complex routers to take advantage of its rich connectivity.
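
For readers unfamiliar with routing on the binary n-cube discussed in entry 16, the sketch below shows dimension-order (e-cube) routing, a standard textbook scheme for hypercubes. The abstract does not name the specific routing algorithm it has in mind, so this is only an assumed illustration; the function name `ecube_route` is hypothetical.

```python
# Sketch of dimension-order (e-cube) routing on a binary n-cube. Nodes are
# integers in [0, 2**n); two nodes are neighbours when their labels differ
# in exactly one bit. Assumed example, not taken from the paper.

def ecube_route(src, dst, n):
    """Return the nodes visited from `src` to `dst`, correcting differing
    address bits from the lowest dimension to the highest."""
    path = [src]
    cur = src
    for bit in range(n):
        if (cur ^ dst) & (1 << bit):
            cur ^= 1 << bit  # traverse the link in this dimension
            path.append(cur)
    return path
```

The number of hops equals the Hamming distance between the two labels, so `ecube_route(0b000, 0b101, 3)` takes two hops, through node `0b001`.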

17.

DTBR: A dynamic thermal-balance routing algorithm for Network-on-Chip

Feiyang Liu Huaxi Gu Yintang Yang 《Computers & Electrical Engineering》2012,38(2):270-281

Network-on-Chip (NoC) replaces the traditional bus-based architecture to become the mainstream design methodology for future complex System-on-Chip (SoC). It introduces the principles of packet switching and interconnection networks into SoC design, and achieves much better performance through its high bandwidth, scalability, reliability, etc. However, thermal problems, such as regional temperature differentials and hotspots, are still among the main design constraints. This paper proposes a dynamic thermal-balance routing (DTBR) algorithm for Network-on-Chip, which can address both of these thermal problems. DTBR is a minimal adaptive routing algorithm based on an architectural thermal model. An efficient thermal-aware router is designed to implement the DTBR algorithm. According to the simulation results, the proposed DTBR algorithm makes the network thermal distribution more uniform and reduces hotspot temperature by about 20% under different traffic patterns. Moreover, DTBR improves packet delay and network throughput compared with other routing algorithms.
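
The phrase "minimal adaptive routing" in entry 17 can be made concrete with a small sketch: among the neighbours that lie on a shortest path, pick the coolest one. This is a generic illustration and not the DTBR algorithm itself, whose thermal model the abstract does not describe. The `temperature` dictionary and both function names are hypothetical.

```python
# Hypothetical sketch of thermal-aware minimal adaptive routing on a 2D mesh.
# `temperature` maps router coordinates to a temperature estimate; how DTBR
# actually obtains and uses such estimates is not given in the abstract.

def minimal_neighbours(cur, dst):
    """Neighbours of `cur` that lie on some shortest path to `dst`."""
    (x, y), (dx, dy) = cur, dst
    candidates = []
    if dx != x:
        candidates.append((x + (1 if dx > x else -1), y))
    if dy != y:
        candidates.append((x, y + (1 if dy > y else -1)))
    return candidates


def coolest_minimal_hop(cur, dst, temperature):
    """Return the coolest next hop on a minimal path, or None at `dst`."""
    candidates = minimal_neighbours(cur, dst)
    if not candidates:
        return None
    return min(candidates, key=lambda node: temperature[node])
```

Restricting the choice to minimal neighbours keeps every route at the shortest hop count while still letting traffic steer away from hotter routers.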

18.

Efficient routing techniques in heterogeneous 3D Networks-on-Chip

Michael Opoku Agyeman Ali Ahmadinia Alireza Shahrabi 《Parallel Computing》2013

Three-dimensional Networks-on-Chips (3D NoCs) have recently been proposed to address the on-chip communication demands of future highly dense 3D multi-core systems. Homogeneous 3D NoC topologies have many Through Silicon Vias (TSVs) which have a costly and complex manufacturing process. Also, 3D routers use more memory and are more power hungry than conventional 2D routers. Alternatively, heterogeneous 3D NoCs combine both the area and performance benefits of 2D and 3D static router architectures by using a limited number of TSVs. To improve the performance of heterogeneous 3D NoCs, we propose an adaptive router architecture which balances the traffic in such NoCs. In particular, experimental results show that our proposed architecture improves performance by up to 75% by replacing 2D static routers with adaptive 2D routers in heterogeneous 3D NoCs, while keeping the maximum clock frequency, power and energy consumption of the adaptive router nearly at the same level as the static router.

19.

NISHA: A fault-tolerant NoC router enabling deadlock-free Interconnection of Subnets in Hierarchical Architectures

《Journal of Systems Architecture》2013,59(7):551-569

Decreasing Integrated Circuit (IC) feature sizes increase the susceptibility to transient and permanent errors. The growing rate of such errors in ICs intensifies the need for a wide range of solutions addressing reliability at various levels of abstraction. Network on Chip (NoC) architecture has been introduced to address the increasing demand for communication bandwidth among processing cores. The structural redundancy inherent in NoC-based systems can be leveraged to improve reliability and compensate for the effects of failures. In this paper, we propose a fault-tolerant NoC router NISHA, which stands for No-deadlock Interconnection of Subnets in Hierarchical Architectures. Armed with a new flow control mechanism, as well as an enhanced Virtual Channel (VC) regulator, the proposed router can mitigate the effects of both transient and permanent errors. A Dynamic/Static virtual channel allocation with respect to the local and global traffic is supported in NISHA; thereby, it maintains a deadlock-free state in the presence of router or link failures in hierarchical topologies. Experimental results show an enhanced operation of NoC applications as well as a decrease in the average latency and energy consumption.

20.

Characterizing the impact of process variation on 45 nm NoC-based CMPs

C. Hernández A. Roca J. Flich F. Silla J. Duato 《Journal of Parallel and Distributed Computing》2011,71(5):651-663

Current integration scales make it possible to design chip multiprocessors with a large number of cores interconnected by a NoC. Unfortunately, they also bring process variation, posing a new burden to processor manufacturers. Regarding the NoC, variability causes the delays of links and routers to deviate from those initially established at design time. In this paper we analyze how variability affects the NoC by applying a new variability model to 100 instances of an 8 × 8 mesh NoC synthesized using 45 nm technology. We also show that GALS-based NoCs present communication bottlenecks due to the slower components of the network, which cause congestion, thus reducing performance. This performance reduction finally affects the applications being executed in the CMP because they may be mapped to slower areas of the chip. In this paper we show that using a mapping algorithm that considers variability data may improve application execution time by up to 50%.