
1.

The 2D digraph-based NoCs: attractive alternatives to the 2D mesh NoCs

Reza Sabbaghi-Nadooshan Mehdi Modarressi Hamid Sarbazi-Azad 《The Journal of supercomputing》2012,59(1):1-21

This paper proposes two-dimensional directed graphs (or digraphs for short) as a promising alternative to the popular 2D mesh topology for networks-on-chip (NoCs). Mesh is the most popular topology for NoCs, mainly due to its suitability for on-chip implementation and low cost. However, the fact that a digraph offers a lower diameter than its equivalent linear array of equal cost motivated us to evaluate digraphs as the underlying topology of NoCs. This paper introduces a family of NoC topologies based on three well-known digraphs, namely de Bruijn, shuffle-exchange, and Kautz. We study topological properties of the proposed topologies. We show that the proposed digraph-based topologies have several attractive features, including constant node degree, low diameter and cost, and low zero-load latency, which result in superior performance over the mesh. We introduce a deadlock-free routing algorithm for the proposed NoC topologies and compare NoCs employing the proposed topologies and the mesh topology in terms of power consumption and performance. Simulation results also reveal that the proposed NoC topologies offer higher performance and consume less power than the mesh NoC.
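The diameter argument in this abstract can be checked on a small case. The sketch below is not the paper's 2D topology or its routing algorithm; it only builds a plain binary de Bruijn digraph and a bidirectional linear array with the same number of nodes and measures both diameters by breadth-first search. All function names are illustrative assumptions.

```python
from collections import deque


def de_bruijn_successors(node, n_bits):
    """Successors of a node in the binary de Bruijn digraph on 2**n_bits nodes.

    Each node shifts its label left by one bit and appends 0 or 1.
    """
    mask = (1 << n_bits) - 1
    shifted = (node << 1) & mask
    return [shifted, shifted | 1]


def linear_array_successors(node, num_nodes):
    """Neighbours of a node in a bidirectional linear array."""
    return [v for v in (node - 1, node + 1) if 0 <= v < num_nodes]


def diameter(num_nodes, successors):
    """Longest shortest-path length over all ordered node pairs (BFS from every node)."""
    worst = 0
    for src in range(num_nodes):
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in successors(u):
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if len(dist) != num_nodes:
            raise ValueError("graph is not strongly connected")
        worst = max(worst, max(dist.values()))
    return worst


if __name__ == "__main__":
    n_bits = 4
    num_nodes = 1 << n_bits
    print(diameter(num_nodes, lambda u: de_bruijn_successors(u, n_bits)))
    print(diameter(num_nodes, lambda u: linear_array_successors(u, num_nodes)))
```

With 16 nodes, the de Bruijn digraph keeps a constant out-degree of 2 and its diameter grows with the number of label bits, while the linear array's diameter grows with the number of nodes.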

2.

Analysis and optimization of power consumption characteristics of on-chip interconnection networks

孙晓乐 钱亚龙 齐新新 张云放 陈娟 袁远 董勇 《计算机工程与科学》2020,42(7):1141-1150

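As the number of processor cores grows, the network-on-chip (NoC) structure becomes increasingly complex, so both the share of power consumed by the NoC and the difficulty of analyzing that power are rising. Task mapping on an NoC must guarantee high-performance communication among processor cores while consuming as little power and area as possible, that is, it must achieve high performance under limited power and area overhead. During task mapping, the communication distance between cores is the key to reducing task communication power. Contiguous, nearly convex regions help shorten a task's communication distance. This paper analyzes a power-optimal heuristic mapping algorithm for NoCs (INC), which consists of a region selection algorithm and a node mapping algorithm. Two factors of the region selection algorithm are improved so that the total communication overhead of an application is minimized while subsequent applications can still select regions at a small communication cost. A new mapping algorithm based on the selected region is proposed. Experimental results on the mapping problem for dynamically arriving programs show that, compared with INC, the new region selection and node mapping algorithms reduce communication power by 12.10% and improve communication latency by 11.23%.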

3.

Thermal management in 3D networks-on-chip using dynamic link sharing

《Microprocessors and Microsystems》2017

3D integration is a practical solution for overcoming the problems of long and slow global wires in current and future generations of integrated circuits. This emerging technology stacks several die slices on top of each other in a single chip. It provides higher bandwidth and lower latency in the third dimension than a 2D design because inter-layer distances are much shorter. However, thermal challenges are a key impediment to stacking logic dies on top of each other. In particular, routers in a 3D network-on-chip (NoC) are a main source of thermal hotspots, limiting the potential performance gains of 3D integration. In this paper, we take advantage of the low-latency 3D vertical links to design a temperature-aware router architecture for 3D NoCs. This architecture reduces the peak temperature of routers, particularly routers that are farther from the heat sink, by balancing the traffic across all layers in a temperature-aware distributed way. This way, a router with high temperature can borrow the link and crossbar bandwidth of the routers in the layers closer to the heat sink to forward its packets, effectively offloading part of its traffic to them to reduce its temperature. Experimental results show that the proposed method can control the temperature of 3D NoCs and reduce the temperature gradient across the network with minimal negative impact on performance, compared to a state-of-the-art 3D NoC temperature management method.

4.

Energy-aware routing in hybrid optical network-on-chip for future multi-processor system-on-chip

Lin Liu Yuanyuan Yang 《Journal of Parallel and Distributed Computing》2013

With the development of Multi-Processor System-on-Chip (MPSoC) in recent years, intra-chip communication is becoming the bottleneck of the whole system. Current electronic network-on-chip (NoC) designs face serious challenges, such as bandwidth, latency and power consumption. Optical interconnection networks are a promising technology to overcome these problems. In this paper, we study the routing problem in optical NoCs with arbitrary network topologies. Traditionally, a minimum hop count routing policy is employed for electronic NoCs, as it minimizes both power consumption and latency. However, due to the special architecture of current optical NoC routers, such a minimum-hop path may not be energy-wise optimal. Using a detailed model of optical routers, we reduce the energy-aware routing problem to a shortest-path problem, which can then be solved using one of the many well-known techniques. By applying our approach to different popular topologies, we show that the energy consumed in data communication in an optical NoC can be significantly reduced. We also propose the use of optical burst switching (OBS) in optical NoCs to reduce control overhead, as well as an adaptive routing mechanism to reduce energy consumption without introducing extra latency. Our simulation results demonstrate the effectiveness of the proposed algorithms.
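The reduction to a shortest-path problem described in this abstract can be illustrated with a minimal sketch. It assumes that each link already carries a single energy cost; the paper's detailed optical router model is not reproduced here, and the topology, the energy values and the function name `shortest_path` are made up for illustration. Dijkstra's algorithm stands in for "one of the many well-known techniques".

```python
import heapq


def shortest_path(edges, source, target, use_energy=True):
    """Dijkstra's algorithm over a dict: node -> list of (neighbour, energy).

    With use_energy=False every link costs 1, which gives a minimum-hop path.
    Returns (path, total_cost).
    """
    best = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        cost, u = heapq.heappop(heap)
        if u == target:
            break
        if cost > best.get(u, float("inf")):
            continue  # stale heap entry
        for v, energy in edges.get(u, []):
            new_cost = cost + (energy if use_energy else 1.0)
            if new_cost < best.get(v, float("inf")):
                best[v] = new_cost
                prev[v] = u
                heapq.heappush(heap, (new_cost, v))
    if target not in best:
        raise ValueError("target is unreachable")
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return path, best[target]


# Hypothetical per-link energy costs (arbitrary units), chosen so that
# the two-hop route is more expensive than the three-hop route.
EXAMPLE_EDGES = {
    "A": [("B", 5.0), ("C", 1.0)],
    "B": [("D", 5.0)],
    "C": [("E", 1.0)],
    "E": [("D", 1.0)],
}

if __name__ == "__main__":
    print(shortest_path(EXAMPLE_EDGES, "A", "D", use_energy=False))
    print(shortest_path(EXAMPLE_EDGES, "A", "D", use_energy=True))
```

In this toy graph the minimum-hop route and the minimum-energy route differ, which is the situation the abstract describes for optical routers.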

5.

MoNoC: A monitored network on chip with path adaptation mechanism

《Journal of Systems Architecture》2014,60(10):783-795

Complex systems on chip containing dozens of processing resources with critical communication requirements usually rely on networks on chip (NoCs) as their communication infrastructure. NoCs provide significant advantages over simpler infrastructures such as shared busses or point-to-point communication, including higher scalability, more efficient energy management, higher bandwidth and lower average latency. Applications running on NoCs with more than 10% bandwidth usage show that the largest portion of message latency comes from buffered packets waiting to enter the NoC, whereas the portion that depends on the packet traversing the NoC is sometimes negligible. This work presents an adaptive routing architecture, named Monitored NoC (MoNoC), which is based on a traffic monitoring mechanism and the exchange of high-priority control packets. This method makes it possible to adapt paths by choosing less congested routes. Practical experiments show that the proposed path adaptation is a fast process that allows packets to be transmitted with latencies up to 9 times smaller by using non-congested NoC regions.

6.

Adaptive inter-layer message routing in 3D networks-on-chip

Claudia Rusu Lorena Anghel 《Microprocessors and Microsystems》2011,35(7):613-631

Existing 3D routing algorithms deal with regular mesh/torus 3D topologies. Today's 3D NoCs are quite irregular, especially those with heterogeneous layers. In this paper, we present a routing algorithm targeting 3D networks-on-chip (NoCs) with incomplete sets of vertical links between adjacent layers. The routing algorithm tolerates multiple link and node failures, provided that the NoC is not partitioned. In addition, it deals with congestion. The routing algorithm for 3D NoCs preserves the deadlock-free property of the chosen 2D routing algorithms. It is also scalable and supports a local reconfiguration that complements the reconfiguration of the 2D routing algorithms in case of failures of nodes or links. The algorithm incurs a small overhead in terms of exchanged messages for reconfiguration and does not introduce significant additional complexity in the routers. Theoretical analysis of the 3D routing algorithm is provided and validated by simulations for different traffic loads and failure rates.

7.

Lightweight fault localization combined with fault context to improve fault absolute rank

Wang Yong Huang Zhiqiu Li Yong Fang Bingwu 《中国科学:信息科学(英文版)》2017,60(9):1-15

Network-on-Chip (NoC) is a promising replacement for bus architectures due to its better scalability. In state-of-the-art NoCs, each packet contains several fixed-length flits, which facilitates the allocation of network resources but introduces many unused bits. In this paper, we propose a novel technique called Stealth-ACK to address this problem. Stealth-ACK leverages unused bits in the head flits of non-ACK packets to carry and stealthily transmit ACK information. Such stealth transmissions of ACK information reduce not only the number of dedicated ACK packets on the NoC, but also the number of unused bits in the head flits of non-ACK packets, which significantly reduces wasted NoC bandwidth. Experimental results show that, on application traces of SPLASH-2, Stealth-ACK increases the throughput of a 16 × 16 2-D mesh NoC by 11.9% on average and reduces the NoC latency by 34.8% on average. Moreover, Stealth-ACK requires only trivial hardware modifications to basic router architectures, which incur negligible power consumption and area cost.


8.

Deadlock-free generic routing algorithms for 3-dimensional Networks-on-Chip with reduced vertical link density topologies

《Journal of Systems Architecture》2013,59(7):528-542

3-Dimensional Networks-on-Chip (3D NoCs) have emerged as a promising solution for the scalability, power consumption and performance demands of next-generation Systems-on-Chip (SoC) interconnects. Due to their cost in terms of thermal behavior, yield, chip area and design complexity, minimizing the number of Through-Silicon Vias (TSVs) in 3D ICs has become one of the most important design issues. In this paper, we present several stable, simple and deadlock-free generic routing algorithms for 3D NoCs with different reduced vertical link density topologies, which can maintain 3D NoC performance and save system cost (TSV number, chip area, system power, etc.). The experimental results were obtained from our cycle-accurate GSNOC simulator and show that our routing algorithms can maintain system performance while reducing the number of TSVs by up to 50%, compared with a 100% TSV configuration using the ZXY routing algorithm.
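For context on the ZXY baseline named in this abstract, the sketch below shows plain dimension-order routing on a full 3D mesh, where every router has a vertical link: a packet first travels along Z, then X, then Y. It is not one of the paper's reduced-TSV algorithms, and the coordinate convention `(x, y, z)` and the function name `zxy_route` are assumptions made for illustration.

```python
def zxy_route(src, dst):
    """Return the list of (x, y, z) routers visited under Z-then-X-then-Y routing.

    Assumes a full 3D mesh in which every router has links in all three dimensions.
    """
    x, y, z = src
    path = [(x, y, z)]

    def step(current, goal):
        # Move one hop towards the goal coordinate.
        return current + (1 if goal > current else -1)

    while z != dst[2]:  # vertical hops first
        z = step(z, dst[2])
        path.append((x, y, z))
    while x != dst[0]:  # then the X dimension
        x = step(x, dst[0])
        path.append((x, y, z))
    while y != dst[1]:  # finally the Y dimension
        y = step(y, dst[1])
        path.append((x, y, z))
    return path


if __name__ == "__main__":
    for hop in zxy_route((0, 0, 0), (1, 2, 1)):
        print(hop)
```

Because the vertical hop is taken at the source column, this scheme needs a vertical link at every router, which is the 100% TSV configuration the abstract compares against.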

9.

Optimizing the test architecture for integrity faults on SoC interconnect buses under power constraints

张玲 马学军 《微计算机信息》2010,(8)

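This paper proposes a method for optimizing the test architecture for integrity faults on SoC interconnect buses. Under a power constraint, the method allocates TAMs so as to minimize test time, thereby optimizing the system test architecture. The test set is first compressed in two dimensions and the SI test set is partitioned into several SI groups to initialize the test architecture, with one TAM bit allocated to each core. After evaluating each TAM, the critical TAM is identified; then, under the power constraint, idle TAM bits are repeatedly assigned to the critical TAM, and TAMs are shared, to reduce test time. Experimental results on the ITC'02 benchmarks show that the method greatly reduces SoC test time under power constraints.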

10.

An efficient reduction algorithm for directed acyclic graphs

侯睿 武继刚 《计算机科学》2015,42(7):78-84

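When an application is deployed on a given network-on-chip (NoC), each of its subtasks must be assigned to a node of the network for execution. The problem is usually modeled as a directed acyclic graph (DAG) whose vertices are the subtasks, so deploying the tasks on the NoC amounts to mapping the vertices of a DAG onto the NoC topology. As applications and NoCs grow in scale, computing an optimal mapping is a typical intractable problem. To speed up the mapping of a DAG onto an NoC topology, a DAG reduction algorithm is proposed that makes the number of vertices in the reduced graph match the number of nodes in the given NoC as closely as possible. The proposed graph reduction algorithm can effectively identify all reducible subgraphs, each of which can be reduced to a single vertex. The new algorithm extends the applicable range from nested graphs to arbitrary graphs while keeping the same order of complexity as the original algorithm. A parallelization scheme is also proposed to speed up the search for reducible subgraphs.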

11.

Characterizing the impact of process variation on 45 nm NoC-based CMPs

C. Hernández A. Roca J. Flich F. Silla J. Duato 《Journal of Parallel and Distributed Computing》2011,71(5):651-663

Current integration scales make it possible to design chip multiprocessors with a large number of cores interconnected by a NoC. Unfortunately, they also bring process variation, posing a new burden to processor manufacturers. In the NoC, variability causes the delays of links and routers to deviate from those initially established at design time. In this paper we analyze how variability affects the NoC by applying a new variability model to 100 instances of an 8 × 8 mesh NoC synthesized using 45 nm technology. We also show that GALS-based NoCs present communication bottlenecks due to the slower components of the network, which cause congestion, thus reducing performance. This performance reduction finally affects the applications being executed in the CMP because they may be mapped to slower areas of the chip. In this paper we show that using a mapping algorithm that considers variability data may improve application execution time by up to 50%.

12.

A networks-on-chip architecture design space exploration – The LIB

Peng Liu Bingjie Xia Chunchang Xiang Xiaohang Wang Weidong Wang Qingdong Yao 《Computers & Electrical Engineering》2009,35(6):817-836

Many on-chip network circuit and architecture techniques are incompatible with modern design flows, making them unsuitable for use in systems-on-chip. This paper presents a networks-on-chip (NoC) architecture design space exploration method for multi-processor systems-on-chip architecture. The NoC architecture design space is designed with a Layer-Interactive-Building block (LIB) methodology that is divided into three layers: application layer, link/network layer, and physical layer. The suggested LIB design philosophy provides a modular building-block structure in both hardware and software, together with the protocols for their interconnection in the three architecture layers. Using LIB, the designer can easily select these building blocks to build application-specific NoCs to meet different application requirements such as media, graphic, software radio and communication network applications. The LIB provides the NoC building blocks, architecture interacting systems-on-chip components, the programming models and application mapping strategies. The LIB can be used as a complementary library and toolset for future on-chip interconnection network design.

13.

Stochastic communication for application-specific Networks-on-Chip

Nitin Durg Singh Chauhan 《The Journal of supercomputing》2012,59(2):779-810

In this paper, we develop an analytical stochastic communication technique for inter- and intra-Networks-on-Chip (NoC) communication. It not only separates computation and communication in Networks-in-Package (NiP) but also predicts the communication performance. Moreover, it helps in tracking lost data packets and their exact location during communication. Further, the proposed technique helps in building the Closed Donor Controlled Based Compartmental Model, which in turn helps in building a stochastic model of NoC and NiP. This model helps in computing the transition probabilities, latency, and data flow from one IP to another in a NoC and among NoCs in a NiP. The simulation results show that the transient and steady-state responses of the transition probabilities give the state of data flow latencies among the different IPs in a NoC and among the compartments of NoCs in a NiP. Furthermore, the proposed technique produces lower latency than existing topologies.

14.

A low-cost test method for network-on-chip routers and interconnects

向东 《集成技术》2013,2(6):1-7

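Three-dimensional network-on-chip (NoC) design is a current research hotspot, so an effective method for testing NoC routers and interconnects is very important. By classifying routers, this paper proposes a new NoC router test method. Routers with different input and output ports are placed in different classes. Routers in the same class are isomorphic and share the same test set. For identical routers, a unicast-based multicast scheme is proposed, along with a new interconnect test method. Experimental results show that the proposed method is effective.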

15.

Bit-accurate energy estimation for Networks-on-Chip

《Journal of Systems Architecture》2017

Networks-on-Chip (NoCs) are recognized as the solution to address the communication bottleneck in a Multi-processor System-on-Chip (MPSoC). As NoCs represent a significant part of system consumption, MPSoC designers expect accurate power models in order to produce energy-efficient systems. Nowadays, NoC simulators rely on power models that integrate link models without crosstalk modeling. In this study, we present Noxim-XT, a NoC simulator based on Noxim that embeds a link power model with crosstalk modeling. We show that the crosstalk effect has a deep impact on NoC energy consumption: our results demonstrate that classical models generate errors of up to 45.5% in the estimation of the whole NoC energy consumption. In addition, this tool is able to run application-based traffic, and we show that under application-based traffic, the energy estimation made by classical models overestimates the NoC energy consumption by up to 50%.

16.

A generic FPGA prototype for on-chip systems with network-on-chip communication infrastructure

Mohammad Arjomand Amirali Boroumand Hamid Sarbazi-Azad 《Computers & Electrical Engineering》2014

As Systems-on-Chip (SoCs) grow in complexity and size, proposals of networks-on-chip (NoCs) as the on-chip communication infrastructure are justified by the reusability, scalability, and energy efficiency provided by interconnection networks. Simulation and mathematical analysis offer flexibility for evaluations under various network configurations. However, the accuracy of such analysis methods largely depends on the approximations made. On the other hand, prototyping can be used to improve the evaluation accuracy by bringing the design closer to reality. In this paper, we propose an FPGA prototype that is general enough to model different video-processing SoCs where different cores communicate via a NoC. To model the NoC, we accurately implement a fully synthesized on-chip router supporting multiple virtual channels. For the processing nodes, on the other hand, we propose a general and simple traffic generator capable of modeling different synthetic functions (i.e. Poisson and self-similar). Indeed, the application traffic is modeled using 1-D hybrid cellular automata, which can effectively generate high-quality pseudorandom patterns. Finally, for energy efficiency, the proposed prototype is capable of supporting multiple frequency regions. To realize the voltage–frequency island partitioned SoC, we use the utilities that the Xilinx FPGA platform offers to design Globally Asynchronous Locally Synchronous (GALS) systems via Delay-Locked Loop elements.

17.

Improving the yield of NoC-based systems through fault diagnosis and adaptive routing

Caroline Concatto João Almeida Guilherme Fachini Marcos Hervé Fernanda Kastensmidt Érika Cota Marcelo Lubaszewski 《Journal of Parallel and Distributed Computing》2011,71(5):664-674

We propose an effective and low cost method to increase the yield and the lifetime of torus NoCs. The method consists in detecting and diagnosing NoC interconnect faults using BIST structures and activating alternative paths for the faulty links. Alternative paths use the inherent redundancy of the torus topology, thus leading to minimal performance, area, and power overhead. We assume an extended interconnect fault model comprising stuck-at and pairwise shorts within a single link or between any two links in the network. Experimental results for a 3×3 NoC show that the proposed approach can correctly diagnose 93% of all possible interconnect faults and can mitigate 42% of those faults (representing 94.4% of the solvable faults) with a worst case performance penalty of 8% and 1% of area overhead. We also demonstrate the scalability of the method by presenting its application to larger NoCs.

18.

A study of 3D Network-on-Chip design for data parallel H.264 coding

Thomas Canhao Xu Alexander Wei Yin Pasi Liljeberg Hannu Tenhunen 《Microprocessors and Microsystems》2011,35(7):603-612

In this paper, we implement, analyze and compare different Network-on-Chip (NoC) architectures aiming at higher efficiency for MPEG-4/H.264 coding. Two-dimensional (2D) and three-dimensional (3D) NoCs based on Non-Uniform Cache Access (NUCA) are analyzed. We present results using a full system simulator with realistic workloads. Experiments show that the average network latencies in two 3D NoCs are reduced by 28% and 34% respectively, compared with the 2D design. It is also shown that heat dissipation is a trade-off in improving the performance of 3D chips. Our analysis and experimental results provide a guideline for designing efficient 3D NoCs for data parallel H.264 coding applications.

19.

ProNoC: A low latency network-on-chip based many-core system-on-chip prototyping platform

《Microprocessors and Microsystems》2017

Network-on-chip (NoC) is an emerging interconnect infrastructure to address the scalability limitation of the conventional shared bus architecture for many-core system-on-chip (MCSoC). Current field-programmable gate arrays (FPGAs) have over a million lookup tables, making it possible to prototype a complete NoC-based MCSoC on a single FPGA device. FPGA prototyping allows rapid system verification and estimation of optimal design parameters. However, existing NoC-based MCSoC prototypes usually adopt simple NoC architectural functionality. These NoC prototypes cannot represent a realistic projection of state-of-the-art application-specific integrated circuit (ASIC) NoCs because they have limited overall system performance. This paper presents ProNoC, an integrated tool for rapid prototyping and validation of NoC-based MCSoC projects targeting FPGA devices. ProNoC adopts most advanced NoC features, such as support for virtual channels (VCs), virtual networks, low-latency routing and different routing algorithms. Results show that the NoC interconnect in ProNoC outperforms CONNECT, the most recent VC-based prototype NoC, with lower logic cell utilization, higher maximum operating frequency, higher average saturation throughput, and lower average communication latency. Moreover, ProNoC is equipped with a graphical user interface to facilitate the development of MCSoC prototypes on FPGA platforms.

20.

CuNoC: A dynamic scalable communication structure for dynamically reconfigurable FPGAs

S. Jovanović C. Tanougast C. Bobda S. Weber 《Microprocessors and Microsystems》2009,33(1):24-36

The growing complexity of integrated circuits forces designers to move from traditional bus-based design concepts towards NoC-based ones. Networks-on-chip (NoCs) are emerging as a viable alternative to existing interconnection architectures, characterized in particular by a high level of parallelism, high performance and scalability. The NoC architectures already proposed in the literature are intended for System-on-chip (SoC) designs. They are not suitable for an FPGA-based system that aims to take full advantage of this technology. In this paper, we present a new paradigm called CuNoC for intercommunication between modules dynamically placed on a chip for FPGA-based reconfigurable devices. The CuNoC is based on a scalable communication unit characterized by a unique architecture, an arbitration policy based on the priority-to-the-right rule and a modified XY adaptive routing algorithm. The CuNoC is specifically adapted and suited to FPGA-based reconfigurable devices, but it can also be adapted with small modifications to any other system that needs an efficient communication medium. We present the basic concept of this communication approach and its main advantages and drawbacks compared with the other main NoC approaches already proposed, and we demonstrate its feasibility on examples through simulations. Performance evaluation and implementation results are also given.