Similar literature
1.
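In application-specific systems-on-chip, traffic volume and latency requirements differ widely between modules, so links of equal bit width cannot make full use of bandwidth resources. To address this, a non-uniform bandwidth allocation scheme is proposed: the data width of each link is set according to traffic characteristics and contention, and a heterogeneous interconnect structure is used to allocate wiring resources sensibly and optimize throughput. Experimental results show that, under uniform traffic, the heterogeneous network with non-uniform link widths achieves throughput close to that of the homogeneous architecture while saving 16% of wiring resources. Under hotspot traffic, the heterogeneous network effectively relieves local congestion and improves network throughput.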

2.
Network-on-Chip (NoC) interconnect fabrics are categorized according to trade-offs among latency, throughput, speed, and silicon area, and the correctness and performance of these fabrics in Field-Programmable Gate Array (FPGA) applications are assessed through experimentation and simulation. In this paper, we propose a consistent parametric method for evaluating the FPGA performance of three common on-chip interconnect architectures, namely the Mesh, Torus, and Fat-tree architectures. We also investigate how NoC architectures are affected by interconnect and routing parameters, and demonstrate their flexibility and performance through FPGA synthesis and testing of 392 different NoC configurations. In this process, we found that the Flit Data Width (FDW) and Flit Buffer Depth (FBD) parameters have the heaviest impact on FPGA resources, and that these parameters, along with the number of Virtual Channels (VCs), significantly affect reassembly buffering and the routing and logic requirements at NoC endpoints. Applying our evaluation technique to a detailed and flexible cycle-accurate simulation, we drive the three NoC architectures using benign (Nearest Neighbor and Uniform) and adversarial (Tornado and Random Permutation) traffic patterns with different numbers of VCs, producing a set of load–delay curves. The results show that, with the router and interconnect parameters strategically tuned, the Fat-tree network makes the best use of FPGA resources in terms of silicon area, clock frequency, critical path delay, network cost, saturation throughput, and latency, whereas the Mesh and Torus networks show comparatively high resource costs and poor performance under adversarial traffic patterns. Our findings indicate that the Fat-tree network is more efficient in terms of FPGA resource utilization and is compliant with current Xilinx FPGA devices.
This approach will assist engineers and architects in making an early decision on the right interconnect and router parameters for large and complex NoCs. We demonstrate that our approach substantially improves performance across a wide variety of experiments and simulations, which confirms its suitability for real systems.

3.
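袁景凌, 刘华, 谢威, 蒋幸. 《计算机应用》 (Journal of Computer Applications), 2011, 31(10): 2630-2633
To meet the increasingly diverse application requirements of networks-on-chip, multicast routing mechanisms have been applied to networks-on-chip to make up for the shortcomings of traditional unicast communication. Taking Mesh and Torus networks-on-chip as examples, three path-based multicast routing algorithms (XY routing, UpDown routing, and SubPartition routing) are analyzed, and the corresponding congestion control strategies are studied. Simulation experiments show that, compared with unicast, multicast communication has lower average transmission latency and higher network throughput, with evenly distributed load; the benefit of the SubPartition routing algorithm in particular becomes more pronounced as network size grows. The proposed multicast congestion control mechanism makes more effective use of multicast communication and improves network-on-chip performance.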

4.
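On-chip interconnection networks provide efficient communication support for multi-core architectures. Current networks-on-chip usually use unidirectional links, whose resource utilization is low. To allocate, and thus use, link bandwidth efficiently, a new bidirectional link scheduling algorithm is proposed, and a bidirectional-link router supporting the algorithm is designed. The new router structure provides a bypass data channel for fast data transfer without affecting the router's original data path. Experimental results show that using the bidirectional-link router improves the saturation throughput and average link utilization of a Mesh network by up to 83.3% and 24.53%, respectively.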

5.
We present a single-cycle output-buffered router based on layered switching for networks-on-chip (NoCs). Unlike state-of-the-art NoC routers, the router has three important characteristics: (1) it employs layered switching, which implements wormhole switching on top of virtual cut-through (VCT) switching; (2) in contrast to input-buffered architectures, it adopts an output-buffered architecture; (3) it is single-cycle, meaning that the router pipeline takes only one cycle for all flits. Experimental results show that the router achieves up to 80% of ideal network throughput under a uniform random traffic pattern. Compared with wormhole switching, layered switching achieves up to a 36.9% latency reduction for 12-flit packets under uniform random traffic with an injection rate of 0.5 flit/cycle/node. Synthesis results in 65 nm technology show that its critical path has only 20 logic gates, and that it reduces area by 11% compared with the input virtual-channel router with the same buffer capacity.

6.
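Topology selection is an important issue in NoC design. Typical application-specific NoC systems today usually integrate multiple components with different functions, sizes, and communication requirements, and regular network topologies are not well suited to this type of NoC; irregular Mesh networks were therefore proposed and applied to NoC systems with irregular structures. To solve the problem that routing algorithms for regular Meshes cannot guarantee routing connectivity in irregular Meshes, this paper proposes a deadlock-free routing algorithm for irregular Meshes. However the layout of the components integrated in the NoC system changes, the algorithm always remains connected; that is, the algorithm is independent of the size and structure of the irregular Mesh. The algorithm also uses only a small number of virtual channels.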

7.
We present a new architecture-level unified reliability evaluation methodology for chip multiprocessors (CMPs). The proposed reliability estimation (REST) is based on a Monte Carlo algorithm. What distinguishes REST from previous work is that both the computational and communication components are considered in a unified manner to compute the reliability of the CMP. We use the REST tool to develop a new dynamic reliability management (DRM) scheme that addresses the time-dependent dielectric breakdown and negative-bias temperature instability aging mechanisms in network-on-chip (NoC) based CMPs. Designed as a control loop, the proposed DRM scheme uses an effective neural-network-based reliability estimation module. The neural-network predictor is trained using the REST tool. We investigate how the system's lifetime estimate changes depending on whether the NoC, as the communication unit of the CMP, is considered during the reliability evaluation process, and find that the differences can be as high as 60%. Full-system simulations using a customized GEM5 simulator show that reliability can be improved by up to 52% using the proposed DRM scheme in a best-effort scenario, with a 2–9% performance penalty (using a user-set target lifetime of 7 years) relative to the case when no DRM is employed.

8.
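3D NoCs offer better performance than 2D NoCs in homogeneous multi-core systems. Building on a study of the 3D Mesh structure, this paper theoretically evaluates the average latency and ideal throughput of the topology and proposes a new static routing algorithm based on 3D Mesh, which is then simulated and compared using the NS2 network simulator. Experimental results show that the new routing algorithm effectively improves throughput, and during large-scale data transfers...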
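As a rough illustration of the kind of theoretical evaluation mentioned in the abstract above, the following sketch computes the average hop count between distinct nodes of a 3D Mesh by brute force. It assumes uniform traffic and minimal (Manhattan-distance) routing; the function name `average_hops` and the dimension parameters are hypothetical and are not taken from the paper.

```python
from itertools import product


def average_hops(kx, ky, kz):
    """Average minimal hop count over all ordered pairs of distinct nodes
    in a kx x ky x kz mesh (assumption: uniform traffic, minimal routing)."""
    nodes = list(product(range(kx), range(ky), range(kz)))
    if len(nodes) < 2:
        raise ValueError("the mesh needs at least two nodes")
    total = 0
    pairs = 0
    for a in nodes:
        for b in nodes:
            if a != b:
                # Manhattan distance = number of hops on a minimal path in a mesh
                total += sum(abs(p - q) for p, q in zip(a, b))
                pairs += 1
    return total / pairs


if __name__ == "__main__":
    print(average_hops(4, 4, 4))
```

The brute-force loop is quadratic in the node count, which is acceptable for the small meshes typical of NoC studies; a closed-form expression would be preferable for large networks.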

9.
A key requirement for modern Networks-on-Chip (NoCs) is the ability to detect and diagnose faults and failures. This paper addresses the challenge of fault diagnosis using online testing, where the interruption of the runtime operation (performance) of the system under diagnosis is minimised. A novel Monitor Module (MM) is proposed to detect NoC interconnect faults while minimising intrusion on regular NoC traffic throughput by (1) using a channel tester which only examines NoC channels when they are idle; and (2) using a testing interval parameter based on the Binary Exponential Backoff algorithm to dynamically balance the level of testing when recovering from temporary faults. The paper presents results on the minimal impact on NoC throughput for a range of testing conditions and also highlights the minimal area overhead of the MM (11.56%) compared with an adaptive NoC router implemented on FPGA hardware. Simulation results demonstrate non-intrusion on NoC runtime traffic throughput when channels are fault-free, and also show how throughput loss is minimised when faults are identified.
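The abstract above does not specify how the testing interval is updated. The sketch below shows one plausible binary-exponential-backoff policy, under the assumption that the interval doubles after every fault-free test (so healthy channels are tested less often) and resets to a base value when a fault is detected. The names `next_test_interval` and `simulate`, and the `base` and `cap` parameters, are hypothetical.

```python
def next_test_interval(current, fault_detected, base=1, cap=64):
    """Return the next testing interval (in arbitrary time units).

    Assumed policy, not the paper's: reset to `base` on a fault,
    otherwise double the interval up to `cap`.
    """
    if fault_detected:
        return base
    return min(current * 2, cap)


def simulate(outcomes, base=1, cap=64):
    """Apply the policy to a sequence of test outcomes (True = fault found)."""
    interval = base
    history = []
    for fault in outcomes:
        interval = next_test_interval(interval, fault, base, cap)
        history.append(interval)
    return history


if __name__ == "__main__":
    print(simulate([False, False, False, True, False]))
```

The cap bounds how long a faulty channel can go untested, which is the usual reason for truncating an exponential backoff.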

10.
Networks-on-Chip (NoCs) are recognized as the solution to the communication bottleneck in a Multi-processor System-on-Chip (MPSoC). As NoCs represent a significant part of system consumption, MPSoC designers expect accurate power models in order to produce energy-efficient systems. Currently, NoC simulators rely on power models that integrate link models without crosstalk modeling. In this study, we present Noxim-XT, a NoC simulator based on Noxim that embeds a link power model with crosstalk modeling. We show that the crosstalk effect has a deep impact on NoC energy consumption: our results demonstrate that classical models produce errors of up to 45.5% in the estimate of total NoC energy consumption. In addition, this tool is able to run application-based traffic, and we show that under application-based traffic, classical models overestimate the NoC energy consumption by up to 50%.

11.
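This paper analyzes the deterministic XY routing algorithm and four adaptive routing algorithms based on the turn model for networks-on-chip, and uses the Noxim simulator to evaluate the performance of the five routing algorithms under six synthetic traffic patterns. Experimental results show that under uniform random traffic, XY routing outperforms the adaptive routing algorithms; under transpose-1 and shuffle traffic, the odd-even routing algorithm outperforms the others; and under transpose-2, bit-reversal, and butterfly traffic, the negative-first routing algorithm outperforms the others.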
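For readers unfamiliar with deterministic XY routing on a 2D Mesh, a minimal sketch follows: a packet first travels along the X dimension until its column matches the destination, then along Y. The coordinate convention (NORTH increases y), the port names, and the helper names `xy_route` and `xy_path` are assumptions made for illustration; they are not taken from Noxim or from the paper.

```python
def xy_route(cur, dst):
    """Return the output port chosen at router `cur` for destination `dst`.

    Assumed convention: nodes are (x, y) tuples, EAST increases x,
    NORTH increases y.
    """
    cx, cy = cur
    dx, dy = dst
    if cx != dx:                      # correct the X offset first
        return "EAST" if dx > cx else "WEST"
    if cy != dy:                      # then correct the Y offset
        return "NORTH" if dy > cy else "SOUTH"
    return "LOCAL"                    # arrived: eject to the local core


STEP = {"EAST": (1, 0), "WEST": (-1, 0), "NORTH": (0, 1), "SOUTH": (0, -1)}


def xy_path(src, dst):
    """List every router visited from `src` to `dst` under XY routing."""
    path = [src]
    cur = src
    while True:
        port = xy_route(cur, dst)
        if port == "LOCAL":
            return path
        sx, sy = STEP[port]
        cur = (cur[0] + sx, cur[1] + sy)
        path.append(cur)


if __name__ == "__main__":
    print(xy_path((0, 0), (2, 1)))
```

Because a packet never turns from the Y dimension back into X, the channel dependency graph has no cycles, which is why XY routing is deadlock-free on a Mesh without extra virtual channels.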

12.
Shrinking Integrated Circuit (IC) feature sizes increase susceptibility to transient and permanent errors. The growing rate of such errors in ICs intensifies the need for a wide range of solutions addressing reliability at various levels of abstraction. The Network-on-Chip (NoC) architecture was introduced to address the increasing demand for communication bandwidth among processing cores. The structural redundancy inherent in NoC-based systems can be leveraged to improve reliability and compensate for the effects of failures. In this paper, we propose a fault-tolerant NoC router, NISHA, which stands for No-deadlock Interconnection of Subnets in Hierarchical Architectures. Armed with a new flow control mechanism, as well as an enhanced Virtual Channel (VC) regulator, the proposed router can mitigate the effects of both transient and permanent errors. NISHA supports dynamic/static virtual channel allocation with respect to local and global traffic; it thereby maintains a deadlock-free state in the presence of router or link failures in hierarchical topologies. Experimental results show enhanced operation of NoC applications as well as a decrease in average latency and energy consumption.

13.
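Network topology is an important aspect of NoC research. In some practical applications, NoC systems integrate multiple components with different functions, sizes, and communication requirements, and regular topologies are not well suited to this type of NoC; irregular Mesh networks are therefore applied to irregular NoC systems. To solve the problem that routing algorithms for regular Meshes cannot guarantee routing connectivity in irregular Meshes, a deadlock-free routing algorithm for irregular Meshes is proposed. Compared with other algorithms, it uses fewer virtual channels and selects better routing paths.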

14.
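A parameter-based hierarchical Mesh-interconnected network-on-chip architecture, PHNoC, is proposed to address the degradation in communication latency and throughput caused by growth in network-on-chip size. Clustered multi-level interconnection improves the scalability and connectivity of the network; parameters for the number of layers and the cluster type allow flexible configuration for different network sizes; and a cross-layer flow-control parameter controls and balances the traffic load between layers. Simulation experiments show that, across multiple traffic patterns and network sizes, PHNoC has a clear latency and throughput advantage over traditional flat or two-layer structures, with only modest increases in resource overhead and implementation complexity, indicating that adding multi-layer interconnect resources is an effective trade for higher communication performance.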

15.
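For multi-interface, multi-channel wireless Mesh networks, a distributed static channel assignment algorithm is proposed that is based on a weight combining link load and the link's "potential" interference degree. The weight is defined, and a method for building a weight-based linked list is given; the design rationale and implementation steps of the algorithm are described. Simulation results show that the algorithm adapts to both evenly and unevenly distributed traffic, and correspondingly improves network throughput and network performance.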

16.
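A Survey of Research on Three-Dimensional Networks-on-Chip   Total citations: 1, self-citations: 0, citations by others: 1
张大坤, 黄翠, 宋国治. 《软件学报》 (Journal of Software), 2016, 27(1): 155-187
Three-dimensional networks-on-chip (3D NoCs), with advantages such as shorter global interconnects, higher packaging density, and smaller footprint, have attracted great attention from academia and industry worldwide. Research on 3D NoCs will directly affect the future development of a country's 3D integrated circuit and 3D chip industries, and also bears on national security. In recent years, 3D NoCs have become an important direction in NoC research, and much progress has been made, but many challenging problems remain. This survey briefly introduces the basic issues of 3D NoCs; analyzes the state of research in China and abroad; discusses the key issues in 3D NoC research and groups them into 12 categories of research topics, including network topology, routing mechanisms, performance evaluation, fault-tolerant communication, power consumption, mapping, testing, switching techniques, quality of service, flow control, and resource network interfaces; reviews research progress on these key issues by category; and analyzes the open problems of 3D NoCs. It identifies the following as important future research topics. In topology: customized topology design, research and development of simulation platforms, prototype fabrication of 3D chips based on novel topologies, and the introduction of wireless technology. In routing algorithms: routing algorithms suited to 3D Torus, new routing algorithms that combine the advantages of oblivious and adaptive routing, and efficient routing algorithms for various novel topologies. In performance evaluation: tolerance of permanent faults, improving simulators to add modeling of physical links, and taking full account of communication locality. In power consumption: joint optimization of topology, mapping algorithm, routing algorithm, and placement; combining dynamic and static control; and more accurate 3D NoC power models. In mapping: thermal uniformity, optimization of mapping evaluation models under dynamic routing strategies, low-power mapping algorithms, and combined mapping based on optimization algorithms.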

17.
In IEEE 802.16 based wireless mesh networks (WMNs), TDMA (Time Division Multiple Access) is employed as the channel access method, only TDD (Time Division Duplex) is supported, and there are no clearly separated downlink and uplink subframes in the physical frame structure. Because uplink and downlink traffic have different characteristics, in that uplink traffic is distributed across the MSSs (Mesh Subscriber Stations) while downlink traffic is concentrated at the MBS (Mesh Base Station), different scheduling methods should be used for the uplink and the downlink. This paper presents a uniform slot allocation algorithm which is suitable for both uplinks and downlinks. To achieve higher spatial reuse and greater throughput, and to avoid frequent switching between receiving and transmitting within two adjacent time slots when a relay node forwards traffic, different link selection criteria are taken into account when allocating slots for uplinks and downlinks. A combined uplink and downlink slot allocation algorithm is proposed to further improve spatial reuse and network throughput. The proposed algorithms are evaluated by extensive simulations, and the results show that they perform well in terms of spatial reuse and network throughput. To the best of the authors' knowledge, this work is the first to consider combined uplink and downlink slot allocation under the centralized scheduling scheme in IEEE 802.16 based WMNs.

18.
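韩国栋, 孔峰, 沈剑良. 《计算机应用》 (Journal of Computer Applications), 2014, 34(10): 2761-2765
To address communication between distant nodes and between neighboring nodes in larger-scale networks-on-chip (NoCs), a hierarchical, clustered, layered network (CHM) structure based on region partitioning is proposed. On this basis, to address the network performance degradation caused by severe congestion at intermediate nodes, an adaptive algorithm based on source-region path selection is proposed. The algorithm exploits the regional property of the CHM structure to move the routing decision from the source node to the source region; it also adds adaptive node pairs to the existing bottom-layer and upper-layer node pairs and gives these node pairs more routing choices, thereby relieving network congestion. Simulation experiments show that, compared with the shortest-path algorithm, the saturation injection rate of the CHM structure under this algorithm increases by up to about 51% and 31% under synthetic and localized traffic patterns, respectively, so the algorithm effectively improves overall network throughput.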

19.
When a number of applications run simultaneously on a many-core chip multiprocessor (CMP) connected through a network-on-chip (NoC), a significant amount of on-chip traffic is one-to-many (multicast) in nature. Moreover, when multiple applications are mapped onto an NoC architecture with applicable traffic isolation constraints, the sub-networks onto which these applications are mapped tend to be irregular. In the literature, multicasting for irregular topologies is supported through either multiple unicasting or broadcasting, which, unfortunately, results in overly high power consumption and/or long network latency. To address this problem, a simple yet efficient hardware-based multicasting scheme is proposed in this paper. First, an irregular-oriented multicast strategy is proposed. Following this strategy, an irregular-oriented multicast routing algorithm can be designed based on any regular-mesh-based multicast routing algorithm. One such algorithm, namely Alternative Recursive Partitioning Multicasting (AL + RPM), is proposed based on RPM, which was originally designed for regular mesh topologies. The basic idea of AL + RPM is to find the output directions following the basic RPM algorithm and then decide whether to replicate the packets to the original output directions or to the alternative (AL) output directions based on the shape of the sub-network. The experimental results show that the proposed AL + RPM multicast algorithm consumes, on average, 14% and 20% less power than bLBDR (a broadcasting-based routing algorithm) and the multiple unicast scheme, respectively. In addition, AL + RPM has much lower network latency than these two approaches. Incorporating AL + RPM into a baseline router to support multicasting incurs a fairly modest area overhead of less than 5.5%.

20.
We address routing in Networks-On-Chip (NoC) architectures that use irregular mesh topologies with Long-Range Links (LRLs). These topologies create difficult conditions for routing algorithms, as standard algorithms assume a static, regular link structure and exploit the uniformity of regular meshes to avoid deadlock and maintain routability. We present a novel routing algorithm that can cope with these irregular topologies and adapt to run-time LRL insertion and topology reconfiguration. Our approach to accommodating dynamic topology reconfiguration is to use a new technique that decomposes routing relations into two stages: the calculation of output ports on the current minimal path and the application of routing restrictions designed to prevent deadlock. In addition, we present a selection function that uses local topology data to adaptively select optimal paths. The routing algorithm is shown to be deadlock-free, after which an analysis of all possible routing decisions in the region of an LRL is carried out. We show that the routing algorithm minimises the cost of sub-optimally placed LRLs and present the hop savings available. When applied to LRLs of less than seven hops, the overall traffic hop count and associated routing energy cost are reduced. In a simulated 8 × 8 network, the total input buffer usage across the network was reduced by 6.5%.
