1.

丁毓良张剑贤周端裘雪红《计算机工程与科学》2017,39(2):275-279

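To improve routing and communication efficiency in networks-on-chip (NoCs) with a Torus topology, this paper proposes Tline, a routing algorithm based on the idea of straight-line guidance. The algorithm unfolds the Torus-topology NoC into a Mesh-like coordinate plane, takes the straight line formed by a packet's source and destination nodes as the forwarding direction, and selects the transmission path direction according to the congestion status of neighboring nodes, which makes the routing partially adaptive. Experimental results show that, compared with the XY and OE routing algorithms, Tline delivers better routing performance under hotspot traffic and reduces average energy consumption by about 8%.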
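The abstract does not give Tline's details, so the following is only a minimal sketch of the geometric starting point it mentions: on a Torus, the displacement from source to destination should account for the wraparound links before any "straight line" is drawn. The function names `torus_offset` and `torus_direction` are hypothetical, and the congestion-aware path selection is not modeled.

```python
def torus_offset(src: int, dst: int, size: int) -> int:
    """Signed shortest offset from src to dst along one ring of `size` nodes."""
    forward = (dst - src) % size
    # Going forward costs `forward` hops; going backward costs `size - forward`.
    return forward if forward <= size - forward else forward - size


def torus_direction(src, dst, width, height):
    """Per-dimension signed offsets (dx, dy) of the shortest displacement.

    Assumption: nodes are (x, y) tuples on a width x height Torus.
    """
    dx = torus_offset(src[0], dst[0], width)
    dy = torus_offset(src[1], dst[1], height)
    return dx, dy


if __name__ == "__main__":
    # On an 8x8 Torus, node (7, 1) is one hop "west" of (0, 0) via wraparound.
    print(torus_direction((0, 0), (7, 1), 8, 8))
```

A negative offset means the wraparound link is the shorter way round in that dimension.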

2.

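Design of a low-latency, scalable router architecture for networks-on-chip (total citations: 1; self-citations: 0; citations by others: 1)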

张媛媛孙光苏厉金德鹏曾烈光《传感器与微系统》2012,31(8):134-136

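To allow a router in a network-on-chip to support multiple IP cores simultaneously while keeping good latency performance, a router architecture with distributed routing and arbitration is designed. Its arbitration module arbitrates according to the request status of each input port of the current router and the status of the corresponding input-port buffer of the next router. This arbitration method raises the success rate of packet transmission and thus lowers transmission latency, giving the router good latency performance. Simulation results also show that the router scales well in terms of area overhead.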

3.

Performance evaluation of mesh-based NoCs: Implementation of a new architecture and routing algorithm

Sudhanshu Choudhary Shafi Qureshi 《国际自动化与计算杂志》2012,9(4):403-413

This paper presents the results of experiments conducted in mesh networks on different routing algorithms, traffic generation schemes and switching schemes. A new network on chip (NoC) topology based on partial interconnection of mesh network is proposed and a routing algorithm supporting the proposed architecture is developed. The proposed architecture is similar to standard mesh networks, where four extra bidirectional channels are added which remove the congestion and hotspots compared to standard mesh networks with fewer channels. Significant improvement in delay (60% reduction) and throughput (60% increase) was observed using the proposed network and routing when compared with the ideal mesh networks. An increase in the number of channels makes the switches expensive and could increase the area and power consumption. However, the proposed network can be useful in high speed applications with some compromise on area and power.

4.

Research on a regular-tetrahedron fission topology for three-dimensional networks-on-chip

郑亚振张大坤《计算机应用研究》2019,36(1)

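This work studies a new regular-tetrahedron fission topology for three-dimensional networks-on-chip. It presents the generation process of the topology and designs its encoding and routing. The gpNoCsim NoC simulator is extended to three dimensions to run performance simulations of the topology. The simulation results show that, under uniform load, the average latency and average hop count of the regular-tetrahedron fission topology are both lower than those of the Mesh topology; at an injection rate of 0.02, average latency is 16.8% lower and average hop count is 5.5% lower than with Mesh. Under local load, when the injection rate exceeds 0.008, both average latency and average hop count improve markedly relative to Mesh; at an injection rate of 0.014, average latency is 18.7% lower and average hop count is 9.6% lower than with Mesh. These results indicate that the regular-tetrahedron fission topology can be used in the design of three-dimensional NoC topologies.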

5.

A congestion-aware Hamiltonian shortest-path routing algorithm for networks-on-chip

康子扬彭凌辉周干林博王蕾《计算机工程与科学》2022,44(6):986-993

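Brain-inspired processors can support the deployment of a variety of spiking neural networks (SNNs) to carry out a variety of tasks. A network-on-chip (NoC) can solve complex on-chip interconnect communication problems with relatively few resources and little power. Existing brain-inspired processors mostly use an NoC to connect multiple neuron cores and support communication between neurons. The instantaneous bursts of SNN communication within a time step generate a large number of spike packets in a short time. Under this communication behavior, the NoC saturates quickly, causing network congestion. Non-congestion-aware routing algorithms aggravate the congestion further, so handling these packets effectively within each time step, in order to reduce network latency and raise throughput, has become a problem that needs to be solved. The paper first analyzes the instantaneous burst communication characteristics of SNNs; it then proposes a congestion-aware Hamiltonian-path routing algorithm to reduce average NoC latency and raise throughput; finally, it implements the routing algorithm in Verilog HDL and evaluates its performance by simulation. In a 16×16 2D Mesh NoC, relative to a routing algorithm without congestion awareness, the proposed congestion-aware routing algorithm reduces average NoC latency by 13.9% and 15.9% and raises throughput by 21.6% and 16.8% under the quantity-burst and probability-burst modes, respectively.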

6.

Packet synchronization for synchronous optical deflection-routed interconnection networks

Feehrer J.R. Ramfelt L.H. 《Parallel and Distributed Systems, IEEE Transactions on》1996,7(6):605-611

Deflection routing resolves output port contention in packet switched multiprocessor interconnection networks by granting the preferred port to the highest priority packet and directing contending packets out other ports. When combined with optical links and switches, deflection routing yields simple bufferless nodes, high bit rates, scalable throughput, and low latency. We discuss the problem of packet synchronization in synchronous optical deflection networks with nodes distributed across boards, racks, and cabinets. Synchronous operation is feasible due to very predictable optical propagation delays. A routing control processor at each node examines arriving packets and assigns them to output ports. Packets arriving on different input ports must be bit wise aligned; there are no elastic buffers to correct for mismatched arrivals. “Time of flight” packet synchronization is done by balancing link delays during network design. Using a directed graph network model, we formulate a constrained minimization problem for minimizing link delays subject to synchronization and packaging constraints. We demonstrate our method on a ShuffleNet graph, and show modifications to handle multiple packet sizes and latency critical paths.
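As an illustration of the contention rule stated in the first sentence of this abstract (the preferred port goes to the highest-priority packet, and contending packets leave through other ports), here is a minimal sketch. The function name `deflect`, the `(priority, preferred_port)` tuple layout, and the choice of the lowest-numbered free port for deflected packets are assumptions made for the example, not details taken from the paper.

```python
def deflect(packets, num_ports):
    """Assign one output port per packet in a bufferless node.

    packets: list of (priority, preferred_port); larger priority wins.
    Returns the assigned ports in the same order as `packets`.
    Assumes at most `num_ports` packets arrive in one cycle.
    """
    if len(packets) > num_ports:
        raise ValueError("more packets than output ports")

    free = set(range(num_ports))
    assigned = [None] * len(packets)
    deflected = []

    # First pass: grant preferred ports in descending priority order.
    for i in sorted(range(len(packets)), key=lambda i: -packets[i][0]):
        preferred = packets[i][1]
        if preferred in free:
            assigned[i] = preferred
            free.remove(preferred)
        else:
            deflected.append(i)

    # Second pass: losers are deflected to any remaining port
    # (lowest-numbered free port here, an arbitrary choice).
    for i in deflected:
        port = min(free)
        free.remove(port)
        assigned[i] = port

    return assigned


if __name__ == "__main__":
    # Two packets want port 0; the priority-5 packet wins, the other is deflected.
    print(deflect([(1, 0), (5, 0), (3, 2)], num_ports=4))
```

Because every arriving packet is given some output port in the same cycle, the node needs no packet buffers, which is the property the abstract relies on.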

7.

Optimization and improvement of the Data Vortex topology for computer optical interconnection networks

董连永窦强王志伟齐星云窦文华《计算机工程与科学》2009,31(8)

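Data Vortex is a new type of self-routing, multi-hop optical packet switching network. To ease engineering implementation, it adopts a cylindrical multistage interconnection topology, a synchronous clock, and a deflection routing mechanism, which avoids the use of optical buffers and simplifies the routing logic. Compared with traditional interconnection networks, the Data Vortex network accepts relatively high traffic, but its latency advantage is not obvious. This paper uses theoretical analysis and simulation experiments to study how the Data Vortex angle parameter affects overall network performance, and optimizes the topology parameters of the Data Vortex network. In addition, by improving the structure of the innermost switching nodes, it gives the Data Vortex network a lower average latency. The improved 32×32 Data Vortex network is simulated under uniform load using OMNeT++; average packet latency is reduced by 8.9% to 16.5%.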

8.

Design of an ultra-high-radix router based on Chiplet integration technology

梁崇山戴艺徐炜遐《计算机工程与科学》2022,44(2):207-213

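High-bandwidth, low-latency high-radix routers play an important role in building large-scale, scalable interconnection networks. However, constrained by the ever-increasing design complexity of a single router chip and by the slowdown and stagnation of Moore's law and Dennard scaling, adding more ports to a single router chip is becoming increasingly difficult. Chiplet technology integrates multiple dies in a specific manner within one advanced package to form a large chip with specific functions, thereby addressing the problems of scale, development cost, and development cycle in chip design. Following the idea of Chiplet integration and reusing existing router chips, the paper proposes a Chiplet-based 128-port high-radix router; internally, this router is a network of multiple Switch Dies connected in a two-level fat-tree topology. Simulation tests with actual RTL-level code show that, compared with the single-chip approach to high-radix router design, the designed router achieves good performance while offering more ports.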

9.

A new adaptive routing algorithm for two-dimensional mesh networks-on-chip

肖灿文张民选赵志通《计算机工程与科学》2010,32(11):107-110

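This paper designs a new flow control strategy, called dimensional bubble flow control (DBFC), for two-dimensional mesh networks-on-chip. DBFC is designed and implemented by exploiting the property that, in virtual cut-through switching, message dependencies involve only adjacent buffer queues. The strategy builds on virtual cut-through (VCT) switching and a credit-based flow control mechanism, and realizes point-to-point flow control by analyzing port credit values and routing information. With DBFC in a two-dimensional mesh NoC, the adaptive dimensional bubble routing (ADBR) algorithm designed in this paper achieves deadlock-free, minimal-distance routing even if cyclic dependencies exist in the network. The paper provides a detailed proof of these conclusions. Finally, DBFC and ADBR are implemented by modifying the code of NOXIM, a widely used NoC simulation tool. The performance of ADBR is analyzed on NOXIM, and the results show that ADBR performs well.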

10.

A double-ring network-on-chip based on deflection routing

齐星云戴艺赖明澈常俊胜董德尊《计算机工程与科学》2021,43(3):381-388

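To reduce the design complexity of medium-scale networks-on-chip and improve network efficiency, a double-ring NoC architecture based on deflection routing is proposed. Its conflict resolution mechanism is studied, a simple and efficient routing algorithm is given, the architecture is implemented in a hardware description language, and a cycle-accurate network performance simulation environment is built. Simulation and experimental results show that, in small and medium-scale networks and under low network load (<40%), this double-ring network architecture, in terms of latency and throughput...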

11.

A congestion-avoiding communication protocol for networks-on-chip

李自迪蒋林李翠锦《小型微型计算机系统》2011,32(4)

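To balance the adaptivity and performance of the communication protocol, a congestion-avoiding NoC communication protocol is proposed. The protocol uses differentiated services and an adaptive routing algorithm: differentiated services provide quality-guaranteed service for data flows of different classes, and the adaptive routing algorithm is a congestion-avoiding, deadlock-free routing algorithm. Modeling and simulation in OPNET Modeler show that the protocol greatly improves the network's average link utilization and end-to-end delay.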

12.

A multicast routing algorithm for networks-on-chip based on congestion control

袁景凌刘华谢威蒋幸《计算机应用》2011,31(10):2630-2633

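To meet the increasingly diverse application requirements of networks-on-chip, multicast routing mechanisms have been applied to NoCs to make up for the shortcomings of traditional unicast communication. Taking Mesh- and Torus-type NoCs as examples, the paper analyzes three path-based multicast routing algorithms (XY routing, UpDown routing, and SubPartition routing) and studies the corresponding congestion control strategies. Simulation experiments show that, compared with unicast, multicast communication has lower average transmission latency and higher network throughput, with evenly distributed load; the advantage of the SubPartition routing algorithm in particular grows more pronounced as the network scales up. The proposed multicast congestion control mechanism makes more effective use of multicast communication and improves NoC performance.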

13.

Barrier synchronization on wormhole-routed networks

Yuzhong Sun Cheung P.Y.S. 《Parallel and Distributed Systems, IEEE Transactions on》2001,12(6):583-597

In this paper, we propose an efficient barrier synchronization scheme on networks with arbitrary topologies. We first present a distributed method for building a barrier routing tree. The barrier messages can be delivered adaptively according to the hierarchy of the established barrier tree to avoid congestion and faulty nodes in the network. We then propose a new technique, called the bandwidth-preempting technique, for a blocked barrier message to preempt a channel occupied by a data message so that the latency of a barrier message can be controlled without much effect on overall system performance. We also propose an analytical performance model and present simulation results for the performance evaluation of the proposed scheme. Performance evaluations show that the proposed scheme outperforms the existing algorithms for barrier synchronization.

14.

SimTile: an efficient simulator for tiled multi-core processors (in English)

刘涛季振洲王庆《计算机科学与探索》2010,4(12):1115-1120

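Traditional multi-core chips based on a shared bus run into a bottleneck as the number of cores grows. In the design of new Tiled CMP (chip multiprocessor) architectures, the on-chip core interconnection network plays an important role in improving scalability and execution efficiency. To achieve low-latency, high-bandwidth communication between cores, the simulation of on-chip multi-core interconnects built on high-speed point-to-point networks has become a research hotspot. By abstracting the functional-unit structure of an on-chip Tiled 16-core design, the SimTile simulator is designed and implemented. It offers a flexibly configurable on-chip multi-core processor design with a complete set of functional units, and supports an efficient globally shared cache and a high-speed on-chip routing structure. The simulator uses modular component configuration: the number of on-chip cores, the interconnection network structure, the data coherence protocol, global register communication, and the cache sharing mode can all be adjusted through a small set of parameters. Experiments show that the simulator executes efficiently, providing a flexible, efficient, and scalable new platform for on-chip multi-core research.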

15.

Application of the AS5040 in angle measurement (total citations: 1; self-citations: 0; citations by others: 1)

韩喜春吴东艳张鹏《传感器与微系统》2006,25(6):75-77

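The paper describes the working principle, usage, and hardware and software design of the AS5040 angle sensor chip, and introduces its application in a solar cell charging control system. The chip has analog and digital interfaces and can be combined with a magnet and a microprocessor to form an integrated smart sensor. Experiments show that the maximum error of the AS5040 in this system's measurements is 1.8°, which meets the design requirements, and that the chip can be widely applied in non-contact angle measurement.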

16.

Design, implementation and evaluation of a deadlock-free routing algorithm for concurrent computers

M. Cannataro G. Spezzano D. Talia E. Gallizzi 《Concurrency and Computation》1992,4(2):143-161

This paper describes the design, the implementation, and the performance results of a routing algorithm which provides deadlock-free communication in a tightly coupled message-passing concurrent computer. The algorithm is adaptive, isolated and uses the store-and-forward technique. It allows message communication between two processes regardless of where they are physically located on the network. The routing algorithm has many positive characteristics including provable deadlock freedom, guaranteed message arrival, and automatic local congestion reduction. It can be used as a basis for the design of high-level communication primitives. An Occam implementation on a network of INMOS Transputers is discussed. The experimental results show that the routing algorithm is effective in supporting process-to-process communication on a concurrent computer.

17.

An adaptive arbitration method for NoCs based on congestion prediction

杨盛光李丽徐懿张宇昂娄孝祥高明伦《计算机应用研究》2009,26(2):652-654

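Traditional arbitration methods used in bus systems or the Internet are no longer well suited to the NoC application environment. Focusing on the congestion state, the key factor affecting NoC system performance, the paper proposes an arbitration strategy based on global and local congestion prediction (GLCA) to improve NoC network latency. Experimental results show that, relative to the RR method, the new arbitration algorithm improves average packet latency and average throughput by up to 20.5% and 8%, respectively, and keeps its advantage under different load conditions. Synthesis results show that, compared with the RR method, the GLCA router incurs only a small increase in combinational logic (25.7%).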

18.

A routing algorithm for NoCs with an irregular 2D Mesh

徐欣王长山《计算机与现代化》2010,(5):111-114

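Network topology is an important aspect of NoC research. In some practical applications, an NoC system typically integrates multiple components with different functions, sizes, and communication requirements, and regular topologies are not suited to this type of NoC; irregular Mesh networks are therefore applied to irregular NoC systems. To solve the problem that routing algorithms for a regular Mesh cannot guarantee routing connectivity in an irregular Mesh, a deadlock-free routing algorithm for irregular Mesh networks is proposed. Compared with other algorithms, it uses fewer virtual channels and makes better routing path choices.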

19.

Deadlock-free routing for NoCs based on an irregular Mesh

段新明杨愚鲁《小型微型计算机系统》2008,29(7)

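The choice of network topology is an important issue in NoC design. Typical application-specific NoC systems today usually integrate multiple components with different functions, sizes, and communication requirements, and regular network topologies are not suited to this type of NoC; irregular Mesh networks have therefore been proposed and applied to NoC systems with irregular structure. To solve the problem that routing algorithms for a regular Mesh cannot guarantee routing connectivity in an irregular Mesh, this paper proposes a deadlock-free routing algorithm for irregular Mesh networks. However the layout of the components integrated in the NoC system changes, the algorithm always remains connected; that is, the algorithm is independent of the size and structure of the irregular Mesh. The algorithm also uses only a small number of virtual channels.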

20.

Bi-LCQ: A low-weight clustering-based Q-learning approach for NoCs

F. Farahnakian M. Ebrahimi M. Daneshtalab P. Liljeberg J. Plosila 《Microprocessors and Microsystems》2014

Network congestion has a negative impact on the performance of on-chip networks due to the increased packet latency. Many congestion-aware routing algorithms have been developed to alleviate traffic congestion over the network. In this paper, we propose a congestion-aware routing algorithm based on the Q-learning approach for avoiding congested areas in the network. By using the learning method, local and global congestion information of the network is provided for each switch. This information can be dynamically updated when a switch receives a packet. However, the Q-learning approach suffers from high area overhead in NoCs due to the need for a large routing table in each switch. In order to reduce the area overhead, we also present a clustering approach that decreases the number of routing tables by a factor of 4. Results show that the proposed approach achieves a significant performance improvement over the traditional Q-learning, C-routing, DBAR and Dynamic XY algorithms.
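The abstract does not spell out the update rule, so the sketch below shows only a common textbook formulation of Q-learning-based routing, not the Bi-LCQ algorithm itself and without its clustering step. Each switch keeps a table of estimated delivery delays per destination and output port, and nudges an entry toward the delay reported back by the neighbor. The names `q_update` and `best_next_hop`, the table layout, and the learning rate `alpha` are assumptions made for illustration.

```python
def q_update(q_table, dest, neighbor, local_delay, neighbor_estimate, alpha=0.5):
    """Move the delay estimate for (dest, neighbor) toward a new sample.

    local_delay: queueing plus link delay observed at this switch.
    neighbor_estimate: the neighbor's own best estimate to reach `dest`.
    """
    old = q_table[dest][neighbor]
    new = old + alpha * (local_delay + neighbor_estimate - old)
    q_table[dest][neighbor] = new
    return new


def best_next_hop(q_table, dest):
    """Pick the output port with the lowest estimated delay to `dest`."""
    return min(q_table[dest], key=q_table[dest].get)


if __name__ == "__main__":
    # Hypothetical table for one switch: estimated delay to node "D" per port.
    q = {"D": {"east": 10.0, "north": 6.0}}
    print(best_next_hop(q, "D"))                # north looks cheaper at first
    q_update(q, "D", "north", 4.0, 8.0)         # congestion reported via north
    q_update(q, "D", "north", 4.0, 8.0)
    print(q["D"]["north"], best_next_hop(q, "D"))
```

The table has one entry per destination and output port, which gives a feel for the per-switch routing table size that the abstract identifies as the source of area overhead.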