Similar Literature
20 similar documents found (search time: 234 ms)
1.
衡霞  支亚军  韩俊刚 《计算机科学》2013,40(Z6):220-222
Building on a study of network-on-chip quality of service, a 64-core network-on-chip architecture oriented towards multiprocessors is proposed. IP units generate packets of different types, and the network provides priority-based service to guarantee the low-latency requirements of high-priority packets. Performance statistics show that the model meets the quality-of-service requirements for every type of packet traffic exchanged among the processors.
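The abstract only states that the network offers priority levels so that high-priority packets see low delay. As a loose illustration (not the paper's design), the Python sketch below models an output-port arbiter with hypothetical traffic classes; the class names, priority values, and the `PriorityPort` interface are all assumptions.

```python
import heapq
from dataclasses import dataclass
from itertools import count

# Hypothetical traffic classes; the actual packet types and priority levels used
# in the paper are not given in the abstract. Lower value = served earlier.
PRIORITY = {"control": 0, "realtime": 1, "besteffort": 2}

_seq = count()  # tie-breaker so equal-priority packets stay in FIFO order

@dataclass
class Packet:
    kind: str
    payload: bytes

class PriorityPort:
    """Output-port arbiter that always forwards the highest-priority waiting packet."""
    def __init__(self):
        self._queue = []

    def enqueue(self, pkt: Packet) -> None:
        heapq.heappush(self._queue, (PRIORITY[pkt.kind], next(_seq), pkt))

    def dequeue(self) -> Packet | None:
        return heapq.heappop(self._queue)[2] if self._queue else None

port = PriorityPort()
port.enqueue(Packet("besteffort", b"bulk data"))
port.enqueue(Packet("realtime", b"audio frame"))
print(port.dequeue().kind)  # -> 'realtime': the high-priority packet leaves first
```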

2.
A Survey of Network-on-Chip Interconnect Topologies   (Cited: 1; self-citations: 0; cited by others: 1)
With the continued development of devices, process technology, and applications, chip multiprocessors have become mainstream, their scale keeps growing, and the number of processor cores integrated on a chip keeps increasing; the on-chip network that interconnects the cores and other on-chip components has gradually become one of the bottlenecks limiting chip-multiprocessor performance. The topology of a network-on-chip defines the physical layout of the nodes inside the network and how they are interconnected; it determines and influences the network's cost, latency, throughput, area, fault tolerance, and power consumption, and it also affects the routing strategy and the placement and routing of the network chip, making it one of the key issues in network-on-chip research. This paper compares different network-on-chip topologies, analyzes the performance of each structure, and offers suggestions for future research on NoC topologies.

3.
Performance Analysis of Two-Dimensional On-Chip Network Interconnects   (Cited: 1; self-citations: 1; cited by others: 1)
On-chip interconnection networks have increasingly become one of the important factors affecting chip-multiprocessor performance, and almost all interconnect structures have evolved from two-dimensional networks. This paper first analyzes the static characteristics of several common two-dimensional networks whose internal node degree is 4 and proposes a new routing structure and communication protocol for two-dimensional on-chip interconnects; it then analyzes the dynamic characteristics of the different networks under a globally uniform random traffic model while varying the network size and the traffic intensity. With communication cost expressed as the number of links, a new composite performance metric, the network's delay-load capacity per unit cost, is proposed. Finally, the overall performance of the two-dimensional on-chip interconnects is compared and the situations to which each is suited are identified.
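The abstract names the metric (delay-load capacity per unit cost, with cost measured in links) but does not give its formula, so the sketch below is only one plausible reading, offered as an assumption rather than the paper's definition; the example numbers are likewise hypothetical.

```python
# One possible formulation (an assumption, not the paper's definition): the load a
# network can sustain, discounted by its average packet latency and by its cost
# expressed as the number of links. Higher is better.
def unit_cost_delay_load_capacity(saturation_load: float,
                                  avg_latency_cycles: float,
                                  num_links: int) -> float:
    return saturation_load / (avg_latency_cycles * num_links)

# Hypothetical figures for two 64-node 2D networks (an 8x8 mesh has 112 links,
# an 8x8 torus has 128); the load and latency values are made-up examples.
mesh  = unit_cost_delay_load_capacity(0.35, 28.0, 112)
torus = unit_cost_delay_load_capacity(0.45, 24.0, 128)
print(f"mesh={mesh:.5f}  torus={torus:.5f}")
```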

4.
With the continued development of devices, process technology, and applications, chip multiprocessors have become mainstream, their scale keeps growing, and the number of processor cores integrated on a chip keeps increasing; the on-chip network that interconnects the cores and other on-chip components has gradually become one of the bottlenecks limiting chip-multiprocessor performance. The topology of a network-on-chip defines the physical layout of the nodes inside the network and how they are interconnected; it determines and influences the network's cost, latency, throughput, area, fault tolerance...

5.
In networks-on-chip (NoC), replacing conventional electrical signaling with optical communication to obtain low latency and low power has become an emerging research direction, the optical network-on-chip (ONoC). This paper proposes a new wavelength-routed on-chip network with bidirectional transmission: the structure routes between network nodes by examining the wavelength of the modulated optical signal, and achieves bidirectional data transmission by sharing devices and transmission channels. Compared with a conventional electrical network, the proposed bidirectional structure reduces hardware overhead by 50% and chip-area overhead by 70%, improves device utilization, lowers network transmission latency, and greatly improves transmission performance, which is of real significance for optical on-chip interconnection networks.

6.
Hierarchical-Ring Network-on-Chip Interconnection   (Cited: 1; self-citations: 0; cited by others: 1)
In large-scale and very-large-scale on-chip interconnection networks, the relatively poor performance of two-dimensional interconnects makes multi-dimensional interconnects one of the candidate alternatives. This paper first designs a hierarchical ring interconnect structure based on region partitioning and analyzes its static interconnection properties; it then designs a routing structure and path-finding method for the hierarchical ring based on Karnaugh-map coding, tests network performance under a uniform traffic pattern with different buffer configurations for the hierarchical ring's cascade links, and analyzes in detail the dynamic network characteristics obtained when those cascade-link buffers are sized as a geometric sequence. Finally, based on performance comparisons with Mesh and other two-dimensional on-chip interconnects, the situations in which the hierarchical ring is applicable are given. The experimental results show that, although its performance is poor in smaller networks, the hierarchical ring can interconnect large-scale and very-large-scale on-chip networks at low cost and with high performance; the single-ring hierarchical interconnect has better overall performance under lower network loads, whereas the dual-ring hierarchical interconnect has a larger load capacity and performs better under higher loads.

7.
As the mainstream solution for interconnecting many-core chips, a network-on-chip (NoC) owes much of its performance to the network topology, and the effectiveness of the topology is in turn directly determined by the network routers. Router design for a specific topology is therefore of real research interest. This work applies the XY routing algorithm in the router nodes and designs an on-chip router based on a 2D Mesh topology, round-robin arbitration, and wormhole flow control; the router was functionally verified with Modelsim. The results show that the router handles flit data correctly and can send and receive packets as expected.
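The XY routing algorithm the abstract applies is standard dimension-ordered routing on a 2D Mesh: route the packet fully along X, then along Y, which is deadlock-free on a mesh. The short sketch below illustrates the port decision; the coordinate scheme and port names are illustrative, not taken from the paper's RTL.

```python
# Dimension-ordered XY routing on a 2D Mesh: correct X first, then Y.
def xy_route(cur: tuple[int, int], dst: tuple[int, int]) -> str:
    cx, cy = cur
    dx, dy = dst
    if dx > cx:
        return "EAST"
    if dx < cx:
        return "WEST"
    if dy > cy:
        return "NORTH"
    if dy < cy:
        return "SOUTH"
    return "LOCAL"  # arrived: eject to the attached IP core

# Trace a header flit from router (0, 0) to router (2, 1)
hop = (0, 0)
while (port := xy_route(hop, (2, 1))) != "LOCAL":
    print(hop, "->", port)
    hop = (hop[0] + (port == "EAST") - (port == "WEST"),
           hop[1] + (port == "NORTH") - (port == "SOUTH"))
```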

8.
Design of a Network-on-Chip Router Supporting Quality of Service   (Cited: 1; self-citations: 0; cited by others: 1)
The complex applications of systems-on-chip have made on-chip interconnect a bottleneck for system performance, giving rise to communication architectures built around the network-on-chip. The router is the key component of a NoC: it carries data across the on-chip topology. This paper presents a NoC router that supports quality of service. It uses connection-oriented, fine-grained data switching to provide strict end-to-end delay guarantees for guaranteed-service traffic, connectionless switching to support best-effort traffic, and a routing algorithm that balances on-chip communication load, effectively improving average communication performance.
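As a rough sketch of the two service classes the abstract describes (it is not the paper's router microarchitecture), the code below dispatches a flit either over a pre-established connection, for guaranteed service with a bounded end-to-end delay, or into a best-effort queue; the connection table and flit fields are assumptions.

```python
from collections import deque

# Hypothetical connection table: (src, dst) pairs for which a guaranteed-service
# connection has already been set up in the NoC.
guaranteed_connections = {(0, 5), (3, 7)}

best_effort_queue: deque = deque()

def dispatch(src: int, dst: int, flit: bytes) -> str:
    """Guaranteed-service flits use their reserved connection; the rest go best-effort."""
    if (src, dst) in guaranteed_connections:
        # Connection-oriented path: forwarded on reserved resources, so the
        # end-to-end delay stays bounded.
        return "GS"
    # Connectionless path: queued and forwarded with leftover link bandwidth.
    best_effort_queue.append((src, dst, flit))
    return "BE"

print(dispatch(0, 5, b"stream"))  # GS
print(dispatch(1, 2, b"status"))  # BE
```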

9.
A parameterized hierarchical Mesh network-on-chip, PHNoC, is proposed to counter the growth in communication latency and the drop in throughput caused by scaling up the on-chip network. It adopts clustered, multi-level interconnection to improve scalability and connectivity; introduces layer-count and cluster-type parameters so that different network sizes can be configured flexibly; and introduces a cross-layer flow-control parameter to regulate and balance inter-layer traffic. Simulations show that, across several traffic patterns and network sizes, PHNoC clearly outperforms conventional flat or two-layer structures in latency and throughput while adding only modest resource overhead and implementation complexity, indicating that extra multi-level interconnect resources are an effective trade for better communication performance.

10.
A Network-Interconnected Multithreaded Processor   (Cited: 1; self-citations: 0; cited by others: 1)
Combining a scalable on-chip interconnection network with a latency-hiding simultaneous multithreading architecture, this paper proposes the Networked Multi-Threaded (NMT) processor architecture. Simulations built on the SMTSIM simulator show that the NMT architecture offers good scalability and parallelism, and the performance requirements it places on the on-chip interconnection network are derived.

11.
To address the high power consumption and poor IP-core reusability caused by the synchronous clock in traditional system-on-chip design, a fast, delay-insensitive synchronous-asynchronous interface circuit for multi-core systems-on-chip and networks-on-chip is proposed. The interface is implemented as a ring FIFO built from threshold gates; it removes the synchronous clock, enables high-speed data transfer from synchronously clocked modules to asynchronous modules, supports multiple data-transfer protocols, and keeps the transfers delay-insensitive. Simulated with the Spice model of a 0.18 μm standard CMOS process, a transfer interface built from a 3-stage ring FIFO shows a latency of 613 ps and an average energy of 3.05 pJ per transfer request, meeting the high-speed, low-power, robust, and reusable design requirements of multi-core SoCs and NoC chips.

12.
Future chip-multiprocessors (CMP) will integrate many cores interconnected with a high-bandwidth and low-latency scalable network-on-chip (NoC). However, the potential that this approach offers at the transport level needs to be paired with an analogous paradigm shift at the higher levels. In particular, the standard shared-memory programming model fails to address the scalability requirements of the many-core era. Fast data exchange among the cores and low-latency synchronization are desirable but hard to achieve in practice because of the memory hierarchy. The message-passing paradigm instead permits direct data communication and synchronization between the cores, while shared memory can still be used for instruction fetch. Hence, we propose a hybrid approach that combines shared memory and message passing in a single general-purpose CMP architecture, allowing efficient execution of applications developed with either parallel programming approach. Cores fetch instructions from a hierarchical memory and exchange their data either through the same memory, for compatibility with existing software, or directly through the fast NoC. We developed a fast SystemC-based cycle-accurate simulator for design-space exploration, which we used to evaluate performance with real benchmarks. The various components have been RTL-coded and mapped to a 45 nm CMOS technology to build a silicon-area model used to select the best architectural configurations.

13.
Large-scale chip-multiprocessors (CMPs) need a scalable communication structure characterized by low cost, low power, and high performance to meet their on-chip communication requirements. This paper presents a hybrid circuit-switched (HCS) network for on-chip communication in large-scale CMPs. The HCS network, which is Advanced Microcontroller Bus Architecture (AMBA) compatible, is composed of bufferless switches, pipelined channels, and network interfaces. Packets are transferred with a hybrid transmission scheme: a message consisting of a single packet is sent with packet switching, whereas a message containing multiple packets is sent with circuit switching. We evaluate HCS networks with different channel depths and then compare the HCS network with the Stanford elastic buffer (EB) network. Our results show that the HCS network with a channel depth of two requires 83% less power and occupies 32% less area than the EB network. Furthermore, under maximum frequency and a single traffic pattern, it provides 37% lower zero-load latency, 390% higher maximum throughput per unit power, and 19% higher maximum throughput per unit area than the EB network.
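The hybrid transmission rule in the abstract is simple to state: a single-packet message goes through packet switching, and a multi-packet message sets up a circuit first. The sketch below captures just that decision; the `network` object and its setup/teardown calls are placeholders, not the HCS network's actual interface.

```python
from typing import List

def send_message(packets: List[bytes], src: int, dst: int, network) -> None:
    """Hybrid scheme: packet switching for one-packet messages, circuit switching otherwise."""
    if len(packets) == 1:
        # Packet switching: the lone packet is routed hop by hop on its own.
        network.send_packet(src, dst, packets[0])
    else:
        # Circuit switching: reserve a path once, stream every packet, then release it.
        circuit = network.setup_circuit(src, dst)
        for pkt in packets:
            circuit.send(pkt)
        network.teardown_circuit(circuit)
```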

14.
Although significant research has been undertaken to reduce high-level energy consumption in a data centre, there has been very little focus on reducing storage-drive energy consumption via the intelligent allocation of workload commands at the file-system level. This paper presents a method for optimising drive energy consumption within a custom-built storage cluster containing multiple drives, using multi-objective goal-attainment optimisation. Significantly, the model developed was based on actual power-consumption values (from current/voltage sensors on the drives themselves), which is rare in this field. The results showed that command energy savings of up to 87% (17% of overall energy) could be made by optimising the allocation of incoming commands to drives within a storage cluster for different workloads. More significantly, the transparency of the method meant that it showed exactly how such savings could be made and on which drives. It also highlighted that, whilst it is well known that solid-state drives use less energy than traditional hard disk drives, the difference is not consistent across data-transfer sizes: it is far larger for small transfers (less than or equal to 4 kB), and our algorithm utilised this. Finally, it highlights how much larger energy savings can be made by using the optimisation results to show which drives can safely be put into a low-power state without affecting storage-cluster performance.
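The paper's actual optimiser is multi-objective goal attainment over measured drive power; the sketch below is only a simplified stand-in for the observation it reports, namely that the SSD energy advantage is much larger for transfers of 4 kB or less, so small commands are steered to solid-state drives. The drive names, the 4 kB hard cut-off, and the queue-depth tie-break are assumptions.

```python
SMALL_TRANSFER = 4 * 1024  # bytes; the abstract reports the SSD advantage is largest here

# Hypothetical cluster layout
DRIVES = {
    "ssd0": {"type": "ssd"},
    "ssd1": {"type": "ssd"},
    "hdd0": {"type": "hdd"},
}

def allocate(command_size: int, queue_depth: dict[str, int]) -> str:
    """Pick a drive for one incoming command, preferring SSDs for small transfers."""
    preferred = "ssd" if command_size <= SMALL_TRANSFER else None
    candidates = [d for d, info in DRIVES.items()
                  if preferred is None or info["type"] == preferred]
    # Break ties by current queue depth so no single drive becomes a hotspot.
    return min(candidates, key=lambda d: queue_depth.get(d, 0))

depths = {"ssd0": 3, "ssd1": 1, "hdd0": 0}
print(allocate(2048, depths))     # small transfer -> least-loaded SSD ('ssd1')
print(allocate(1 << 20, depths))  # large transfer -> any drive, least loaded ('hdd0')
```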

15.
The large volume of data transfer and the remote control required in industrial real-time control place extremely high demands on data transmission speed. This paper presents a high-speed interface circuit for industrial real-time control built around the IMS C011 high-speed serial/parallel converter, reaching a data rate of 20 Mbit/s over optical fiber. Based on this interface circuit, continuous control and data acquisition were implemented for industrial two-phase-flow process tomography systems (an electrical resistance tomography system and an electromagnetic tomography system), achieving a high data-acquisition rate.

16.
Research on a Network Logical Storage System   (Cited: 7; self-citations: 0; cited by others: 7)
This paper discusses a new network logical storage system architecture that relieves the excessive host load caused by high-speed data downloads on PCs and raises the system's bandwidth. The implementation approach for network logical storage and a prototype system are also discussed.

17.
Performance Testing and Analysis of the TCP/IP Protocol on Embedded Systems   (Cited: 2; self-citations: 0; cited by others: 2)
陈辉  陈虎  奚建清 《计算机工程》2007,33(21):99-101
TCP/IP is the fundamental protocol of network transmission, so performance testing of the TCP/IP stack on embedded systems matters for the development of related products. By running packet-transmission tests over links between a PC and the PXA255 and AT9200 embedded systems, this paper analyzes the various overheads of embedded TCP/IP and obtains an optimized server transfer size. It estimates the maximum TCP/IP throughput the embedded systems can reach and identifies the key factors and bottlenecks that limit TCP/IP performance.

18.
Research on Channel Estimation for OFDM Underwater Acoustic Communication   (Cited: 1; self-citations: 0; cited by others: 1)
The underwater acoustic channel is an extremely complex, time-, space-, and frequency-varying channel; its narrow bandwidth, strong multipath interference, and severe signal fading have long been the main obstacles to reliable high-speed underwater data transmission. Orthogonal frequency-division multiplexing (OFDM) is a parallel transmission technique that has become popular in digital communications in recent years; its core idea is to split the available band into multiple orthogonal subchannels and modulate the high-rate serial stream onto these subcarriers in parallel. This work studies channel estimation for OFDM underwater acoustic communication using three different pilot patterns and analyzes, through simulation, the estimation performance obtained with each pattern.
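The abstract compares three pilot patterns without naming them; as a generic reference point (an assumption, not the paper's scheme), the sketch below does a least-squares estimate at comb-type pilot subcarriers and linearly interpolates to the remaining subcarriers. It requires NumPy.

```python
import numpy as np

N_SC = 64                                            # subcarriers per OFDM symbol
pilot_idx = np.r_[np.arange(0, N_SC, 8), N_SC - 1]   # comb pilots, plus the band edge
pilot_sym = np.ones(len(pilot_idx), dtype=complex)   # known pilot values (all 1+0j)

def estimate_channel(rx_symbol: np.ndarray) -> np.ndarray:
    """LS estimate H = Y/X at the pilots, linearly interpolated to all subcarriers."""
    h_ls = rx_symbol[pilot_idx] / pilot_sym
    h_real = np.interp(np.arange(N_SC), pilot_idx, h_ls.real)
    h_imag = np.interp(np.arange(N_SC), pilot_idx, h_ls.imag)
    return h_real + 1j * h_imag

# Toy check against a synthetic two-tap multipath channel (no noise)
h_true = np.fft.fft(np.r_[1.0, 0.5, np.zeros(N_SC - 2)])
rx = h_true * np.ones(N_SC, dtype=complex)           # every subcarrier carries a known 1
print(np.max(np.abs(estimate_channel(rx) - h_true)))  # interpolation error stays small relative to |H|
```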

19.
To make data exchange fast enough to support current communication systems and networks, a high-speed switching system with low transmission delay and low data loss is required. Many researchers have used statistical time-division multiplexing techniques to design switching systems for higher throughput. In such a switching system with n input/output ports, however, the internal execution speed must be n times that of a system with a single input/output port, a design philosophy that does not scale with the demand for ever faster systems. To overcome these drawbacks, this paper proposes a novel architecture, the Parallel Input Parallel Output Register Switching System (PIPORS). PIPORS is based on the interconnection of small distributed Shared Memory Modules (SMM) and a Shift Register Switch Array (SRSA), a construction that accelerates switching. In addition, the number of input/output ports can easily be extended to provide higher capacity as the amount of data transferred through the system keeps growing; three simple methods for extending the input/output ports and the internal memory capacity are presented. To evaluate the proposed system, PIPORS is compared with a Central Shared Memory Switching system (CSMS) with respect to total memory required, data-loss probability, transmission delay, and switching performance; the comparison shows that PIPORS achieves better performance.

20.
The simulation of interconnect architectures can be a time-consuming part of the design flow of on-chip multiprocessors. Accurate simulation of state-of-the-art network-on-chip interconnects can take several hours for realistic application examples, and this process must be repeated for each design iteration because the interactions between design choices can greatly affect the overall throughput and latency of the system. This paper presents a series of network-on-chip transaction-level model (TLM) algorithms that provide a highly abstracted view of the process of data transmission in priority-preemptive and non-preemptive networks-on-chip, permitting a major reduction in simulation event count. These simulation models are tested using two realistic application case studies and with synthetic traffic. The results demonstrate that these lightweight TLM simulation models can produce latency figures accurate to within a few flits for the majority of flows, and link dynamic power consumption modelling that is more than 93% accurate, while simulating 2.5 to 3 orders of magnitude faster than a cycle-accurate model of the same interconnect.
