Similar articles
1.
Design Trade-offs in Customized On-chip Crossbar Schedulers   Total citations: 1, self-citations: 0, citations by others: 1
In this paper, we present a design and an analysis of customized crossbar schedulers for reconfigurable on-chip crossbar networks. In order to alleviate the scalability problem in a conventional crossbar network, we propose adaptive schedulers on customized crossbar ports. Specifically, we present a scheduler with a weighted round robin arbitration scheme that takes into account the bandwidth requirements of specific applications. In addition, we propose the sharing of schedulers among multiple ports in order to reduce the implementation cost. The proposed schedulers arbitrate on-demand (at design time) interconnects and adhere to the link bandwidth requirements, where physical topologies are identical to logical topologies for given applications. Considering conventional crossbar schedulers as reference designs, a comparative performance analysis is conducted. The hardware scheduler modules are parameterized. Experiments with practical applications show that our custom schedulers occupy up to 83% less area, and maintain better performance compared to the reference schedulers.
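The weighted round robin arbitration mentioned in this abstract can be illustrated with a small behavioral model. This is only a sketch under stated assumptions, not the authors' hardware scheduler: the class name `WeightedRoundRobinArbiter`, the per-round credit counters, and the choice to serve a port in a burst until its credit runs out are all hypothetical, chosen to show how integer weights translate into bandwidth shares.

```python
from typing import List, Optional


class WeightedRoundRobinArbiter:
    """Grants one port per cycle; port i gets up to weights[i] grants per round.

    Hypothetical behavioral model; the abstract does not describe the real design.
    """

    def __init__(self, weights: List[int]) -> None:
        if not weights or any(w <= 0 for w in weights):
            raise ValueError("weights must be a non-empty list of positive integers")
        self.weights = list(weights)
        self.credits = list(weights)  # grants left for each port in the current round
        self.pointer = 0              # port examined first on the next cycle

    def grant(self, requests: List[bool]) -> Optional[int]:
        """Return the index of the granted port, or None if no port requests."""
        if not any(requests):
            return None
        n = len(self.weights)
        # Start a new round when no requesting port has credit left.
        if not any(requests[i] and self.credits[i] > 0 for i in range(n)):
            self.credits = list(self.weights)
        for offset in range(n):
            port = (self.pointer + offset) % n
            if requests[port] and self.credits[port] > 0:
                self.credits[port] -= 1
                # Stay on this port while it has credit, otherwise move past it.
                self.pointer = port if self.credits[port] > 0 else (port + 1) % n
                return port
        return None  # not reached: the reset above guarantees an eligible port
```

With weights `[3, 1]` and both ports requesting every cycle, port 0 receives three grants for each grant to port 1, so the weights act as relative bandwidth shares. Because credits are refreshed as soon as no requesting port has any left, a lone requester is never stalled by idle ports.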

2.
Nanowire crossbars are among the most promising circuit solutions for nanoelectronics. However, it is still unclear whether or how they can be competitive with their MOSFET counterparts in implementing logic circuits. We analyze nanowire crossbars in area, speed, and power, in comparison with their MOSFET counterparts. We show that nanowire crossbars do not scale well in terms of logic density and speed. To achieve performance close to that of their MOSFET counterparts, crossbar circuits need faster field-effect transistors (FETs) to compensate for the high resistance of nanowires. Motivated by the analysis and comparative study, we propose a crossbar cell design based on judicious use of silicon nanowires. The crossbar cell is compatible with conventional MOSFET fabrication and design methodologies, in particular, standard cell-based integrated circuit design. We evaluate logic circuits synthesized with crossbar cells and MOSFET cells based on the MCNC91 benchmark. The results show that crossbar cells can provide a density advantage of more than four times over traditional MOSFET circuits with the same process technology, while achieving close performance and consuming less than one third of the power.

3.
《Microelectronics Journal》2014,45(11):1533-1541
The crossbar array is a promising nanoscale architecture that can be used for logic circuit implementation. In this work, a graphene nanoribbon (GNR) based crossbar architecture is proposed. This design uses parallel GNRs as the device channel and metal as the gate, source and drain contacts. Schottky-barrier type graphene nanoribbon field-effect transistors (SB-GNRFETs) are formed at the cross points of the GNRs and the metallic gates. Benchmark circuits are implemented using the proposed crossbar, Si-CMOS and multi-gate Si-CMOS approaches to evaluate the performance of the crossbar architecture compared to conventional CMOS logic design. A compact SPICE model of the SB-GNRFET was used to simulate crossbar-based circuits. The CMOS circuits are also simulated using 16 nm technology parameters. Simulation results of benchmark circuits using the SIS synthesis tool indicate that the GNR-based crossbar circuits outperform conventional CMOS circuits in low-power applications. Area-optimized cell libraries are implemented based on the asymmetric crossbar architecture. The area of the circuit can be reduced further using this architecture at the expense of higher delay. The crossbar cells can be combined with CMOS cells to achieve better performance in terms of energy-delay product (EDP).

4.
Topology/Floorplan/Pipeline Co-Design of Cascaded Crossbar Bus   Total citations: 1, self-citations: 0, citations by others: 1
On-chip bus design has a significant impact on the die area, power consumption, performance and design cycle of complex system-on-chips (SoCs). In particular, for high-frequency systems whose on-chip buses are pipelined extensively to cope with long wire delays, a naive bus design may incur a significant area/power cost, mostly due to bus pipelining. The topology, floorplan, and pipeline are the most important design factors that affect the cost and frequency of the on-chip bus. Since they are strongly correlated with each other, it is imperative to codesign all three. In this paper, we present an automated codesign method for cascaded crossbar bus design. We present CADBUS (CAscadeD crossbar BUS design tool), an automated tool for AXI-based cascaded crossbar bus architecture design. The primary objective of this study is to design a cascaded crossbar bus, including the topology/floorplan/bus pipelines, having minimum area/power cost while satisfying the given constraints of communication bandwidth/latency or frequency. Experimental results on three industrial-strength SoCs show that, compared to the existing approach, the proposed method saves 11.6%–34.2% in bus area and 9.9%–33.5% in power consumption.

5.
Zhu Yan 《电子技术》 (Electronic Technology) 2010,47(3):56-58
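In a multi-core CPU, contention arises when multiple processor cores need to access memory and I/O ports at the same time, and a conventional bus then degrades system performance. Using a non-blocking CPU-Cache crossbar network for point-to-point transfers largely solves this problem. This paper compares the crossbar with the conventional shared bus and presents a full-custom circuit design of the crossbar.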

6.
Boxer  A. 《Spectrum, IEEE》1995,32(2):41-45
No aspect of computer design is sacred, not even the system bus, which is giving way to switches in multiprocessing systems, where performance is key. System buses, which string computers together out of circuit boards, have come to strangle system performance in many cases. Another interconnection architecture, though, can free a system from the bus's clutch. Variously known as a switch, crossbar, or crosspoint, it has long been used in speciality computers and is now making its way into lower-cost machines. Meanwhile, silicon and packaging technology have been refined to the point that the crossbar architecture can vie with the system bus for a place in low-cost multiprocessors. More specifically, the crossbar is well suited to use in distributed memory systems, where there is a need for broad pathways for communications between the chunks of memory themselves. The roots of such an approach go deep. In fact, it may be said to have started with an idea for keeping as much data traffic as possible out of general circulation: cache memory.

7.
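The macro-SIMD short-vector management unit is an important component of high-performance general-purpose microprocessors and media processors. This paper proposes a crossbar-based design of a macro-SIMD short-vector management unit for multimedia data processing such as audio, video, and network communication. The design overcomes the mismatch between data structures and system hardware in traditional SIMD architectures and meets the requirements of next-generation high-performance computing.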

8.
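The crossbar is a key part of a network-on-chip router. A crossbar can be implemented with tri-state elements or with multiplexers. This paper compares the area and power overheads of several crossbar implementation schemes and designs a crossbar scheduling mechanism based on the iSLIP algorithm. A crossbar implemented with multiplexers built from basic logic gates has a considerable advantage in power and area over a crossbar implemented with tri-state gates. The network-on-chip crossbar implemented with the iSLIP algorithm has the highest upper limit on operating frequency.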
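The iSLIP scheduling mentioned in this abstract is usually described as a request-grant-accept handshake with round-robin pointers. The sketch below models one such iteration in software; it is an illustration of the commonly described algorithm, not the hardware design from the paper. The function name `islip_iteration`, the `voq` occupancy matrix, and the pointer lists are hypothetical names introduced here.

```python
from typing import Dict, List


def islip_iteration(voq: List[List[int]],
                    grant_ptr: List[int],
                    accept_ptr: List[int]) -> Dict[int, int]:
    """Run one request-grant-accept iteration and return {input: output} matches.

    voq[i][j] is the number of cells queued at input i for output j (assumed
    layout). grant_ptr[j] and accept_ptr[i] are round-robin pointers that are
    updated in place. Queues are not modified; the caller dequeues matched cells.
    """
    n = len(voq)
    # Request phase is implicit: input i requests output j whenever voq[i][j] > 0.
    # Grant phase: each output grants the first requesting input at or after its pointer.
    grants: Dict[int, List[int]] = {}  # input -> outputs that granted it
    for out in range(n):
        for k in range(n):
            inp = (grant_ptr[out] + k) % n
            if voq[inp][out] > 0:
                grants.setdefault(inp, []).append(out)
                break
    # Accept phase: each input accepts the first granting output at or after its pointer.
    matches: Dict[int, int] = {}
    for inp, outs in grants.items():
        chosen = min(outs, key=lambda o: (o - accept_ptr[inp]) % n)
        matches[inp] = chosen
        # Pointers move one past the matched port only when a grant is accepted.
        accept_ptr[inp] = (chosen + 1) % n
        grant_ptr[chosen] = (inp + 1) % n
    return matches
```

Updating a grant pointer only when its grant is accepted is what lets the output pointers drift apart under load: in a fully loaded 2x2 switch the first call matches a single pair, and the second call already matches both inputs.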

9.
We present the design and experimental demonstration of a 5-b serial-to-parallel decoder for a crossbar application. A serial train of seven bits is provided at the input with the first five being the code for selecting one of 32 output lines. The last two constitute the code that determines if the selected decoder in the crossbar switch should generate an output. Several circuit innovations were needed to meet the severe restrictions on power, current, and area required for the crossbar application. Operation of the circuit was demonstrated at 2 Gb
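The seven-bit frame described in this abstract can be modeled in a few lines. This is a behavioral sketch only: the abstract does not state the bit order or which two-bit code enables an output, so the MSB-first address order and the `ENABLE_CODE` value below are assumptions, and `decode_frame` is a hypothetical name.

```python
from typing import List, Optional

# Hypothetical: the abstract does not give the 2-bit code that enables an output.
ENABLE_CODE = (1, 0)


def decode_frame(bits: List[int]) -> Optional[int]:
    """Return the selected output line (0-31), or None if the frame enables no output."""
    if len(bits) != 7 or any(b not in (0, 1) for b in bits):
        raise ValueError("expected exactly seven bits, each 0 or 1")
    address = 0
    for b in bits[:5]:  # assumed MSB-first shift-in of the 5-bit line address
        address = (address << 1) | b
    enabled = tuple(bits[5:]) == ENABLE_CODE
    return address if enabled else None
```

For example, under these assumptions the frame `[1, 0, 1, 0, 1, 1, 0]` selects line 21, while the same address followed by `0, 0` produces no output.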

10.
Multi-FPGA systems (MFSs) are used as custom computing machines, logic emulators and rapid prototyping vehicles. A key aspect of these systems is their programmable routing architecture, which is the manner in which wires, FPGAs and field-programmable interconnect devices (FPIDs) are connected. Several routing architectures for MFSs have been proposed, and previous research has shown that the partial crossbar is one of the best existing architectures. In this paper, we propose a new routing architecture, called the hybrid complete-graph and partial-crossbar (HCGP), which has superior speed and cost compared to a partial crossbar. The new architecture uses both hard-wired and programmable connections between the FPGAs. We compare the performance and cost of the HCGP and partial crossbar architectures experimentally, by mapping a set of 15 large benchmark circuits into each architecture. A customized set of partitioning and interchip routing tools was developed, with particular attention paid to architecture-appropriate interchip routing algorithms. We show that the cost of the partial crossbar (as measured by the number of pins on all FPGAs and FPIDs required to fit a design) is on average 20% more than the new HCGP architecture and as much as 25% more. Furthermore, the critical path delay for designs implemented on the partial crossbar was on average 20% more than the HCGP architecture and up to 43% more. Using our experimental approach, we also explore a key architecture parameter associated with the HCGP architecture (the proportion of hard-wired connections versus programmable connections) to determine its best value.

11.
Design and Implementation of an MPLS Switching Router   Total citations: 2, self-citations: 0, citations by others: 2
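MPLS is currently the most promising technology for the Internet core network, and Crossbar-based high-speed switching technology is also developing rapidly, but combining the two to achieve optimal performance raises a series of issues. Drawing on an MPLS router design project, this paper takes into account current MPLS standards and the latest switching-fabric technology, studies the implementation of an MPLS switching router in depth, and discusses in detail the design and implementation of key modules such as the high-speed interfaces, queuing and scheduling, and the Crossbar.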

12.
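An optical crossbar interconnection network system is designed using a variable-grating-mode liquid-crystal optical sensor. The system can reconfigure the optical crossbar interconnections in real time, with the liquid-crystal optical sensor performing intensity-to-spatial-frequency conversion. The characteristics and optimization of this interconnection network system are discussed.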

13.
Passive crossbar arrays based on memristive devices hold great promise for high-density, non-volatile memories. A significant challenge for ultra-high-density integration of these crossbars is unwanted sneak-path currents. The most common way of addressing this issue today is an integrated or external selecting device that blocks unwanted current paths. In this paper, we use a memristive device with intrinsic rectifying behavior to suppress sneak-path currents in the crossbar. We systematically evaluate the read operation performance of large-scale crossbar arrays with regard to read margin and power consumption for different crossbar sizes, nanowire interconnect resistances, ON and OFF resistances, and rectification ratios under different read schemes. The outcomes of this study give an improved understanding of the trade-off between read margin, power consumption and read schemes. Most importantly, this study provides a guideline for circuit designers to improve the performance of cross-point arrays based on oxide-based resistive memory (RRAM). Overall, the self-rectifying behavior of the memristive device efficiently improves the read operation performance of large-scale selectorless cross-point arrays.

14.
This paper presents a finite state analytical model and supporting simulation for performance analysis of a partially blocking, packet-switched, multistage communication network whose crossbar switches are output queued, non-lossy, and have an internal bandwidth (BW) such that 1⩽BW⩽a, where a is the number of inputs to the crossbar. To the knowledge of the authors, this is the only analytical model in the current literature that addresses this problem without making at least one of the following simplifying assumptions: (1) infinite number of inputs, (2) infinite number of buffers, (3) BW=a, (4) use of only a single crossbar (as opposed to multiple stages). The analytical model presented herein gives a set of closed-form equations which lead to an iterative solution for normalized bandwidth and normalized delay. The model provides results which are quite accurate (as shown by simulation) over a large range of parameter values (e.g., crossbar size, number of buffers in each queue, etc.).

15.
The buffered crossbar switch is a promising switching architecture that plays a crucial role in providing quality of service (QoS) in computer networks. A sufficient amount of resources (bandwidth and buffer space) must be allocated in buffered crossbar switches for QoS provision. Resource allocation based on deterministic QoS objectives might be too conservative in practical network operations. To improve resource utilization in buffered crossbar switches, we study the problem of resource allocation for statistical QoS provision in this paper. First, we develop a model and techniques for analyzing the probabilistic delay performance of buffered crossbar switches, which is described by the delay upper bound with a prescribed violation probability. Then, we determine the required amounts of bandwidth and buffer space to achieve the probabilistic delay objectives for different traffic classes in buffered crossbar switches. In our analysis, we apply the effective arrival envelope to specify traffic load in a statistical manner and characterize switch service capacity by using the service curve technique. Instead of just focusing on one specific type of scheduler, the model and techniques developed in this paper are very flexible and can be used for analyzing buffered crossbar switches with a wide variety of scheduling algorithms. Copyright © 2007 John Wiley & Sons, Ltd.

16.
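To address the bottleneck of excessive cluster area in the development of FPGAs with an And-Inverter Cone (AIC) cluster architecture, this paper studies the design optimization of the cluster's input crossbar in depth. At the level of the evaluation and optimization flow, it proposes for the first time a packed-netlist statistics method that analyzes the occupancy of AIC cluster input and feedback resources, guiding the design and optimization of the input crossbar structure so that optimized parameters are obtained more efficiently. At the level of structural parameter design for the input crossbar module, it proposes for the first time designing the connectivity ratios of pin inputs and of output feedback separately and independently, and obtains the best combination of connectivity ratios through extensive experiments. At the level of circuit implementation, it exploits the structural characteristics of the AIC logic cone circuit and proposes for the first time a dual-phase input crossbar circuit implementation. Compared with the existing AIC cluster structure, the AIC cluster obtained with the proposed optimization method is 21.21% smaller in area, which markedly relieves the area constraint. When implementing the MCNC and VTR benchmark circuit sets, an AIC-architecture FPGA with the proposed input crossbar structure reduces the average area-delay product by 48.49% and 26.29%, respectively, compared with Altera's Stratix IV FPGA (LUT architecture), and by 28.48% and 28.37%, respectively, compared with a conventional AIC-architecture FPGA, significantly improving overall FPGA performance.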

17.
Design of a New Architecture for Radar Signal Processing Systems   Total citations: 2, self-citations: 2, citations by others: 0
He Bin, Wang Xiaonan 《现代雷达》 (Modern Radar) 2004,26(10):27-31
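To meet the high-bandwidth requirements of radar signal processing, modern radar signal processing systems widely adopt switch-based, point-to-point interconnect structures. Signal processing systems based on an interconnect switch structure offer good scalability, superior performance, and relatively low cost. Because low-voltage differential signaling is used, data rates between crossbar switches can reach the gigabit level, which makes this structure the direction for future advanced radar signal processing systems. Following the RapidIO interconnect protocol specification, this paper proposes a radar signal processing system architecture based on multiple digital signal processors and switched interconnect, analyzes the crossbar switch model and its performance, and finally discusses software and hardware implementation methods for the system.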

18.
In this work, a new technological strategy that allows implementing large crossbars formed with memFETs, a new device concept, is introduced. The memFET is an electrically reconfigurable field-effect and resistive switching device that can be used to implement logic functions and memory blocks in a crossbar structure, allowing dynamic logic configuration of the crossbar and simplifying both the design and the implementation of computing hardware. Moreover, taking advantage of the reconfiguration capability of this technology and architecture, we introduce a novel technique for designing evolvable hardware in which not only the logic functions are changeable (as in the Field Programmable Gate Array, FPGA) but also the physical position of the components on the surface of the integrated circuit. This technology and principle lead towards a new computing paradigm based on what we name Shape Shifting Digital Hardware (SSDH).

19.
We have proposed three nanoarchitectures with carbon nanotube-based nano-electromechanical systems (CNT-NEMS) switches with a floating gate. It is shown that logic based on them has the potential to replace CMOS using process technology of less than 45 nm. Furthermore, CNT-NEMS-based 3-D circuits realize extremely high bandwidth of over 10 petabytes/s with very low latency of less than several tens of ps. The most effective applications are the 3-D on-chip crossbar bus and future on-chip networks, which will largely determine the performance of future microchips. The performance of a 3-D on-chip crossbar based on CNT-NEMS is also compared with that of one based on CNT transistors.

20.
Input-buffered replicated networks are considered for broadband switching applications. They are characterized by many design parameters such as the replication factor, the traffic management policy, and input buffer location and length. To show the influence of these parameters on switching performance, an analytical model is defined based on a Markov chain representation of the input buffer. This model is suitable for application to input-buffered architectures having different routing network choices. The results, expressed in terms of throughput, packet delay, and packet loss probability, outline the performance improvements with respect to other well-known networks with input buffers, such as banyan and crossbar, reached through the flexibility offered by this architectural solution.
