Similar literature
1.
This paper proposes a bus architecture called skewed repeater bus (SRB) for reducing on-chip interconnect energy in microprocessors. By introducing a dynamic relative delay between neighboring bus lines, SRB reduces both average and worst-case coupling capacitance between those lines. SRB is compared to previously published techniques such as delayed data bus (DDB) and delayed clock bus (DCB). Simulation results in a 65-nm process show that a bus energy reduction of 18% is achieved when SRB is applied to a real microprocessor example, versus only 11% and 7% for DDB and DCB, respectively.

2.
This paper presents an analysis of how the power dissipation of on-chip buses is affected by introducing a relative delay between the switching lines. Relative delay is shown to reduce the dissipated power of oppositely switching lines while causing a power penalty for similarly switching lines. A new low-power bus scheme that uses this effect is proposed and analyzed. As the introduced delay increases, the achieved power reduction increases but the bus throughput decreases. Thus, a tradeoff between power reduction and throughput is required when selecting the imposed relative delay. The proposed low-power scheme, the dynamic delayed line bus (DDL) scheme, led to a power reduction of up to 25%, 33%, and 42% when applied to data, address, and differential buses, respectively. Simple DDL hardware is designed and implemented in a 0.18-μm TSMC CMOS technology and applied to a 4500-μm-long Metal4 bus. Circuit simulation results for different bus widths are presented.

3.
The data transfer speed of a microcomputer bus can be improved by adding an active circuit to the bus. This active circuit amplifies the bus voltage and feeds back to the bus a current which is proportional to the time rate of change of the bus voltage. This circuit effectively adds a negative capacitance to the bus. The practical capacitance canceling capability is limited by the propagation delay time of the operational amplifier in the active circuit. The theory of microcomputer bus structures with negative capacitance including effects of amplifier delay is presented. Typically, an operational amplifier with propagation delay less than one tenth of the bus time constant is required to achieve significant (factor of 2) bus speed improvement. High performance operational amplifiers were used to construct a working model of the negative capacitance bus terminator. The experimental results agree well with the theory.

4.
This paper shows the decreased effectiveness of on-chip decoupling capacitance in high-frequency operation. On-chip decoupling capacitance is often used to decrease the variation of the propagation delay caused by power/ground noise, i.e., dynamic IR-drop and/or delta-I noise. However, it is shown in this paper that decoupling capacitance is only effective for coping with dynamic IR-drop if the recharging time between switching events is sufficient. In other words, the effectiveness of decoupling capacitance for dynamic IR-drop in high-frequency operation is less than that of a fully-charged decoupling capacitor. The recharging time and the effectiveness of a decoupling capacitor depend on the propagation delay of the average circuit path which is used to determine the total switching current of a given macro/chip and clock cycle time. If the propagation delay of the critical paths is approximately equal to that of the average circuit path, then it is shown in this paper that adding decoupling capacitance never improves the maximum frequency of the system due to dynamic IR-drop limitations. On the other hand, if the propagation delay of the critical paths is larger than that of the average circuit path, then the maximum frequency is improved by adding decoupling capacitance. In both cases, a new metric, called the apparent capacitance, can be used to help make correct decisions about decoupling capacitance planning.

5.
As technology scales, the shrinking wire width increases the interconnect resistivity, while the decreasing interconnect spacing significantly increases the coupling capacitance. This paper proposes reducing the number of bus lines of the conventional parallel-line bus (PLB) architecture by multiplexing every m bits onto a single line. This bus architecture, the serial-link bus (SLB), transforms an n-bit conventional PLB into an n/m-line (serial link) bus. The advantage of SLBs is that they have fewer lines, and if the bus width is kept the same, SLBs will have a larger line pitch. Increasing the line width has a twofold reduction effect on the line resistance, because the resistivity of sub-100-nm wires also drops significantly as the line width increases. Also, increasing the line width and spacing reduces the coupling capacitance between adjacent lines, but increases the line-to-ground capacitance. Thus, an optimum degree of multiplexing $m_{opt}$ and an optimum width-to-pitch ratio $\eta_{opt}$ exist, which minimize the bus energy dissipation and maximize the bus throughput per unit area. The optimum degree of multiplexing and optimum width-to-pitch ratio for maximum throughput per unit area and minimum energy dissipation for the 25-130-nm technologies were determined in this paper. Also, an encoding technique was proposed and implemented to reduce the switching activity penalty due to serialization. HSPICE simulations show that for the same throughput per unit area as conventional parallel-line data buses, the SLB architecture reduces the energy dissipation by up to 31% for a 64-bit bus implemented in an intermediate metal layer of a 50-nm technology, and a reduction of 53% is projected for a 25-nm technology.
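The line-count and pitch arithmetic behind the serial-link bus can be sketched in a few lines of Python. This is only an illustration of the n-bit to n/m-line mapping stated in the abstract; the function names, the fixed total bus width, and the bit-grouping order are assumptions and are not taken from the paper.

```python
def serial_link_geometry(n_bits, m, bus_width_um):
    """Return (line count, PLB pitch, SLB pitch) for a fixed total bus width.

    Hypothetical helper: assumes the bus area is held constant, so the
    pitch available to each line grows by the multiplexing degree m.
    """
    if m <= 0 or n_bits % m != 0:
        raise ValueError("m must be a positive divisor of n_bits")
    lines = n_bits // m                 # n-bit PLB becomes an n/m-line SLB
    plb_pitch = bus_width_um / n_bits   # pitch per line in the parallel bus
    slb_pitch = bus_width_um / lines    # pitch per line in the serial-link bus
    return lines, plb_pitch, slb_pitch


def serialize(word_bits, m):
    """Split one n-bit word into n/m streams, each sent as m bits on one line."""
    if m <= 0 or len(word_bits) % m != 0:
        raise ValueError("m must be a positive divisor of the word length")
    return [word_bits[i:i + m] for i in range(0, len(word_bits), m)]
```

For example, a 64-bit bus with m = 4 keeps 16 lines, each with four times the original pitch. Each line then has to carry m bits per word, which is the serialization penalty the abstract's encoding technique targets.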

6.
IEEE Spectrum, 2003, 40(2): 36-40
The crux of the problem is the tiny metal wires that weave the transistors on today's chips into integrated circuits. In the most advanced ICs, transistors switch up to 10 billion times a second, and their metal interconnects can barely keep up. While interconnect delay times are stretching out, transistor switching is getting faster, sending more signals down slow lines. However, the industry thinks it is zeroing in on a solution: change the propagation characteristics of those tiny on-chip transmission lines. The line's capacitance is being lowered by changing the material that insulates it from the surrounding silicon chip as well as from neighboring wire. The capacitance depends on an insulator's dielectric constant, and so researchers are developing thin films that have a lower dielectric constant, or lower k, than the silicon dioxide insulating layer used most commonly up to now.

7.
This paper proposes a bus architecture which improves the performance and/or power dissipation of on-chip buses. The proposed architecture reduces the delay on alternate lines by lowering the threshold voltage of their devices. Furthermore, the shifting of the signal switching on adjacent lines reduces the worst-case coupling capacitance. Two implementations of this bus architecture are proposed, the alternate-$V_t$ and the alternate forward body biased schemes, and are compared to a conventional bus scheme. For a flop distance of 1800 μm, the proposed schemes use the gained delay slack to reduce the total device width, thus reducing the energy dissipation by 31.2%. For a 500-ps cycle time, the proposed bus schemes increase the maximum distance between flip-flops by 33%.

8.
Current VLSI design techniques focus on four major goals: higher integration, faster speed, lower power, and shorter time-to-market. These goals have been accomplished mainly by deep submicron (DSM) technology along with voltage scaling. However, scaling down of feature size causes larger interwire capacitance which results in large crosstalk between interconnects. In this paper, we propose a novel predictable circuit architecture, named "optimized overlaying array-based architecture" (O²ABA), especially suited for the deep submicron regime. O²ABA achieves reduction in crosstalk by considering the current directions and by reducing interwire capacitance. The introduction of "unit cell" leads to regularity, which makes the performance predictable even before layout, and shortens design time. O²ABA is compared with other design styles, such as custom design and standard cell approach, in terms of coupling capacitance, area, and delay.

9.
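In deep-submicron design, reducing energy consumption and propagation delay are the two main design goals for on-chip global buses. This paper proposes a spatio-temporal coding scheme for on-chip global buses that both improves performance and reduces peak and average energy consumption. The scheme exploits the advantages of spatial bus-invert coding and temporal coding circuit techniques: it eliminates opposite-direction transitions on adjacent wires while reducing the number of self-transitions and coupling transitions. For designs that apply this bus coding technique to reduce bus delay and energy, a method for inserting repeater drivers on the bus is given that determines their appropriate sizes and insertion positions, so that total bus energy is minimized while the target delay and slew-rate requirements are met. The method can be used to obtain optimized energy-delay tradeoffs under slew-rate constraints for a variety of coding techniques.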
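The spatial bus-invert component mentioned in this abstract can be sketched as follows. This is the classic bus-invert rule (send the complement plus an invert flag when more than half the lines would toggle), shown as a minimal sketch; it is not the paper's combined spatio-temporal scheme, and the function names and the all-zero initial bus state are assumptions.

```python
def bus_invert_encode(words, width):
    """Encode a word sequence with bus-invert coding.

    Returns a list of (bus_value, invert_flag) pairs. Assumes the data
    lines start at all zeros and ignores toggles of the invert line itself.
    """
    mask = (1 << width) - 1
    prev = 0
    encoded = []
    for word in words:
        word &= mask
        toggles = bin(word ^ prev).count("1")   # Hamming distance to bus state
        if toggles > width // 2:
            bus, invert = (~word) & mask, 1     # fewer toggles if complemented
        else:
            bus, invert = word, 0
        encoded.append((bus, invert))
        prev = bus
    return encoded


def bus_invert_decode(encoded, width):
    """Recover the original words from (bus_value, invert_flag) pairs."""
    mask = (1 << width) - 1
    return [((~bus) & mask) if invert else bus for bus, invert in encoded]
```

With this rule, no transfer toggles more than half of the data lines, which bounds the self-transition count per cycle; eliminating opposite-direction transitions on adjacent wires requires the additional temporal coding described in the abstract.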

11.
Signal propagation delay on a multi-source multi-sink bidirectional bus has a dominant effect on high-performance chips. This work presents a novel greedy algorithm that minimizes the critical propagation delay of an RLC-based bus. Based on the topology of a multi-source multi-sink bus and the RLC delay model, the proposed algorithm inserts signal repeaters into the critical path of the RLC-based bus and adjusts their sizes to minimize the maximal propagation delay. This procedure is repeated until no additional improvement is needed. Several buses with various topologies are tested using the proposed algorithm in deep submicron technologies. Experimentally, the critical delay in an RLC-based bus can be reduced dramatically by up to 62.4% with inserted repeater sizes of 24 and execution time of 1.65 s on average. Moreover, average delay reduction, repeater sizes, and running time for 0.18 μm technology are 5.8%, 6.4%, and 26.2%, respectively, better than those of 0.35 μm. Additionally, the topologies of all of the RLC-based buses with inserted repeaters in deep submicron technologies are simulated using HSPICE. The error ratio in the critical delay of a bus with inserted repeaters determined by comparison with HSPICE is 2.7% on average. The proposed algorithm is simple and extremely practical.

12.
Capacitive crosstalk between adjacent signal wires has a significant effect on the performance and delay uncertainty of point-to-point on-chip buses in deep submicrometer (DSM) VLSI technologies. We propose a hybrid polarity repeater insertion technique that combines inverting and non-inverting repeater insertion to achieve constant average effective coupling capacitance per wire transition for all possible switching patterns. Theoretical analysis shows the superiority of the proposed method in terms of performance and delay uncertainty compared to conventional and staggered repeater insertion methods. Simulations at the 90-nm node on the semi-global METAL5 layer show around 25% reduction in worst-case delay and around 86% reduction in delay uncertainty compared to a standard bus with optimal repeater configuration. The reduction in worst-case capacitive coupling reduces peak energy, which is a critical factor for thermal regulation and packaging. Isodelay comparisons with a standard bus show that the proposed technique achieves a considerable reduction in total buffer area, which in turn reduces average energy and peak current. Comparisons with the staggered repeater, which is one of the simplest and most effective crosstalk reduction techniques in the literature, show that the hybrid polarity repeater offers higher performance, less delay uncertainty, and reduced sensitivity to repeater placement variation.

13.
Increased buffer insertion along on-chip global lines and growing amounts of leakage power have resulted in buffer-based leakage emerging as one of the chief contributors to system leakage power. In this paper, a bus system prototype is implemented in an industrial 65-nm SOI technology and measured results show up to a 45% reduction in total bus system power and an average reduction of 2.4× in standby mode leakage power.

14.
Efficient RC low-power bus encoding methods for crosstalk reduction (cited: 1 in total, 0 self-citations, 1 by others)
In on-chip buses, the RC crosstalk effect leads to serious problems, such as wire propagation delay and dynamic power dissipation. This paper presents two efficient bus-coding methods. The proposed methods simultaneously reduce more dynamic power dissipation and wire propagation delay than existing bus encoding methods. Our methods also reduce more total power consumption than other encoding methods. Simulation results show that the proposed method I reduces coupling activity by 26.7-38.2% and switching activity by 3.7-7% on 8-bit to 32-bit data buses. The proposed method II reduces coupling activity by 27.5-39.1% and switching activity by 5.3-9% on 8-bit to 32-bit data buses. Both proposed methods reduce dynamic power by 23.9-35.3% on 8-bit to 32-bit data buses and total propagation delay by up to 30.7-44.6% on 32-bit data buses, and eliminate the Type-4 coupling. Our methods also reduce total power consumption by 23.6-33.9%, 23.9-34.3%, and 24.1-34.6% on 8-bit to 32-bit data buses with the 0.18, 0.13, and 0.09 μm technologies, respectively.
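To make the two metrics in this abstract concrete, the sketch below counts them for a sequence of bus words. The definitions are assumptions chosen for illustration: switching activity is taken as the number of individual line toggles, and coupling is counted only as adjacent line pairs that toggle in opposite directions in the same cycle. The paper's own weighting and its Type-1 to Type-4 classification are not given in the abstract and are not reproduced here.

```python
def bus_activity(words, width):
    """Count (line toggles, opposite-direction adjacent pairs) over a trace.

    Hypothetical metric definitions for illustration only; bit i of each
    word is the value on line i, and lines i and i+1 are neighbors.
    """
    switching = 0
    opposite_pairs = 0
    for prev, cur in zip(words, words[1:]):
        # Per-line transition: +1 rising, -1 falling, 0 unchanged.
        delta = [((cur >> i) & 1) - ((prev >> i) & 1) for i in range(width)]
        switching += sum(1 for d in delta if d != 0)
        # Neighbors moving in opposite directions see the largest coupling.
        opposite_pairs += sum(1 for a, b in zip(delta, delta[1:]) if a * b == -1)
    return switching, opposite_pairs
```

Running the same trace through this counter before and after an encoder gives a rough, technology-independent view of what a bus code changes; absolute power and delay figures such as those quoted above still require circuit simulation.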

15.
The effect of random signal lines on the on-chip inductance is quantitatively investigated, using an S-parameter-based methodology and a full wave solver, leading to an empirical model for high-frequency inductance. The results clearly indicate that the random signal lines as well as designated ground lines provide return paths for gigahertz-frequency signals. In particular, quasi TEM-wave-like propagation mode is observed above 10 GHz, revealing a unique relationship between capacitance and inductance of the signal line. Incorporating the random capacitive coupling effect, our frequency-dependent RLC model is confirmed to be valid up to 100 GHz.

16.
The device uses a standard NMOS one-transistor cell and is fabricated with a double polysilicon HMOS technology using polysilicon word lines and folded metal bit lines. Self-refresh is implemented with an on-chip timer, arbiter, and refresh counter. A high-speed arbiter resolves conflicts between refresh cycles and memory accesses. A 'ready' output is provided to the processor. A multiplexed bus is provided in the array to carry column addresses and also I/O data paths. Another multiplexed bus is used for data lines between the input buffers, write buffers, secondary sense amplifiers, and output buffers. Redundant rows and columns are used for increased manufacturing yield. Polysilicon fuses are electrically programmed to select redundant elements.

17.
In this paper, we propose a compact on-chip interconnect model for full-chip simulation. The model consists of two components, a quasi-three-dimensional (3-D) capacitance model and an effective loop inductance model. In the capacitance model, we propose a novel concept of effective width ($W_{eff}$) for a 3-D wire, which is derived from an analytical two-dimensional (2-D) model combined with a new analytical "wall-to-wall" model. The effective width provides a physics-based approach to decompose any 3-D structure into a series of 2-D segments, resulting in an efficient and accurate capacitance extraction. In the inductance model, we use an effective loop inductance approach for an analytic and hierarchical model construction. In particular, we show empirically that high-frequency signals (above multi-GHz) propagating through random signal lines can be approximated by a quasi-TEM mode relationship, leading to a simple way to extract the high-frequency inductance from the capacitance of the wire. Finally, the capacitance and inductance models are combined into a unified frequency-dependent RLC model, describing successfully the wide-band characteristics of on-chip interconnects up to 100 GHz. Non-orthogonal wire architecture is also investigated and included in the proposed model.

18.
In this paper, an analytical model for the current draw of an on-chip bus is presented. The model is combined with an on-chip power supply grid model in order to analyze noise caused by switching buses in a power supply grid. The bus is modeled as distributed resistance–inductance–capacitance (RLC) lines that are capacitively and inductively coupled to each other. Different switching patterns and driver skewing times are also included in the model. The power supply grid is modeled as a network of RLC segments. The model is verified by comparing it to HSPICE. The error was below 8%. The model is applied to determine the influence of driver skewing times on maximum power supply noise.

19.
A 4K×8 MOS dynamic RAM using a single transistor cell with on-chip self-refresh is described. The device uses a multiplexed address/data bus. Control of the reconfigurable data bus allows the RAM to operate on either an 8-bit or a 16-bit data bus. The memory cell is fabricated using a double polysilicon n-channel HMOS technology using polysilicon word lines and metal bit lines. Self-refresh is implemented with an on-chip timer, arbiter, counter and multiplexer. A high-speed arbiter resolves simultaneous memory and refresh requests. Redundant rows are used for increased manufacturing yields. Polysilicon fuses are electrically programmed to select redundant rows.

20.
In a parallel multiwire structure, the exact spacing and size of the wires determine both the resistance and the distribution of the capacitance between the ground plane and the adjacent signal carrying conductors, and have a direct effect on the delay. Using closed-form equations that map the geometry to the wire parasitics and empirical switch factor based delay models that show how repeaters can be optimized to compensate for dynamic effects, we devise a method of analysis for optimizing throughput over a given metal area. This analysis is used to show that there is a clear optimum configuration for the wires which maximizes the total bandwidth. Additionally, closed form equations are derived, the roots of which give close to optimal solutions. It is shown that for wide buses, the optimal wire width and spacing are independent of the total width of the bus, allowing easy optimization of on-chip buses. Our analysis and results are valid for lossy interconnects as are typical of wires in submicron technologies.
