期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Statistical skew modeling for general clock distribution networksin presence of process variations

《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2001,9(5):704-717

Clock skew modeling is important in the performance evaluation and prediction of clock distribution networks. This paper addresses the problem of statistical skew modeling for general clock distribution networks in the presence of process variations. The only available statistical skew model is not suitable for modeling the clock skews of general clock distribution networks in which clock paths are not identical. The old model is also too conservative for estimating the clock skew of a well-balanced clock network that has identical but strongly correlated clock paths (for instance, a well-balanced H-tree). In order to provide a more accurate and more general statistical skew model for general clock distributions, we propose a new approach to estimating the mean values and variances of both clock skews and the maximal clock delay of general clock distribution networks. Based on the new approach, a closed-form model is also obtained for well-balanced H-tree clock distribution networks. The paths delay correlation caused by the overlapped parts of path lengths is considered in the new approach, so the mean values and the variances of both clock skews and the maximal clock delay are accurately estimated for general clock distribution networks. This enables an accurate estimate of yields of both clock skew and maximal clock delay to be made for a general clock distribution network 相似文献

2.

Simulation of interconnect inductive impact in the presence of process variations in 90 nm and beyond

Xiaoning Qi Gyure A. Yansheng Luo Lo S.C. Shahram M. Singhal K. 《Electron Device Letters, IEEE》2006,27(8):696-698

The on-chip inductive impact on signal integrity has been a problem for designs in deep-submicrometer technologies. The inductive impact increases the clock skew, max timing, and noise of bus signals. In this letter, circuit simulations using silicon-validated macromodels show that there is a significant inductive impact on the signal max timing (/spl sim/ 10% pushout versus RC delay) and noise (/spl sim/2/spl times/RC noise). In nanometer technologies, process variations have become a concern. Results show that device and interconnect process variations add /spl sim/ 3% to the RLC max-timing impact. However, their impact on the RLC signal noise is not appreciable. Finally, inductive impact in 65- and 45-nm technologies is investigated, which indicates that the inductance impact will not diminish as technology scales. 相似文献

3.

Over GHz low-power RF clock distribution for a multiprocessordigital system

Woonghwan Ryu Wai A.L.C. Fan Wei Wai Lai Lai Joungho Kim 《Advanced Packaging, IEEE Transactions on》2002,25(1):18-27

Conventional interconnections for digital clock distribution pose a severe power consumption problem for GHz clock distribution due to transmission line losses. Therefore, we have proposed an RF clock distribution (RCD) scheme for high-speed digital applications, in particular a multiprocessor system using global clocking. This paper first reports system power and signal integrity analysis results including skew, jitter, impedance mismatch, and noise for RF clock distribution,especially in the GHz range. Based on this analysis, a novel signal integrity design methodology for RF clock distribution systems is proposed. The clock skew created by process parameter variations are modeled and predicted. The system comprises a RF clock transmitter as a clock generator, an H-tree with junction couplers as a clock distributing network and a RF receiver as a digital clock-recovery module. Flip-chip interconnections for the chip-to-substrate assembly and 0.35 μm TSMC CMOS technology for the RF clock receiver are assumed. EMI analysis for 2 GHz 16-node-board-level RF clock distribution networks is conducted using 3D full-wave EM simulation. Finally, the RCD as a low power and high performance clocking method is demonstrated using HP's Advanced Design System (ADS) simulation, considering microwave frequency interconnection models and process parameter variations. In addition, test vehicles for both 2 GHz 16-node and 5 GHz 64-node board-level RF clock distribution networks were implemented and measured using thin, low-loss, and low permittivity Rogers^Lt; RO3003 high-frequency organic substrate 相似文献

4.

Design methodology for synthesizing clock distribution networksexploiting nonzero localized clock skew 总被引：1，自引：0，他引：1

Neves J.L. Friedman E.G. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》1996,4(2):286-291

An integrated top-down design methodology is presented in this brief for synthesizing high performance clock distribution networks based on application dependent localized clock skew. The methodology is divided into four phases: (1) determining an optimal clock skew schedule composed of a set of nonzero clock skew values and the related minimum clock path delays; (2) designing the topology of the clock distribution network with delays assigned to each branch based on the circuit hierarchy, the aforementioned clock skew schedule, and minimizing process and environmental delay variations; (3) designing circuit structures to emulate the delay values assigned to the individual branches of the clock tree; and (4) designing the physical layout of the clock distribution network. The clock distribution network synthesis methodology is based on CMOS technology. The clock lines are transformed from distributed resistive capacitive interconnect lines into purely capacitive interconnect lines by partitioning the RC interconnect lines with inverting repeaters. Variations in process parameters are considered during the circuit design of the clock distribution network to guarantee a race-free circuit. Nominal errors of less than 2.5% for the delay of the clock paths and 7% for the clock skew between any two registers belonging to the same global data path as compared with SPICE Level-3 are demonstrated 相似文献

5.

Dynamically De-Skewable Clock Distribution Methodology

《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2008,16(9):1220-1229

In a typical clock distribution scheme, a central clock signal is distributed to several sites on the integrated circuit (IC). Local regenerators at these sites buffer the clock signal for the logic in regions close to the regenerator. Minimizing the skew between the clocks at these regeneration sites is critical. In recent times, this is becoming harder due to increasing intra-die processing variations. In this paper, we describe a novel technique to distribute a clock signal from a central location to several sites on a VLSI IC. Our technique uses a buffered H-tree and includes circuitry to dynamically remove any skew that may result due to intra-die processing variations. While existing approaches to deskewing a clock tree have utilized several phase detection circuits (number of phase detectors dependent on the number of clock regenerators), our method requires only one phase detector. Also, in our approach, the resolution of the phase detector is inconsequential unlike existing techniques. Our deskewing technique can be applied dynamically, either at boot time or periodically during the operation of the IC. Using a six-level H-tree clock distribution network with process variations deliberately included, we demonstrate that our technique can reduce skews as high as 300 ps down to just 3 ps. We compare our clock tree with traditional buffered and unbuffered H-tree networks. 相似文献

6.

Comparisons of conventional, 3-D, optical, and RF interconnects for on-chip clock distribution

Kuan-Neng Chen Kobrinsky M.J. Barnett B.C. Reif R. 《Electron Devices, IEEE Transactions on》2004,51(2):233-239

This paper analyzes the performance of different interconnect technologies for on-chip clock distribution, including conventional, three-dimensional, optical, and radio frequency interconnects. Skew, power, and area usage were estimated for each of these technologies based on the 2001 International Technology Roadmap for Semiconductors. Our results indicate that most of the skew and power are associated with local clock distribution. Consequently, since the alternative clock distribution approaches that have been proposed focus on global clock distribution, we have not found significant advantages over conventional clock distribution in terms of skew and power. Furthermore, it was found that low skews could be attained with conventional clock distribution schemes if the clock signals are not scaled down. 相似文献

7.

Effect of parameter variations at chip and wafer level on clockskews

Sauter S. Schmitt-Landsiedel D. Thewes R. Weber W. 《Semiconductor Manufacturing, IEEE Transactions on》2000,13(4):395-400

In this paper, a methodology is proposed to determine clock skews and the performance of clock architectures considering parameter variations in an early stage of technology development. With this methodology, it is possible to separate process-induced clock skew from other effects like imperfect loading. Parameter variations are seen as one of the most important effects influencing chip performance in future. By comparing a 0.45- and a 0.25-μm technology, it is shown that in the future, process variations will increase clock skew. The clock skews are determined by measuring the relevant device and metal line parameters as a function of position over chip and wafer. In the past, parameters like IDS, Vth, and resistances could be measured very precisely, although it was difficult to measure low capacitances of single metal lines in the range of femto farad. Thus a new measurement method is used to determine interconnect capacitances extremely precisely. Based on these measurement data, a netlist of a defined clock tree is created by a C-program, and the clock signal delay is simulated. From the delay simulation, we calculate the clock skew for each chip dependent on the parameter variations. Experimental results are separated into a basic random fluctuation part and processing-related contributions on the chip and wafer levels. In addition, the effect of temperature gradients on each chip to the clock skew is simulated. The methodology presented is not restricted to just one clocktree but allows investigation of all kinds of clock distribution circuits. The method has clear advantages with respect to chip area against clocktree realizations on a testchip. No direct and costly measurement of signal delays by voltage contrast methods is required, since all parameters are determined by measurement on the device level 相似文献

8.

低偏斜、高变化容忍的层次化无缓冲谐振时钟分布网络

徐毅陈书明刘祥远《半导体学报》2011,32(9):095011-7

无缓冲谐振时钟分布网络能够最小化同步系统的时钟功耗。但由于没有缓冲器,时钟网络的偏斜受到多方面因素的影响,例如时钟互连线寄生参数的差异,非平衡时钟负载以及工艺、电压温度变化。本文提出了一种层次化的两相无缓冲谐振时钟互连网络结构,将网格型和树型结构的各自优点相结合。在TSMC 65nm标准CMOS工艺下,通过一个流水线乘法器电路分析了该结构时钟网络的偏斜及变化容忍特性。版图后仿真结果表明,层次化时钟网络的偏斜分别比纯网格和纯H树结构时钟网络降低超过75%和65%,而且在非平衡时钟负载或工艺、电压温度变化的情况下,时钟网络偏斜最高小于7ps,不超过整个时钟周期（约760ps）的1%。相似文献

9.

Device-Aware Yield-Centric Dual-V_t Design Under Parameter Variations in Nanoscale Technologies

Agarwal A. Kunhyuk Kang Bhunia S. Gallagher J.D. Roy K. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2007,15(6):660-671

Dual-V_t design technique has proven to be extremely effective in reducing subthreshold leakage in both active and standby mode of operation of a circuit in submicrometer technologies. However, aggressive scaling of technology results in different leakage components (subthreshold, gate and junction tunneling) to become significant portion of total power dissipation in CMOS circuits. High-V_t devices are expected to have high junction tunneling current (due to stronger halo doping) compared to low-V_t devices, which in the worst case can increase the total leakage in dual-V_t design. Moreover, process parameter variations (and in turn V_t variations) are expected to be significantly high in sub-50-nm technology regime, which can severely affect the yield. In this paper, we propose a device aware simultaneous sizing and dual-V_t design methodology that considers each component of leakage and the impact of process variation (on both delay and leakage power) to minimize the total leakage while ensuring a target yield. Our results show that conventional dual-V_t design can overestimate leakage savings by 36% while incurring 17% average yield loss in 50-nm predictive technology. The proposed scheme results in 10%-20% extra leakage power savings compared to conventional dual-V_t design, while ensuring target yield. This paper also shows that nonscalability of the present way of realizing high-V_t devices results in negligible power savings beyond 25-nm technology. Hence, different dual-V_t process options, such as metal gate work function engineering, are required to realize high-performance and low-leakage dual-V_t designs in future technologies. 相似文献

10.

Clock generation and distribution for the first IA-64microprocessor 总被引：1，自引：0，他引：1

Tam S. Rusu S. Nagarji Desai U. Kim R. Ji Zhang Young I. 《Solid-State Circuits, IEEE Journal of》2000,35(11):1545-1552

The clock design for the first implementation of the IA-64 microprocessor is presented. A clock distribution with an active distributed deskewing technique is used to achieve a low skew of 28 ps. This technique is capable of compensating skews caused by within-die process variations that are becoming a significant factor of the clock design. The global, regional and local clock distributions are described. A multilevel skew budget and local clock timing methodology are used to enable a high-performance design by providing support for intentional clock skew injection and time borrowing. By providing a test access port interface to the deskew architecture and the incorporation of the on-die-clock-shrink, this design is equipped with two very powerful post-silicon timing debug tools that are critical to high-performance microprocessor design and enabled quick time-to-market 相似文献

11.

Timing–driven variation–aware synthesis of hybrid mesh/tree clock distribution networks

Ameer Abdelhadi Ran Ginosar Avinoam Kolodny Eby G. Friedman 《Integration, the VLSI Journal》2013

Clock skew variations adversely affect timing margins, limiting performance, reducing yield, and may also lead to functional faults. Non-tree clock distribution networks, such as meshes and crosslinks, are employed to reduce skew and also to mitigate skew variations. These networks, however, increase the dissipated power while consuming significant metal resources. Several methods have been proposed to trade off power and wires to reduce skew. In this paper, an efficient algorithm is presented to reduce clock skew variations while minimizing power dissipation and metal area overhead. With a combination of nonuniform meshes and unbuffered trees (UBT), a variation-tolerant hybrid clock distribution network is produced. Clock skew variations are selectively reduced based on circuit timing information generated by static timing analysis (STA). The skew variation reduction procedure is prioritized for critical timing paths, since these paths are more sensitive to skew variations. A framework for skew variation management is proposed. The algorithm has been implemented in a standard 65 nm cell library using standard EDA tools, and tested on several benchmark circuits. As compared to other nonuniform mesh construction methods that do not support managed skew tolerance, experimental results exhibit a 41% average reduction in metal area and a 43% average reduction in power dissipation. As compared to other methods that employ skew tolerance management techniques but do not use a hybrid clock topology, an 8% average reduction in metal area and a 9% average reduction in power dissipation are achieved. 相似文献

12.

A Three-Dimensional Stacked-Chip Star-Wiring Interconnection for a Digital Noise-Free and Low-Jitter I/O Clock Distribution Network

Ryu C. Chung D. Lee C. Kim J. Bae K. Yu J. Lee S. 《Microwave and Wireless Components Letters, IEEE》2006,16(12):651-653

Cascaded repeaters are indispensable circuit elements in conventional on-chip clock distribution networks due to heavy loss characteristics of on-chip global interconnections. However, cascaded repeaters cause significant jitter and skew problems in clock distribution networks when they are affected by power supply switching noise generated by digital logic blocks located on the same die. In this letter, we present a new three-dimensional (3-D) stacked-chip star-wiring interconnection scheme to make a clock distribution network free from both on-chip and package-level power supply noise coupling. The proposed clock distribution scheme provides an extremely low-jitter and low-skew clock signal by replacing the cascaded repeaters with lossless star-wiring interconnections on a 3-D stacked-chip package. We have demonstrated a 500-MHz input/output (I/O) clock delivery with 34-ps peak-to-peak jitter and a skew of 11ps, while a conventional I/O clock scheme exhibited a 146-ps peak-to-peak jitter and a 177-ps skew in the same power supply noise environment 相似文献

13.

All digital skew tolerant synchronous interfacing methods for high-performance point-to-point communications in deep sub-micron SoCs

Syed Rafay HasanAuthor Vitae Normand BélangerAuthor VitaeYvon SavariaAuthor Vitae M. Omair AhmadAuthor Vitae 《Integration, the VLSI Journal》2011,44(1):22-38

High-performance clocking of intellectual property (IP) modules, within a skew budget, is becoming difficult in deep sub-micron technologies. In this work, we propose a novel and all-digital synchronous design method for point-to-point communications, using two stages of interfacing registers and locally delayed clock with phase adjustments. This design is free from synchronizers and clock-data mismatch problems. Moreover, communicating modules run at frequencies which are virtually independent of the clock skew. We also provide a comprehensive case-wise mathematical analysis to facilitate design automation for synthesizing such designs as standard cells. An overall improvement in skew tolerance of up to n times (where n is the number of registers used), when compared to conventional designs, is achieved when the skew orientation is known and n/2 times if the skew orientation is unknown. Improvement in skew tolerance is validated using gate level simulations with the 0.18 μm TSMC CMOS technology. A prototype implementation of the proposed design using a Virtex-II Pro FPGA from Xilinx validates the claim that such designs allow a fast module to communicate with a slow module without constraining their frequencies. 相似文献

14.

Implementation of a thermal management unit for canceling temperature-dependent clock skew variations

A. Chakraborty Author VitaeAuthor Vitae A. Sathanur Author VitaeAuthor Vitae A. Macii Author Vitae Author Vitae M. Poncino Author Vitae 《Integration, the VLSI Journal》2008,41(1):2-8

Thermal gradients across the die are becoming increasingly prominent as we scale further down into the sub-nanometer regime. While temperature was never a primary concern, its non-negligible impact on delay and reliability is getting significant attention lately.One of the principal factors affecting designs today is timing criticality, which, in today's technologies is mostly determined by wire delays. Clocks, which are the backbone of the interconnect network, are extremely prone to temperature dependent delay variations and need to be designed with extreme care so as to meet accurate timing constraints. Their skew has to be minimized in order to guarantee functionality, albeit in the presence of these process variations.Temperature, on the other hand, is dynamic in nature and its effects hence need run-time monitoring and management. One of the most efficient ways to manage temperature dependent skew is through the use of buffers with dynamically tunable delays. The use of such buffers in the clock distribution network allows modulating the delay on selected branches of the clock network based on a thermal profile, so as to keep the skew within acceptable bounds.A runtime scheme obviously requires an on-line management unit. Our work predominantly focuses on the implementation of one such unit, while studying its impact on design parameters such as area, wire-length and power. Results show negligible a impact (0.67% in area, 0.62% in wire-length, 0.33% in power, and 0.37% in via-number) on the design. 相似文献

15.

A High-Frequency Clock Distribution Network Using Inductively Loaded Standing-Wave Oscillators

Sasaki M. 《Solid-State Circuits, IEEE Journal of》2009,44(10):2800-2807

The present paper introduces a resonant clock generation and distribution scheme that uses uniform amplitude and uniform phase standing wave oscillators in order to distribute a high-frequency clock signal with low skew, low jitter, and low power. A suitable distributed resonator for a global clock distribution that is inductively loaded transmission line generating a uniform amplitude and uniform phase standing wave is realized through detailed analysis of a standing wave on a loaded transmission line. A test chip is fabricated using 0.18-mum 6 M CMOS technology, and a cascaded distribution network is implemented for a global clock distribution with a space-filling curve. Furthermore, distributed local LC tanks are implemented as local resonant clock networks, which are composed of parasitic capacitors and small spiral inductors. The distributed local LC tanks are driven by a fine clock distributed with cascaded standing-wave oscillators and reduce the primary power in the clock distribution, which is dissipated as dynamic power in the parasitic capacitance of latches and/or flip flops. The measurement results reveal that, at 9.4 GHz, the peak-to-peak jitter is 5.2 ps and the clock skew is 0.8 ps, and the global and local distributions dissipated only 17% and 23% of CV² f power, respectively. 相似文献

16.

Technology limitations for N⁺/P⁺ polycidegate CMOS due to lateral dopant diffusion in silicide/polysilicon layers

Chu C.L. Chin G. Saraswat K.C. Wong S.S. Dutton R. 《Electron Device Letters, IEEE》1991,12(12):696-698

The device degradation of dual-polycide-gate N⁺/P⁺ CMOS polycide transistors due to the lateral diffusion of dopants in the silicides is studied using a coupled 2-D process and device simulator. Design rule spacings between the NMOS and the PMOS transistor are given for various NMOS:PMOS gate area ratios and thermal processing conditions. The simulations show that contrary to previous findings, micrometer and submicrometer spacings are possible for certain silicide technologies using low-temperature or short higher-temperature furnace steps. Simulations show that CoSi₂ and TiSi₂ appear to be better candidates for submicrometer dual-gate applications than WSi₂ 相似文献

17.

A clock interconnect extractor for multigigahertz frequencies incorporating inductance effect

Azadpour M.A. Kalkur T.S. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2003,11(6):1143-1146

Due to decreasing device sizes and increasing clock speed, interconnect inductance is becoming an important factor in the on-chip delay analysis of deep submicrometer technologies. This delay has been represented as an RC model in the available electric design automation tools. In this paper, we model the on-chip interconnect as a RLC for systems running at multigigahertz frequencies. A static-extraction analysis method optimized for ASICs is detailed. It considers all the lines within the vicinity of the target signal line as return paths. 相似文献

18.

Clock distribution networks in synchronous digital integratedcircuits 总被引：1，自引：0，他引：1

Friedman E.G. 《Proceedings of the IEEE. Institute of Electrical and Electronics Engineers》2001,89(5):665-692

Clock distribution networks synchronize the flow of data signals among synchronous data paths. The design of these networks can dramatically affect system-wide performance and reliability. A theoretical background of clock skew is provided in order to better understand how clock distribution networks interact with data paths. Minimum and maximum timing constraints are developed from the relative timing between the localized clock skew and the data paths. These constraint relationships are reviewed, and compensating design techniques are discussed. The field of clock distribution network design and analysis can be grouped into a number of subtopics: 1) circuit and layout techniques for structured custom digital integrated circuits; 2) the automated layout and synthesis of clock distribution networks with application to automated placement and routing of gate arrays, standard cells and larger block-oriented circuits; 3) the analysis and modeling of the timing characteristics of clock distribution networks; and 4) the scheduling of the optimal timing characteristics of clock distribution networks based on architectural and functional performance requirements. Each of these areas is described the clock distribution networks of specific industrial circuits are surveyed and future trends are discussed 相似文献

19.

Investigation Into the Scalability of Selectively Implanted Buried Subcollector (SIBS) for Submicrometer InP DHBTs

James Chingwei Li Yakov Royter Tahir Hussain Mary Y. Chen Charles H. Fields Rajesh D. Rajavel Steven S. Bui Binqiang Shi Donald A. Hitko David H. Chow Peter M. Asbeck Marko Sokolich 《Electron Devices, IEEE Transactions on》2007,54(3):398-409

Recent attempts to achieve 400 GHz or higher f_T and f _MAX with InP heterojunction bipolar transistors (HBTs) have resulted in aggressive scaling into the deep submicrometer regime. In order to alleviate some of the traditional mesa scaling rules, several groups have explored selectively implanted buried subcollectors (SIBS) as a means to decouple the intrinsic and extrinsic collector design. This allows tau_C to be minimized without incurring a large total C_BC increase, and hence, a net improvement in f_T and f_MAX is achieved. This paper represents the first investigation into the series resistance and capacitance characteristics of submicrometer-width SIBS regions (as narrow as 350 nm) for InP double HBTs. Although the SIBS resistance is higher than that of epitaxially grown layers, the SIBS concept is able to provide good dopant activation and a significant decrease in C_BC. S-parameter measurements are presented to clarify the impact of SIBS geometry variations, caused by both intentional device design and process variations, on f_T and f_MAX. Parasitic resistances and high background doping limit the f_T improvement, but the C_BC reduction is sufficient to demonstrate a 30% increase in f_MAX. Results indicate that further improvements in f_T and f_MAX using the SIBS concept will be possible 相似文献

20.

Design and Optimization of On-Chip Interconnects Using Wave-Pipelined Multiplexed Routing

Joshi A.J. Lopez G.G. Davis J.A. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2007,15(9):990-1002

Every new VLSI technology generation has resulted in interconnects increasingly limiting the performance, area, and power dissipation of new processors. Subsequently, it is necessary to devise efficient interconnect design techniques to reduce the impact of VLSI interconnects on overall system design. New optimizations of a wave-pipelined multiplexed (WPM) interconnect routing circuit are described in this paper. These WPM circuits can be used with current interconnect repeater circuits to further reduce interconnect delay, interconnect area, transistor area, and/or power dissipation. For example, new area constrained WPM circuit optimizations illustrate that the interconnect circuit power can be reduced by 26% or the interconnect performance can be improved by 74%. Moreover, in both these cases, because a significant number of repeaters are eliminated, the transistor area can reduce by 41% or 29%, respectively. Finally, the tolerance of WPM circuits to crosstalk noise, power supply noise, clock skew, and manufacturing variations is also presented. This study of tolerance levels defines the conditions under which the WPM circuit will function correctly, and it is shown in this paper for the first time that WPM circuits are robust enough to operate with variability that can be encountered in deep submicrometer technologies. 相似文献