1.

Pattern-matching-based X-architecture zero-skew clock tree construction with X-Flip technique and via delay consideration

Chung-Chieh Kuo, Chia-Chun Tsai, Trong-Yen Lee 《Integration, the VLSI Journal》2011,44(1):87-101

As IC fabrication technologies enter the nanometer era, clock routing increasingly dominates chip performance as measured by delay, cost, and power consumption. X-architecture, which routes metal wires in diagonal as well as rectilinear directions, can address these challenges by reducing wirelength. In this paper, we present a clock routing algorithm, called PMXF, to construct an X-architecture zero-skew clock tree with minimum delay. An X-pattern library is defined to simplify the merging procedure of the DME approach, an X-Flip technique is proposed to reduce the wirelength between paired points, and a wire sizing technique is applied to achieve zero skew. Experimental results on benchmarks show that the proposed PMXF algorithm achieves greater reductions in clock delay, wirelength, power consumption, and via count than previous X-architecture clock routing algorithms.
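The wirelength benefit of X-architecture mentioned in this abstract can be illustrated with a minimal sketch. It is not part of PMXF; the function names are hypothetical, and the only assumption is the standard geometric fact that, with 45-degree segments allowed, the shorter of the two coordinate offsets can be covered by a diagonal run.

```python
import math

def manhattan_length(p, q):
    """Shortest wire length using only horizontal and vertical segments."""
    dx, dy = abs(p[0] - q[0]), abs(p[1] - q[1])
    return dx + dy

def x_arch_length(p, q):
    """Shortest wire length when 45-degree diagonal segments are also allowed."""
    dx, dy = abs(p[0] - q[0]), abs(p[1] - q[1])
    # Cover the shorter offset with a diagonal run and the rest with a straight run.
    return (max(dx, dy) - min(dx, dy)) + math.sqrt(2) * min(dx, dy)

# Hypothetical pair of points, chosen only for illustration.
p, q = (0.0, 0.0), (30.0, 40.0)
rect = manhattan_length(p, q)
diag = x_arch_length(p, q)
saving = 1 - diag / rect
print(f"rectilinear: {rect:.2f}, X-architecture: {diag:.2f}, saving: {saving:.1%}")
```

For a single two-point connection the saving is largest when the two offsets are equal, where it reaches 1 - √2/2, or roughly 29%; it is zero when the points are already aligned horizontally or vertically.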

2.

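Reasonable-skew-driven clock net construction and optimization (Total citations: 1, self-citations: 0, citations by others: 1)

赵萌, 蔡懿慈, 洪先龙, 刘毅 《半导体学报》2003,24(4):438-444

A new clock routing algorithm is proposed. It combines top-down and bottom-up clock tree topology generation, targets minimum clock delay and total wirelength, and applies reasonable skew in constructing the clock tree. Tests on circuits show that, compared with a zero-skew algorithm, the algorithm effectively reduces the total wirelength of the clock tree and improves its performance.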

3.

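A non-zero-skew clock net routing algorithm

孙骥, 毛军发, 李晓春 《微电子学》2005,35(3):293-296

A specific non-zero-skew clock net has advantages over a zero-skew clock net: it helps raise the clock frequency and reduces sensitivity to skew. This paper proposes a new non-zero-skew clock tree routing algorithm that combines clock node delay with clock sink location to obtain a merging strategy ordered by maximum node delay, which shortens the clock tree wirelength. Experimental results show that, compared with the typical nearest-neighbor selection merging strategy, the algorithm reduces total wirelength by 20% to 30%.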

4.

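Topology construction and optimization of clock nets with skew constraints (Total citations: 1, self-citations: 0, citations by others: 1)

刘毅, 洪先龙, 蔡懿慈 《半导体学报》2002,23(11):1228-1232

A new topology construction and optimization method is proposed. It combines the advantages of several topology construction methods, considering skew constraints globally while optimizing wirelength locally. Experimental results show that it effectively controls the skew between nodes while reducing the total wirelength of the clock routing tree.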

5.

Discharge-path-based antenna effect detection and fixing for X-architecture clock tree

Chia-Chun Tsai, Chung-Chieh Kuo, Feng-Tzu Hsu, Trong-Yen Lee 《Integration, the VLSI Journal》2012,45(1):76-90

The antenna effect is a phenomenon in plasma-based nanometer processes that directly influences the manufacturing yield of VLSI circuits. Because antenna-critical metal wires can accumulate enough charge to damage the thin gate oxides of the clock input ports connected by a clock tree, the standard cells or IPs cannot be driven synchronously by the clock source. For a given X-architecture clock tree that connects n clock sinks, we consider the antenna effect in the clock tree and propose a discharge-path-based antenna effect detection method. To fix the antenna violations, we use the jumper insertion technique recommended by foundries. Furthermore, we integrate a layer assignment technique to reduce the inserted jumper and via counts. Unlike existing works, the delay of vias is considered in delay calculation, and a wire sizing technique is applied for clock skew compensation after fixing the antenna violations. Experimental results on benchmarks show that our algorithm runs in O(n²) time and, on average, inserts 48.21% fewer jumpers and uses 20.35% fewer vias than previous algorithms. Moreover, SPICE simulation further verifies the correctness of the resulting clock tree.

6.

A buffer distribution algorithm for high-performance clock net optimization (Total citations: 1, self-citations: 0, citations by others: 1)

Jun-Dong Cho, Sarrafzadeh M. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》1995,3(1):84-98

We propose a new approach for optimizing clock trees, especially for high-speed circuits. Our approach gives the designer useful guidance through user-specified parameters, and three such tradeoffs are presented in this paper. (1) First, to provide a “good” tradeoff between skew and wire length, a new clock tree routing scheme is proposed. The technique is based on a combination of hierarchical bottom-up geometric matching and minimum rectilinear Steiner tree. Our experiments complement the theoretical results. (2) For high-speed clock distribution in the transmission line mode (e.g., multichip modules), where interconnection delay dominates the clock delay, buffer congestion might exist in a layout. Using many buffers in a small wiring area results in substantial interline crosstalk as well as wirability problems when the imbalanced subtrees must be elongated. Placing buffers evenly (locally or globally) over the plane with minimum impact on wire length helps avoid buffer congestion and results in less crosstalk between clock wires. Thus, an effective technique for buffer distribution is proposed. Experimental results verify the effectiveness of the proposed algorithms. (3) Finally, a postprocessing step that constrains phase delay is also proposed. The technique is based on a combination of hierarchical bottom-up geometric matching and bounded radius minimum spanning tree. The proposed algorithm has an important application in MCM clock net synthesis as well as VLSI clock net synthesis.

7.

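A hierarchical buffered clock net routing method

李芝燕, 姚丽红 《电路与系统学报》1999,4(2):23-29

Multi-level clock tree construction is the key to solving the clock routing problem. This paper proposes a new hierarchical routing strategy that considers topology generation, obstacle-avoiding DME, and buffer placement simultaneously, which avoids blind routing and reduces post-processing work. First, the clock sinks are partitioned hierarchically and evenly, and topology generation and DME embedding of the clock subtrees are carried out simultaneously in each local region;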

8.

Useful-Skew Clock Routing with Gate Sizing for Low Power Design (Total citations: 2, self-citations: 0, citations by others: 2)

Joe Gufeng Xi, Wayne Wei-Ming Dai 《The Journal of VLSI Signal Processing》1997,16(2-3):163-179

This paper presents a new problem formulation and algorithm for clock routing combined with gate sizing to minimize total logic and clock power. Instead of zero skew or a fixed skew bound, we seek to produce useful skews in clock routing. This is motivated by the fact that only positive skew should be minimized, while negative skew is useful in that it allows a timing budget larger than the clock period for gate sizing. We construct a useful-skew tree (UST) such that the total clock and logic power (measured as a cost function) is minimized. Given a required clock period and feasible gate sizes, a set of negative and positive skew bounds is generated. The allowable skews within these bounds and the feasible gate sizes together form the feasible solution space of our problem. Inspired by the Deferred-Merge Embedding (DME) approach, we devise a merging segment perturbation procedure to explore various tree configurations that result in correct clock operation under the required period. Because of the large number of feasible configurations, we adopt a simulated annealing approach to avoid being trapped in a locally optimal configuration. This is complemented by a bi-partitioning heuristic that generates an appropriate connection topology to take advantage of useful skews. Experimental results show 12% to 20% total power reduction over previous methods that route the clock with zero skew or a single fixed skew bound and size logic gates separately. This is achieved with no sacrifice of clock frequency.
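As a rough illustration of how a merge point can be placed to realize a prescribed skew between two subtrees, the sketch below uses the Elmore delay model on a single uniform wire. It is not the UST algorithm of the abstract: the function names, the parameter values, and the sign convention (skew defined as delay to subtree 1 minus delay to subtree 2) are assumptions made for the example.

```python
def elmore_delay(length, r, c, load_cap, sub_delay):
    """Elmore delay through a uniform wire of the given length into a subtree.

    r, c: wire resistance and capacitance per unit length.
    load_cap, sub_delay: total capacitance and internal delay of the subtree.
    """
    return r * length * (c * length / 2 + load_cap) + sub_delay

def tapping_point(L, r, c, t1, C1, t2, C2, skew=0.0):
    """Fraction x of wire length L, measured from subtree 1, at which to tap
    so that delay(subtree 1) - delay(subtree 2) equals `skew`.

    A result outside [0, 1] means the wire would have to be elongated;
    that case is not handled in this sketch.
    """
    return (t2 - t1 + skew + r * L * (C2 + c * L / 2)) / (r * L * (C1 + C2 + c * L))

# Hypothetical values, chosen only for illustration.
L, r, c = 100.0, 0.1, 0.2
t1, C1, t2, C2 = 5.0, 1.0, 8.0, 2.0

for target in (0.0, 2.0):
    x = tapping_point(L, r, c, t1, C1, t2, C2, skew=target)
    d1 = elmore_delay(x * L, r, c, C1, t1)
    d2 = elmore_delay((1 - x) * L, r, c, C2, t2)
    print(f"target skew {target}: x = {x:.4f}, achieved skew = {d1 - d2:.6f}")
```

Setting `skew=0.0` gives the familiar zero-skew tapping point; a nonzero target moves the tap toward one subtree, which is the degree of freedom a useful-skew formulation exploits.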

9.

Combinatorial Algorithms for Fast Clock Mesh Optimization

《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2010,18(1):131-141

Clock mesh has been widely used to distribute the clock signal across the chip. A clock mesh is driven by a top-level tree and a set of mesh buffers. We present fast and efficient combinatorial algorithms to simultaneously identify the candidate locations and the sizes of the buffers driving the clock mesh. We show that such sizing offers a better solution than inserting buffers of uniform size across the mesh. Due to its high redundancy, a mesh architecture offers high tolerance toward variations in clock skew. However, this redundancy comes at the expense of mesh wire length and power dissipation. Based on survivable network theory, we formulate the problem of reducing the clock mesh by retaining only those edges that are critical to maintaining redundancy. Such a formulation offers the designer the option to trade off between power and tolerance to process variations. We present efficient postprocessing techniques to reduce the size of the mesh buffers after mesh reduction. Experimental results indicate that our techniques can yield power savings of up to 28% with less than 3.3% delay penalty. We also present driver models that can help in simulating the clock mesh. Such models achieve near-HSPICE accuracy with significant speedup in run time.

10.

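An effective variable-wire-width clock routing algorithm

李芝燕, 严晓浪 《微电子学》1999,29(3):164-168

An effective variable-wire-width algorithm is proposed for clock routing. By analyzing the delay sensitivity of each branch in the clock tree, the algorithm selects the globally best wires for width adjustment so as to minimize the path delay of the clock tree. After delay optimization, to bring the clock skew below a given constraint, the delays of the clock sinks are comprehensively redistributed through wire width adjustment: the delay of the clock sink with the largest delay is minimized, while the delays of paths with smaller delay are increased appropriately, further improving the clock tree delay. Experimental results show that the algorithm runs efficiently and that the path delay and clock skew of the clock tree are significantly improved.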

11.

Coupled electro-thermal modeling and optimization of clock networks

Mario R. Casu, Mariagrazia Graziano, Guido Masera, Gianluca Piccinini, Maurizio Zamboni 《Microelectronics Journal》2003,34(12):1175-1185

In this paper, a coupled electro-thermal model is used for the optimal design of the clock distribution tree of a high-performance microprocessor. Such an approach allows both thermal and electrical constraints to be taken into account simultaneously. In particular, timing issues such as the clock delay from the root of the tree to the leaves and the skew between the leaves are optimized by suitable wire and buffer sizing. At the same time, the lifetime constraints of clock wires affected by electromigration, which is enhanced by the high temperature reached in interconnects due to Joule self-heating, are checked and respected.

12.

A High-Speed Variation-Tolerant Interconnect Technique for Sub-Threshold Circuits Using Capacitive Boosting

Jonggab Kil, Jie Gu, Kim C.H. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2008,16(4):456-465

This paper describes an interconnect technique for subthreshold circuits that improves global wire delay and reduces the delay variation due to process-voltage-temperature (PVT) fluctuations. By internally boosting the gate voltage of the driver transistors, the operating region is shifted from the subthreshold region to the super-threshold region, enhancing performance and improving tolerance to PVT variations. Simulations of a clock distribution network using the proposed driver show a 66%-76% reduction in 3σ clock skew and an 84%-88% reduction in clock tree delay compared with conventional drivers. A 0.4-V test chip has been fabricated in a 0.18-μm 6-metal CMOS process to demonstrate the effectiveness of the proposed scheme. Measurement results show 2.6× faster switching speed and 2.4× less delay sensitivity under temperature variations.

13.

Planar clock routing for high performance chip and package co-design

Qing Zhu, Wayne Wei-Ming Dai 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》1996,4(2):210-226

A new concept of chip and package co-design for the clock network is presented in this paper. We propose a two-level clock distribution scheme that partitions the clock network into two levels. First, the clock terminals are partitioned into a set of clusters. For each cluster, a local on-chip clock tree distributes the clock signal from a locally inserted buffer to the terminals inside the cluster. The clock signal is then distributed from the main clock driver to each of the local buffers by means of a global clock tree, which is a planar tree with equal path lengths. With flip chip area I/O attachment, the planar global clock tree can be placed on a dedicated package layer. The interconnect on the package layer has a resistance two to four orders of magnitude smaller than that on the chip layer. The main contribution of this paper is a novel algorithm to construct a planar clock tree with equal path lengths: the length of the path from the clock source to each destination is exactly the same. In addition, the path length from the source to the destinations is minimized.

14.

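High-speed multi-level clock net routing (Total citations: 4, self-citations: 4, citations by others: 0)

李芝燕, 严晓浪 《半导体学报》2000,21(3):290-297

A new buffered clock routing algorithm is proposed. Based on the distribution of the clock sinks, the algorithm places buffers before clock routing and tightly integrates clock tree topology generation and physical embedding with a hierarchical buffer placement method, so that the routing fully reflects the influence of the buffers on the clock net structure. Experiments show that, compared with treating buffer insertion and placement as a post-processing step, inserting and placing buffers in advance largely avoids blind routing and balances the delay and load of the clock subtrees more effectively.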

15.

Clock distribution scheme for high-speed DRAM

Kook J., Wee J.-K., Moon G., Lee S. 《Electronics letters》2002,38(13):626-627

A novel clock distribution scheme is proposed for high-speed DRAM to minimise clock skew among data buffers. It ideally has a zero-skew characteristic, achieved by employing folded clock lines and phase blending circuits. Simulation results show that the maximum clock skew between two receivers located 4 mm apart is less than 20 ps, regardless of process, voltage, and temperature variations.

16.

Coupling-aware minimum delay optimisation for domino logic circuits

《Electronics letters》2001,37(13):813-814

Minimum delay associated with the hold time requirement is a concern to circuit designers, since race-through hazards are inherent to any multiple clock organisation or clock distribution tree irrespective of clock frequency. The monotonic property of domino logic aggravates minimum-delay path failure through coupling-induced speedup. To tackle the minimum-delay problem for domino logic, we propose a minimum-delay optimisation algorithm that considers coupling effects. Experimental results indicate that our algorithm yields a significant increase in minimum delay without incurring maximum-delay violations.

17.

Scaling theory in modern VLSI

Ferry D.K., Akers L.A. 《Circuits and Devices Magazine, IEEE》1997,13(5):41-44

Discusses the scaling rules for VLSI that pertain to the total wire length and the clock speed. The analysis indicates that the total wire length is not increasing as rapidly as standard scaling theory would indicate. This results from over-scaling of the cell size reduction from one generation to the next (as predicted by Moore [1975]). However, the total wire length is still increasing at a rate that will cause significant power dissipation in the interconnects and indicates the need for new locally interconnected architectures. Moreover, the over-scaling of cell size reduction also raises possible limitations that arise as the cell size is reduced faster than the gate length. We also discuss the effects of scaling on on-die clock speed. While gate-array clock speeds are scaling more slowly than the scaling rules would predict (a problem for large multi-chip architectures), clock speeds in modern VLSI chips track the scaling rule quite accurately.

18.

A heuristic for constructing a rectilinear Steiner tree by reusing routing resources over obstacles

《Integration, the VLSI Journal》2016

The obstacle-avoiding rectilinear Steiner minimal tree (OARSMT) problem is a hot topic in very-large-scale integration physical design. In practice, most obstacles occupy the device layer and certain lower metal layers. Therefore, we can place wires on top of the obstacles. To maximize the use of routing resources over obstacles, we propose a heuristic for constructing a rectilinear Steiner tree with slew constraints. Our algorithm adopts an extended rectilinear full Steiner tree grid as the routing graph. We mark two types of Steiner point candidates, which are used for constructing Steiner trees and refining solutions. A shortest path heuristic variant is designed for constructing Steiner trees, and it takes the slew constraint into account by inhibiting growth. Furthermore, we use a pre-computation strategy to avoid calculating the slew rate repeatedly. Experimental results show that our algorithm maximizes the use of routing resources over obstacles and saves routing resources outside obstacles. Compared with the conventional OARSMT algorithm, our algorithm reduces the wire length outside obstacles by as much as 18.74% and the total wire length by as much as 6.03%. Our algorithm improves on the latest related algorithm by approximately 2% in terms of wire length within a reasonable running time. Additionally, calculating the slew rate accounts for only approximately 15% of the total running time.

19.

Timing modeling and optimization under the transmission line model

Tai-Chen Chen, Song-Ra Pan, Yao-Wen Chang 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2004,12(1):28-41

As the operating frequency increases to gigahertz and the rise time of a signal becomes less than or comparable to the time-of-flight delay of a wire, it is necessary to consider transmission line behavior in delay computation. In this paper, we present an analytical formula for delay computation under the transmission line model. Extensive simulations with SPICE show the high fidelity of the formula. Compared with previous works, our model leads to smaller average errors in delay estimation. Based on this formula, we show that the minimum delay for a transmission line with reflection occurs when the number of round trips is minimized (i.e., equals one). In addition, we show that the delay of a circuit path is a posynomial function of wire and buffer sizes, implying that a local optimum is equal to the global optimum. Thus, we can apply any efficient search algorithm, such as the well-known gradient search procedure, to compute the globally optimal solution. Experimental results show that simultaneous wire and buffer sizing is very effective for performance optimization under the transmission line model.
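The claim that a posynomial delay lets a simple local search reach the global optimum can be seen on a toy case. The sketch below does not use the transmission-line formula of the abstract; it assumes a much simpler lumped Elmore RC model of one driver and one wire of width `w`, with hypothetical parameter names and values. Under that assumption the delay has the form A·w + B/w + const, which is posynomial in `w`, and a plain ternary search over log(w) lands on the same width as the closed-form minimizer.

```python
import math

def path_delay(w, Rd, r, c, L, CL):
    """Elmore delay of a driver (resistance Rd) driving a wire of width w
    and length L into a load CL. r and c are the wire resistance and
    capacitance per unit length at unit width (assumed model)."""
    wire_res = r * L / w
    wire_cap = c * L * w
    return Rd * (wire_cap + CL) + wire_res * (wire_cap / 2 + CL)

def best_width_closed_form(Rd, r, c, CL):
    """Width at which d(delay)/dw = 0 for the model above."""
    return math.sqrt(r * CL / (Rd * c))

def best_width_search(Rd, r, c, L, CL, lo=1e-3, hi=1e3, iters=200):
    """Ternary search over log(w); valid because a posynomial is convex
    (hence unimodal) after the change of variable w = exp(y)."""
    a, b = math.log(lo), math.log(hi)
    for _ in range(iters):
        m1 = a + (b - a) / 3
        m2 = b - (b - a) / 3
        if path_delay(math.exp(m1), Rd, r, c, L, CL) < path_delay(math.exp(m2), Rd, r, c, L, CL):
            b = m2
        else:
            a = m1
    return math.exp((a + b) / 2)

# Hypothetical values, chosen only for illustration.
Rd, r, c, L, CL = 100.0, 0.1, 0.2, 1000.0, 50.0
w_exact = best_width_closed_form(Rd, r, c, CL)
w_found = best_width_search(Rd, r, c, L, CL)
print(f"closed form: {w_exact:.6f}, local search: {w_found:.6f}")
```

The search needs no good starting point, which is the practical consequence of the local-equals-global property stated in the abstract.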

20.

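Thermal placement optimization of integrated circuits using simulated annealing (Total citations: 4, self-citations: 0, citations by others: 4)

王乃龙, 戴宏宇, 周润德 《半导体学报》2003,24(4):427-432

A placement optimization method is presented that jointly considers the electrical performance metrics of an integrated circuit and the influence of thermal effects. While ensuring that traditional design objectives (such as chip area, wirelength, and delay) are not degraded, it optimizes the temperature distribution of the chip by reducing or eliminating hot spots, and thereby improves overall circuit performance. An improved simulated annealing algorithm is applied to thermal placement optimization of integrated circuits. Simulation results show that, compared with traditional placement methods, the method maintains good delay and wirelength while substantially improving the heat distribution on the chip surface.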