Found 20 similar documents (search time: 125 ms)
1.
Chung-Chieh Kuo Chia-Chun Tsai Trong-Yen Lee 《Integration, the VLSI Journal》2011,44(1):87-101
As IC fabrication technologies enter the nanometer era, clock routing increasingly dominates chip performance in terms of delay, cost, and power consumption. The X-architecture routes metal wires in diagonal as well as rectilinear directions, and the resulting wirelength reduction helps overcome these challenges. In this paper, we present a clock routing algorithm, called PMXF, to construct an X-architecture zero-skew clock tree with minimum delay. An X-pattern library is defined to simplify the merging procedure of the DME approach, an X-Flip technique is proposed to reduce the wirelength between paired points, and a wire sizing technique is applied to achieve zero skew. Experimental results on benchmarks show that the proposed PMXF algorithm achieves larger reductions in clock delay, wirelength, power consumption, and via count than previous X-architecture clock routing algorithms.
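As a rough illustration of what a zero-skew merging step involves (this is not the PMXF procedure, whose details the abstract does not give), the sketch below computes the classic balance point on a wire joining two subtrees under the Elmore delay model. The function names, parameter names, and the uniform per-unit wire resistance `r` and capacitance `c` are assumptions made for the example.

```python
def elmore_delay(t_sub, c_sub, wire_len, r, c):
    """Elmore delay through a wire of length wire_len that drives a subtree
    with internal delay t_sub and load capacitance c_sub."""
    return r * wire_len * (c * wire_len / 2.0 + c_sub) + t_sub


def zero_skew_split(t1, c1, t2, c2, length, r, c):
    """Return x in [0, 1] such that a merge point placed x * length away from
    subtree 1 (and (1 - x) * length away from subtree 2) equalizes the Elmore
    delay to both subtrees.

    Illustrative assumptions: r and c are per-unit-length wire resistance and
    capacitance; (t1, c1) and (t2, c2) are the delay and load of each subtree.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    x = (t2 - t1 + r * length * (c2 + c * length / 2.0)) / (
        r * length * (c1 + c2 + c * length)
    )
    if not 0.0 <= x <= 1.0:
        # The delays cannot be balanced on this wire; a router would have to
        # lengthen the wire toward the faster subtree instead.
        raise ValueError("no balance point on the wire without elongation")
    return x


if __name__ == "__main__":
    # Hypothetical values chosen only to exercise the function.
    x = zero_skew_split(t1=2.0, c1=2.0, t2=1.0, c2=3.0, length=10.0, r=0.1, c=0.2)
    left = elmore_delay(2.0, 2.0, x * 10.0, 0.1, 0.2)
    right = elmore_delay(1.0, 3.0, (1.0 - x) * 10.0, 0.1, 0.2)
    print(x, left, right)
```

The closed form follows from equating the two Elmore delays; the quadratic terms in `x` cancel, which is why a single division suffices.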
2.
3.
4.
5.
Chia-Chun Tsai Chung-Chieh Kuo Feng-Tzu Hsu Trong-Yen Lee 《Integration, the VLSI Journal》2012,45(1):76-90
Antenna effect is a phenomenon in the plasma-based nanometer process and directly influences the manufacturing yield of VLSI circuits. Because antenna-critical metal wires can accumulate enough charge to damage the thin gate oxides of the clock input ports connected by a clock tree, the standard cells or IPs cannot be driven by the clock source synchronously. For a given X-architecture clock tree that connects n clock sinks, we consider the antenna effect in the clock tree and propose a discharge-path-based antenna effect detection method. To fix the antenna violations, we use the jumper insertion technique recommended by foundries. Furthermore, we integrate the layer assignment technique to reduce the inserted jumper and via counts. Unlike existing work, the delay of vias is considered in delay calculation, and a wire sizing technique is applied for clock skew compensation after fixing the antenna violations. Experimental results on benchmarks show that our algorithm runs in O(n^2) time and, on average, inserts 48.21% fewer jumpers and uses 20.35% fewer vias than previous algorithms. Moreover, SPICE simulation further verifies the correctness of the resulting clock tree.
6.
Jun-Dong Cho Sarrafzadeh M. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》1995,3(1):84-98
We propose a new approach for optimizing clock trees, especially for high-speed circuits. Our approach gives the designer useful guidelines through user-specified parameters, and three such tradeoffs are presented in this paper. (1) First, to provide a “good” tradeoff between skew and wire length, a new clock tree routing scheme is proposed. The technique is based on a combination of hierarchical bottom-up geometric matching and minimum rectilinear Steiner tree. Our experiments complement the theoretical results. (2) For high-speed clock distribution in the transmission line mode (e.g., multichip modules) where interconnection delay dominates the clock delay, buffer congestion might exist in a layout. Using many buffers in a small wiring area results in substantial interline crosstalk as well as wirability problems when the imbalanced subtrees must be elongated. Placing buffers evenly (locally or globally) over the plane with minimum impact on wire length helps avoid buffer congestion and results in less crosstalk between clock wires. Thus, an effective technique for buffer distribution is proposed. Experimental results verify the effectiveness of the proposed algorithms. (3) Finally, a postprocessing step that constrains phase delay is also proposed. The technique is based on a combination of hierarchical bottom-up geometric matching and bounded radius minimum spanning tree. The proposed algorithm has an important application in MCM clock net synthesis as well as VLSI clock net synthesis.
7.
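Multilevel clock tree construction is the key to solving the clock routing problem. This paper proposes a new hierarchical routing strategy that considers topology generation, obstacle-avoiding DME, and buffer placement at the same time, which avoids blind routing and reduces post-processing work. First, the clock sinks are partitioned evenly in a hierarchical manner, and topology generation and DME embedding of the clock subtrees are carried out simultaneously in each local region;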
8.
Useful-Skew Clock Routing with Gate Sizing for Low Power Design (cited by: 2; self-citations: 0; citations by others: 2)
This paper presents a new problem formulation and algorithm of clock routing combined with gate sizing for minimizing total logic and clock power. Instead of zero-skew or assuming a fixed skew bound, we seek to produce useful skews in clock routing. This is motivated by the fact that only positive skew should be minimized while negative skew is useful in that it allows a timing budget larger than the clock period for gate sizing. We construct a useful-skew tree (UST) such that the total clock and logic power (measured as a cost function) is minimized. Given a required clock period and feasible gate sizes, a set of negative and positive skew bounds is generated. The allowable skews within these bounds and feasible gate sizes together form the feasible solution space of our problem. Inspired by the Deferred-Merge Embedding (DME) approach, we devise a merging segment perturbation procedure to explore various tree configurations which result in correct clock operation under the required period. Because of the large number of feasible configurations, we adopt a simulated annealing approach to avoid being trapped in a local optimal configuration. This is complemented by a bi-partitioning heuristic to generate an appropriate connection topology to take advantage of useful skews. Experimental results of our method have shown 12% to 20% total power reduction over previous methods of clock routing with zero-skew or a single fixed skew bound and separately sizing logic gates. This is achieved at no sacrifice of clock frequency.
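To make the idea of negative and positive skew bounds concrete, the sketch below derives a permissible skew interval for one pair of sequentially adjacent flip-flops from the usual textbook setup and hold constraints. It is a simplified model rather than the paper's formulation: clock-to-Q delay is assumed to be folded into the path delays, skew is assumed to be the launching flip-flop's clock arrival time minus the capturing flip-flop's, and all names are hypothetical.

```python
def permissible_skew_range(period, d_max, d_min, t_setup, t_hold):
    """Return (lower, upper) bounds on skew = t_launch - t_capture.

    Setup: t_launch + d_max + t_setup <= t_capture + period
           =>  skew <= period - d_max - t_setup
    Hold:  t_launch + d_min >= t_capture + t_hold
           =>  skew >= t_hold - d_min
    d_max and d_min are the longest and shortest combinational path delays
    between the two flip-flops (assumed to include clock-to-Q delay).
    """
    lower = t_hold - d_min
    upper = period - d_max - t_setup
    if lower > upper:
        raise ValueError("no skew satisfies both setup and hold constraints")
    return lower, upper


if __name__ == "__main__":
    # Hypothetical numbers in arbitrary time units.
    print(permissible_skew_range(period=10.0, d_max=8.0, d_min=1.0,
                                 t_setup=0.5, t_hold=0.3))
```

Under this sign convention, a negative skew lets `d_max` exceed `period - t_setup`, which matches the abstract's remark that negative skew allows a timing budget larger than the clock period.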
9.
《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2010,18(1):131-141
10.
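An effective wire sizing algorithm is proposed for clock routing. By analyzing the delay sensitivity of each branch in the clock tree, the algorithm selects the globally best wires for sizing so that the path delay of the clock tree is minimized. After delay optimization, to keep the clock skew below a given constraint, wire sizing is used to redistribute the delays of all clock sinks: the delay of the slowest clock sink is minimized while the delays of faster paths are increased appropriately, which further improves the clock tree delay. Experimental results show that the algorithm runs efficiently and that the path delay and clock skew of the clock tree are significantly improved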
11.
Mario R. Casu Mariagrazia Graziano Guido Masera Gianluca Piccinini Maurizio Zamboni 《Microelectronics Journal》2003,34(12):1175-1185
In this paper, a coupled electro-thermal model is used for the optimal design of the clock distribution tree of a high-performance microprocessor. Such an approach allows both thermal and electrical constraints to be taken into account simultaneously. In particular, timing issues such as clock delay from the root of the tree to the leaves and skew between the leaves are optimized by suitable wire and buffer sizing. At the same time, the lifetime constraints of clock wires affected by electromigration, which is enhanced by the high temperature reached in interconnects due to Joule self-heating, are checked and respected.
12.
Jonggab Kil Jie Gu Kim C.H. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2008,16(4):456-465
This paper describes an interconnect technique for subthreshold circuits to improve global wire delay and reduce the delay variation due to process-voltage-temperature (PVT) fluctuations. By internally boosting the gate voltage of the driver transistors, the operating region is shifted from the subthreshold region to the super-threshold region, enhancing performance and improving tolerance to PVT variations. Simulations of a clock distribution network using the proposed driver show a 66%-76% reduction in 3σ clock skew value and an 84%-88% reduction in clock tree delay compared to using conventional drivers. A 0.4-V test chip has been fabricated in a 0.18-μm 6-metal CMOS process to demonstrate the effectiveness of the proposed scheme. Measurement results show 2.6× faster switching speed and 2.4× less delay sensitivity under temperature variations.
13.
Qing Zhu Wayne Wei-Ming Dai 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》1996,4(2):210-226
A new concept of chip and package co-design for the clock network is presented in this paper. We propose a two-level clock distribution scheme which partitions the clock network into two levels. First, the clock terminals are partitioned into a set of clusters. For each cluster, a local on-chip clock tree is used to distribute the clock signal from a locally inserted buffer to terminals inside this cluster. The clock signal is then distributed from the main clock driver to each of the local buffers by means of a global clock tree, which is a planar tree with equal path lengths. With the flip chip area I/O attachment, the planar global clock tree can be put on a dedicated package layer. The interconnect on the package layer has a resistance two to four orders of magnitude smaller than that on the chip layer. The main contribution of this paper is a novel algorithm to construct a planar clock tree with equal path lengths: the length of the path from the clock source to each destination is exactly the same. In addition, the path length from the source to destinations is minimized.
14.
15.
A novel clock distribution scheme is proposed for high-speed DRAM to minimise clock-skew among data buffers. It ideally has a zero-skew characteristic by employing folded clock lines and phase blending circuits. Simulation results show that the maximum clock-skew between two receivers located 4 mm apart is less than 20 ps, regardless of process, voltage, and temperature variations.
16.
《Electronics letters》2001,37(13):813-814
Minimum delay associated with the hold time requirement is a concern to circuit designers, since race-through hazards are inherent to any multiple clock organisation or clock distribution tree irrespective of clock frequency. The monotonic property of domino logic aggravates minimum-delay path failures through coupling-induced speedup. To tackle the minimum-delay problem for domino logic, we propose a minimum-delay optimisation algorithm considering coupling effects. Experimental results indicate that our algorithm yields a significant increase of minimum delay without incurring maximum-delay violations.
17.
Discusses the scaling rules for VLSI that pertain to the total wire length and the clock speed. The analysis indicates that the total wire length is not increasing as rapidly as standard scaling theory would indicate. This results from over-scaling of the cell size reduction from one generation to the next (as predicted by Moore [1975]). However, the total wire length is still increasing at a rate that will cause significant power dissipation in the interconnects and indicates the need for new locally interconnected architectures. Moreover, the over-scaling of cell size reduction also raises possible limitations that arise as the cell size is reduced faster than the gate length. We also discuss the effects of scaling on on-die clock speed. While gate-array clock speeds are scaling slower than the scaling rules would predict (a problem for large multi-chip architectures), clock speeds in modern VLSI chips track the scaling rule quite accurately.
18.
The obstacle-avoiding rectilinear Steiner minimal tree (OARSMT) problem is a hot topic in very-large-scale integration physical design. In practice, most of the obstacles occupy the device layer and certain lower metal layers. Therefore, we can place wires on top of the obstacles. To maximize routing resources over obstacles, we propose a heuristic for constructing a rectilinear Steiner tree with slew constraints. Our algorithm adopts an extended rectilinear full Steiner tree grid as the routing graph. We mark two types of Steiner point candidates, which are used for constructing Steiner trees and refining solutions. A shortest path heuristic variant is designed for constructing Steiner trees, and it takes the slew constraint into account by inhibiting growth. Furthermore, we use a pre-computed strategy to avoid calculating the slew rate repeatedly. Experimental results show that our algorithm maximizes routing resources over obstacles and saves routing resources outside obstacles. Compared with the conventional OARSMT algorithm, our algorithm reduces the wire length outside obstacles by as much as 18.74% and total wire length by as much as 6.03%. Our algorithm improves on the latest related algorithm by approximately 2% in terms of wire length within a reasonable running time. Additionally, calculating the slew rate accounts for only approximately 15% of the total running time.
19.
Tai-Chen Chen Song-Ra Pan Yao-Wen Chang 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2004,12(1):28-41
As the operating frequency increases to gigahertz and the rise time of a signal is less than or comparable to the time-of-flight delay of a wire, it is necessary to consider the transmission line behavior for delay computation. In this paper, we present an analytical formula for delay computation under the transmission line model. Extensive simulations with SPICE show the high fidelity of the formula. Compared with previous works, our model leads to smaller average errors in delay estimation. Based on this formula, we show the property that the minimum delay for a transmission line with reflection occurs when the number of round trips is minimized (i.e., equals one). Besides, we show that the delay of a circuit path is a posynomial function in wire and buffer sizes, implying that a local optimum is equal to the global optimum. Thus, we can apply any efficient search algorithm such as the well-known gradient search procedure to compute the globally optimal solution. Experimental results show that simultaneous wire and buffer sizing is very effective for performance optimization under the transmission line model.