Found 20 similar documents (search time: 31 ms)
1.
Short-range parallel optical interconnect between integrated circuits can alleviate bandwidth, power, and packaging density issues that are associated with low-latency high-bandwidth input-output over electrical interconnect. In this paper, we evaluate the option of using true source-synchronous signaling over optical interconnect with a large number of channels, reducing the substantial per-channel clock synchronization circuitry to one instance. We also look into dc-unbalanced signaling to remove the need for data coding. Uniformity across channels is key to the feasibility of such an approach. An actual 64-channel parallel optical interconnect setup at 1.25 Gb/s/channel is examined, and models for the performance and uniformity of the different constituent parts of the interconnect are drawn up. Major attention is given to the statistical modeling of the coupling efficiency between a vertical cavity surface emitting laser array and a multifiber connector. Although derived in the context of a uniformity study, the stochastic models and the modeling approach are valuable in their own right. In our case study, the usage of a common logic threshold across all channels, which is required for dc-unbalanced signaling, appears infeasible after all models are combined. Efficient true source-synchronous signaling turns out to be within reach in carefully designed systems.
2.
《Electron Devices, IEEE Transactions on》2009,56(9):1787-1798
3.
Jaussi J.E. Balamurugan G. Johnson D.R. Casper B. Martin A. Kennedy J. Shanbhag N. Mooney R. 《Solid-State Circuits, IEEE Journal of》2005,40(1):80-88
A source-synchronous I/O link with adaptive receiver-side equalization has been implemented in 0.13-µm bulk CMOS technology. The transceiver is optimized for small area (360 µm × 360 µm) and low power (280 mW). The analog equalizer is implemented as an 8-way interleaved, 4-tap discrete-time linear filter. The equalization improved the data rate of a 102 cm backplane interconnect by 110%. On-die adaptive logic determines optimal receiver settings through comparator offset cancellation, data alignment of the transmitter and receiver, clock de-skew and setting filter coefficients for equalization. The noise-margin degradation due to statistical variation in converged coefficient values was less than 3%.
4.
This paper presents a comparative performance analysis of the impact of aging mechanisms on various flip-flops in CMOS and FinFET technologies. We consider the effects of Bias Temperature Instability (BTI) and Hot Carrier Injection (HCI) on the robustness of high-performance flip-flops. To apply the BTI and HCI aging mechanisms, we use a long-term model to estimate ΔVth and employ the updated Vth in the transistor model file. The simulation results rank the flip-flops by speed and power consumption in both CMOS and FinFET technologies and confirm the superiority of static FinFET flip-flops over CMOS flip-flops. In addition, a comparative analysis of different FinFET flip-flop structures under temperature and VDD variations demonstrates that the average percentage degradation of TDQmin and PDP due to aging is significantly less than in comparable CMOS flip-flops.
5.
6.
《Solid-State Circuits, IEEE Journal of》2001,36(10):1565-1573
This paper presents the design of the Itanium™ processor's system bus interface, which achieves a peak data bandwidth of 2.1 GB/s in a glueless four-way multiprocessing system. A source-synchronous data bus with differential strobes enables this high bandwidth. Topics covered in this paper include optimization techniques for the system topology, CPU package, signaling protocol, and I/O circuits. Highly accurate modeling and validation methodologies enable a good correlation of experimental results with simulation data.
7.
Integration of partial scan and built-in self-test (total citations: 2; self-citations: 0; citations by others: 2)
Partial-Scan based Built-In Self-Test (PSBIST) is a versatile Design for Testability (DFT) scheme, which employs pseudo-random BIST at all levels of test to achieve fault coverages greater than 98% on average, and supports deterministic partial scan at the IC level to achieve nearly 100% fault coverage. PSBIST builds its BIST capability on top of a partial scan structure by adding a test pattern generator, an output data compactor, and a PSBIST controller in a way similar to that of deriving a full scan BIST from a full scan structure. However, to make the scheme effective, there is a minimum requirement regarding which flip-flops in the circuit should be replaced by scan flip-flops and/or initialization flip-flops. In addition, test points are usually added to boost the fault coverage to the range of 95 to 100 percent. These test points are selected based on a novel probabilistic testability measure, which can be computed extremely fast for a special class of circuits. This class of circuits is precisely the type of circuits that we obtain after replacing some of the flip-flops with scan and/or initialization flip-flops. The testability measure is also used for a very useful quick estimation of the fault coverage right after the selection of scan flip-flops, even before the circuit is modified to incorporate PSBIST capability. While PSBIST provides all the benefits of BIST, it incurs lower area overhead and performance degradation than full scan. The area overhead is further reduced when the boundary scan cells are reconfigured for BIST usage.
8.
《IEEE transactions on circuits and systems. I, Regular papers》2006,53(9):1928-1933
This paper proposes a bus architecture which improves the performance and/or power dissipation of online buses. The proposed architecture reduces the delay on alternate lines by lowering the threshold voltage of its devices. Furthermore, the shifting of the signal switching on adjacent lines reduces the worst case coupling capacitance. Two implementations of this bus architecture are proposed, the alternate-$V_t$ and the alternate forward body biased schemes, and are compared to a conventional bus scheme. For a flop distance of 1800 $\mu$m, the proposed schemes use the gained delay slack to reduce the total device width, thereby reducing the energy dissipation by 31.2%. For a 500-ps cycle time, the proposed bus schemes increase the maximum distance between flip-flops by 33%.
9.
10.
Chuan Lin Hai Zhou 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2005,13(12):1340-1348
In systems-on-chip (SOCs), a nonnegligible part of operation time is spent on global wires with long delays. Retiming, that is, moving flip-flops in a circuit without changing its functionality, can be explored to pipeline long interconnect wires in SOC designs. The problem of retiming over a netlist of macro-blocks, where the internal structures may not be changed and flip-flops may not be inserted on some wire segments, is called the wire retiming problem. In this paper, we formulate the constraints of the wire retiming problem as a fixpoint computation and use an iterative algorithm to solve it. Experimental results show that this approach is multiple orders of magnitude more efficient than the previous one.
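The idea of solving retiming constraints by iterating to a fixpoint can be illustrated with a minimal sketch. It is not the algorithm from the paper: it uses the classical view of retiming as a system of difference constraints and covers only flip-flop-count constraints, not clock-period constraints. The names `solve_difference_constraints`, `retime`, and the `min_flops` field are hypothetical, introduced here for illustration.

```python
def solve_difference_constraints(num_vars, constraints):
    """Find x with x[a] - x[b] <= c for every (a, b, c), or return None.

    Starts from all zeros and repeatedly lowers any value that violates a
    constraint until a pass changes nothing (a fixpoint). If values are
    still changing after num_vars + 1 passes, the system is infeasible.
    """
    x = [0] * num_vars
    for _ in range(num_vars + 1):
        changed = False
        for a, b, c in constraints:
            if x[a] > x[b] + c:
                x[a] = x[b] + c
                changed = True
        if not changed:
            return x
    return None


def retime(num_nodes, edges):
    """Retime a graph given edges (u, v, flops, min_flops).

    A retiming r changes the flip-flop count on edge u -> v to
    flops + r[v] - r[u]. Requiring that count to be at least min_flops
    gives the difference constraint r[u] - r[v] <= flops - min_flops.
    Returns the edges with their new flip-flop counts, or None.
    """
    constraints = [(u, v, w - k) for u, v, w, k in edges]
    r = solve_difference_constraints(num_nodes, constraints)
    if r is None:
        return None
    return [(u, v, w + r[v] - r[u]) for u, v, w, _ in edges]


# Assumed toy circuit: a 3-node loop with two flip-flops on edge 0 -> 1.
# Edge 1 -> 2 stands for a long wire that needs at least one flip-flop.
loop = [(0, 1, 2, 0), (1, 2, 0, 1), (2, 0, 0, 0)]
retimed = retime(3, loop)
```

In this toy loop one flip-flop moves from edge 0 -> 1 onto the long wire 1 -> 2, and the total count around the loop stays at two, as retiming requires. Demanding at least one flip-flop on every edge of the same loop makes the constraints contradictory, and `retime` returns `None`.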
11.
Ghoneima M. Ismail Y. Khellah M.M. Tschanz J. De V. 《IEEE transactions on circuits and systems. I, Regular papers》2009,56(9):2020-2032
As technology scales, the shrinking wire width increases the interconnect resistivity, while the decreasing interconnect spacing significantly increases the coupling capacitance. This paper proposes reducing the number of bus lines of the conventional parallel-line bus (PLB) architecture by multiplexing every m bits onto a single line. This bus architecture, the serial-link bus (SLB), transforms an n-bit conventional PLB into an n/m-line (serial link) bus. The advantage of SLBs is that they have fewer lines, and if the bus width is kept the same, SLBs will have a larger line pitch. Increasing the line width reduces the line resistance in two ways, because the resistivity of sub-100 nm wires also drops significantly as the line width increases. Also, increasing the line width and spacing reduces the coupling capacitance between adjacent lines, but increases the line-to-ground capacitance. Thus, an optimum degree of multiplexing $m_{opt}$ and an optimum width-to-pitch ratio $\eta_{opt}$ exist, which minimize the bus energy dissipation and maximize the bus throughput per unit area. The optimum degree of multiplexing and optimum width-to-pitch ratio for maximum throughput per unit area and minimum energy dissipation for the 25-130-nm technologies were determined in this paper. Also, an encoding technique was proposed and implemented to reduce the switching activity penalty due to serialization. HSPICE simulations show that for the same throughput per unit area as conventional parallel-line data buses, the SLB architecture reduces the energy dissipation by up to 31% for a 64-bit bus implemented in an intermediate metal layer of a 50-nm technology, and a reduction of 53% is projected for a 25-nm technology.
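The line-count and pitch arithmetic in the abstract above can be sketched as follows. The function name `serial_link_bus` and the 0.2 µm pitch are assumptions made for illustration and are not taken from the paper.

```python
def serial_link_bus(n_bits, m, plb_pitch):
    """Return (line count, line pitch) of an SLB that multiplexes every
    m bits onto one line while keeping the total width of an n-bit PLB.
    """
    if m < 1 or n_bits % m != 0:
        raise ValueError("m must be a positive divisor of n_bits")
    lines = n_bits // m                # an n-bit PLB becomes n/m lines
    total_width = n_bits * plb_pitch   # bus width is kept the same
    return lines, total_width / lines  # so the pitch grows by a factor m


# Hypothetical numbers: a 64-bit bus, 4:1 multiplexing, 0.2 um PLB pitch.
lines, pitch = serial_link_bus(64, 4, 0.2)
```

With these assumed numbers the bus shrinks from 64 lines to 16, and each line gets four times the original pitch to divide between wire width and spacing.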
12.
Nor Muzlifah Mahyuddin 《Microelectronics Journal》2011,42(9):1039-1048
Market forces are continually demanding devices with increased functionality/unit area; these demands have been satisfied through aggressive technology scaling which, unfortunately, has adversely affected the global interconnect delay, subsequently reducing system performance. Line drivers have been used to mitigate the problems with delay; however, these have large power consumption. A solution to reducing the power dissipation of the drivers is to use lower supply voltages. However, by adopting a lower power supply voltage, the performance of the line drivers for global interconnects is impaired unless low-swing signalling techniques are implemented. The paper describes the design of a low-swing signalling scheme which consists of a low-swing driver, called the nLVSD driver, which is an improved version of the MJ-driver [1] designed by Juan A. Montiel-Nelson and Jose C. Garcia. Subsequently, both low-swing driver schemes are analysed and compared focusing on their power consumption and performance characteristics, which are the main issues in present day IC design. A comparison between the two driver schemes showed that the nLVSD driver exhibited a 34% improvement regarding power consumption and a 28% improvement in delay when driving a 10 mm length of interconnect. A comparison between the two schemes was also undertaken in the presence of ±3σ Process and Voltage (PV) variations. The analysis indicated that the nLVSD driver scheme was more robust than the MJ-driver with a 33% and 44% improvement with respect to power consumption and delay variations. In order to further improve the robustness of the nLVSD scheme against process variation, the scheme was further analysed to identify which process variables had the most impact on circuit delay and power consumption. For completeness, an analysis of the effects of process variation on interconnect delay and power consumption was also undertaken.
13.
Hong-Yean Hsieh Wentai Liu Paul Franzon Ralph Cavin III 《The Journal of VLSI Signal Processing》1997,16(2-3):131-147
System performance can be improved by employing scheduled skews at flip-flops. This optimization technique is called skewed-clock optimization and has been successfully used in memory designs to achieve high operating frequencies. There are two important issues in developing this optimization technique. The first is the selection of appropriate clock skews to improve system performance. The second is to reliably distribute skewed clocks in the presence of manufacturing and environmental variations. Without the careful selection of clocking times and control of unintentional clock skews, potential system performance might not be achieved. In this paper a theoretical framework is first presented for solving the problem of optimally scheduling skews. A novel self-calibrating clock distribution scheme is then developed which can automatically track variations and minimize unintentional skews. Clocks with proper skews can be reliably delivered by such a scheme.
14.
15.
Zhigang Hao Sheldon X.-D. Tan E. Tlelo-Cuautle Jacob Relles Chao Hu Wenjian Yu Yici Cai Guoyong Shi 《Analog Integrated Circuits and Signal Processing》2012,73(1):3-11
In this paper, we present a novel method for statistical inductance extraction and modeling for interconnects considering process variations. The new method, called statHenry, is based on the collocation-based spectral stochastic method where orthogonal polynomials are used to represent the statistical processes. The coefficients of the partial inductance orthogonal polynomial are computed via the collocation method where a fast multi-dimensional Gaussian quadrature method is applied with sparse grids. To further improve the efficiency of the proposed method, a random variable reduction scheme is used. Given the interconnect wire variation parameters, the resulting method can derive the parameterized closed form of the inductance value. We show that both partial and loop inductance variations can be significant given the width and height variations. This new approach can work with any existing inductance extraction tool to extract the variational partial and loop inductance or impedance. Experimental results show that our method is orders of magnitude faster than the Monte Carlo method for several practical interconnect structures.
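A one-variable sketch can show why quadrature-based collocation may need far fewer model evaluations than Monte Carlo sampling. This is not statHenry: the paper works with multi-dimensional sparse grids and an inductance extraction tool, whereas the polynomial `toy_model` and the 10% width variation below are made-up stand-ins.

```python
import math
import random

# 3-point Gauss-Hermite rule for a standard normal variable
# (probabilists' form): exact for polynomials up to degree 5.
NODES = (-math.sqrt(3.0), 0.0, math.sqrt(3.0))
WEIGHTS = (1.0 / 6.0, 2.0 / 3.0, 1.0 / 6.0)


def toy_model(xi):
    """Hypothetical response to a normalized width variation xi.

    Stands in for an expensive extraction run; the quadratic form and
    the 10% standard deviation are assumptions for illustration.
    """
    width = 1.0 + 0.1 * xi
    return width ** 2


def mean_by_quadrature(f):
    """Estimate E[f(xi)] from three evaluations at the collocation points."""
    return sum(w * f(x) for x, w in zip(NODES, WEIGHTS))


def mean_by_monte_carlo(f, samples, seed=0):
    """Estimate E[f(xi)] by averaging f over random normal samples."""
    rng = random.Random(seed)
    return sum(f(rng.gauss(0.0, 1.0)) for _ in range(samples)) / samples


quad_mean = mean_by_quadrature(toy_model)         # 3 model evaluations
mc_mean = mean_by_monte_carlo(toy_model, 10_000)  # 10,000 evaluations
```

For this toy model the exact mean is 1 + 0.1² = 1.01. The three-point rule reproduces it up to rounding error, while the Monte Carlo estimate carries sampling noise that shrinks only with the square root of the sample count.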
16.
《Electron Device Letters, IEEE》2009,30(1):14-17
17.
Mensink E. Schinkel D. Klumperink E. A. M. van Tuijl E. Nauta B. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2007,15(4):438-446
Crosstalk limits the achievable data rate of global on-chip interconnects on large CMOS ICs. This is especially the case if low-swing signaling is used to reduce power consumption. Differential interconnects provide a solution for most crosstalk and noise sources, but not for neighbor-to-neighbor crosstalk in a data bus. This neighbor-to-neighbor crosstalk can be reduced with twists in the differential interconnect pairs. To reduce via resistance and metal layer use, we use as few twists as possible by placing only one twist in every even interconnect pair and only two twists in every odd interconnect pair. Analysis shows that there are optimal positions for the twists, which depend on the termination impedances of the interconnects. Theory and measurements on a 10-mm-long bus in 0.13-µm CMOS show that only one twist at 50% of the even interconnect pairs, two twists at 30% and 70% of the odd interconnect pairs, and both a low-ohmic source and a low-ohmic load impedance are very effective in mitigating the crosstalk.
18.
19.
《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2009,17(2):306-311
20.
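The MicroBlaze core is a 32-bit RISC Harvard-architecture soft processor core embedded in Xilinx FPGAs. To interconnect Xilinx MicroBlaze soft processor cores and achieve fast communication among multiple processor cores, a hybrid interconnect of PLB and FSL buses is adopted, with the xps_mail-box and xps_mutex cores handling inter-core communication and synchronization. In an experiment on the Xilinx EDK platform, three soft processor cores were embedded in a Spartan-3E FPGA, yielding a multiprocessor-based embedded programmable system-on-chip running on the FPGA. The results show that this hybrid multicore interconnect is feasible and practical and that inter-core communication speed is improved.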