Similar literature
20 similar documents found
1.
2.
This paper describes a new leakage current reduction methodology that achieves a statistical reduction in leakage current in active mode as well as in sleep mode. The proposed scheme exploits the temporal locality of a circuit block's activation probability, much as cache memories exploit locality of reference. The leakage cut-off switch is operated by a self-timed sleep timer, which puts the block into sleep mode. By waiting for a certain number of cycles before entering sleep mode, the power overhead associated with the sleep and wake-up process is optimized, and its conditional probability is also analyzed. The effectiveness of the proposed scheme is verified by an 8-bit RISC microprocessor using Verilog HDL with real firmware, and demonstrated by a 64-bit carry-look-ahead adder with the self-cut-off switch fabricated with dual-threshold voltage SOI technology. The criterion for the effectiveness of the proposed scheme is also discussed.
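
The wait-then-sleep policy in this abstract can be illustrated with a small behavioral model. This is only a sketch under assumptions: the names `simulate_sleep_timer`, `wait_cycles`, `leak_per_cycle` and `wake_cost` are hypothetical, the energy units are arbitrary, and the abstract does not give the paper's actual timer design or cost model.

```python
from dataclasses import dataclass


@dataclass
class SleepTimerResult:
    sleep_cycles: int  # cycles spent with the leakage cut-off switch open
    wake_ups: int      # number of sleep -> active transitions


def simulate_sleep_timer(activity, wait_cycles):
    """Simulate an idle-timeout sleep policy for one circuit block.

    activity: iterable of 0/1 per cycle (1 = the block is used this cycle).
    The block stays powered for `wait_cycles` consecutive idle cycles,
    sleeps from the next idle cycle on, and wakes at the next active cycle.
    """
    idle_run = 0
    asleep = False
    sleep_cycles = 0
    wake_ups = 0
    for active in activity:
        if active:
            if asleep:
                wake_ups += 1
                asleep = False
            idle_run = 0
        else:
            idle_run += 1
            if idle_run > wait_cycles:
                asleep = True
            if asleep:
                sleep_cycles += 1
    return SleepTimerResult(sleep_cycles, wake_ups)


def net_energy_saved(result, leak_per_cycle, wake_cost):
    """Leakage avoided while asleep minus the overhead paid for wake-ups."""
    return result.sleep_cycles * leak_per_cycle - result.wake_ups * wake_cost
```

In this model, a larger `wait_cycles` reduces the number of wake-ups at the cost of fewer sleep cycles, which is the trade-off the abstract describes as being optimized.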

3.
A 135K-transistor, uniformly pipelined 50-MHz CMOS 64-bit floating-point arithmetic processor chip is described. The execution unit is capable of sustaining pipelined performance of one 32-bit or 64-bit result every 20 ns for all operations except double-precision multiply (40 ns) and divide. The chip employs an exponent difference prediction scheme and a unified leading-one and sticky-bit computation logic for the addition and subtraction operations. A hardware multiplier using a radix-8 modified Booth algorithm and a divider using a radix-2 SRT algorithm are employed.
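
As an illustration of the radix-8 modified Booth algorithm named in this abstract, the following sketch recodes a two's-complement multiplier into digits in the range -4..4, one digit per three bits. It is the textbook recoding, not the chip's circuit; the function names `booth_radix8_digits` and `booth_multiply` are hypothetical.

```python
def booth_radix8_digits(y, width):
    """Recode a `width`-bit two's-complement integer into radix-8 Booth digits.

    Returns digits in [-4, 4], least significant first, such that
    sum(d * 8**j) == y. Each digit is formed from four overlapping bits.
    """
    if not -(1 << (width - 1)) <= y < (1 << (width - 1)):
        raise ValueError("y does not fit in the given width")

    def bit(i):
        # Bit -1 is defined as 0; Python's >> sign-extends negative ints,
        # so bits above width-1 repeat the sign bit.
        return 0 if i < 0 else (y >> i) & 1

    n_digits = -(-width // 3)  # ceil(width / 3)
    return [
        -4 * bit(3 * j + 2) + 2 * bit(3 * j + 1) + bit(3 * j) + bit(3 * j - 1)
        for j in range(n_digits)
    ]


def booth_multiply(x, y, width):
    """Multiply x by y by summing one partial product per radix-8 digit."""
    return sum(d * x * 8 ** j for j, d in enumerate(booth_radix8_digits(y, width)))
```

Radix-8 recoding produces roughly a third as many partial products as there are multiplier bits, at the price of needing the multiple 3x, which cannot be obtained by shifting alone.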

4.
A truly modular and power-scalable architecture for low-power programmable frequency dividers is presented. The architecture was used in the realization of a family of low-power fully programmable divider circuits, which consists of a 17-bit UHF divider, an 18-bit L-band divider, and a 12-bit reference divider. Key circuits of the architecture are 2/3 divider cells, which share the same logic and the same circuit implementation. The current consumption of each cell can be determined with a simple power optimization procedure. The implementation of the 2/3 divider cells is presented, the power optimization procedure is described, and the input amplifiers are briefly discussed. The circuits were processed in a standard 0.35 μm bulk CMOS technology, and work with a nominal supply voltage of 2.2 V. The power efficiency of the UHF divider is 0.77 GHz/mW, and of the L-band divider, 0.57 GHz/mW. The measured input sensitivity is >10 mV rms for the UHF divider, and >20 mV rms for the L-band divider.
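
The way a chain of 2/3 divider cells yields a programmable ratio can be sketched numerically. The formula below is an assumption not stated in the abstract: it supposes that each cell divides by 2 and swallows one extra period of its own input once per output cycle when its programming bit is set, so a chain of n cells gives the ratio 2^n + p_0 + 2 p_1 + ... + 2^(n-1) p_(n-1). The function names are hypothetical.

```python
def division_ratio(p_bits):
    """Division ratio of a chain of 2/3 cells under the assumption above.

    p_bits[i] is the programming bit of cell i; cell 0 is the cell
    driven directly by the input clock.
    """
    n = len(p_bits)
    return (1 << n) + sum(p << i for i, p in enumerate(p_bits))


def programming_bits(ratio, n_cells):
    """Inverse mapping: programming bits that realize `ratio` with n_cells cells."""
    lo = 1 << n_cells
    hi = (1 << (n_cells + 1)) - 1
    if not lo <= ratio <= hi:
        raise ValueError(f"ratio must lie in [{lo}, {hi}] for {n_cells} cells")
    offset = ratio - lo
    return [(offset >> i) & 1 for i in range(n_cells)]
```

Under this assumption, a chain of n cells covers the ratios 2^n through 2^(n+1) - 1; for example, three cells cover 8 through 15.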

5.
This paper describes the design and hardware results of a scannable pulse-to-static conversion register array for self-timed circuits. The circuits include a self-timed control circuit and a 64-bit register array, both designed utilizing self-resetting CMOS (SRCMOS) circuit techniques. The self-timed feature of the control block allows it to require only one system clock input. The evaluation, reset, and write-enable controls are all generated within the control macro. The register array is a level-sensitive scan design, which is compatible and complies with SRCMOS test modes. This type of register array can facilitate the synchronous/asynchronous interfaces, pipelined operation, power management, and testing of advanced digital systems employing a mixture of static and dynamic circuits to achieve low power and high performance.

6.
A byte-slice datapath for exploring multi-chip RISC processor development in AlGaAs-GaAs heterojunction bipolar transistor (HBT) technology has been designed, fabricated and tested. The circuits are implemented using differential current-mode logic (CML) and emitter-coupled logic (ECL) with signal swings of 250 mV. Each datapath chip contains a single slice, including an 8-bit by 32-word single-port register file with a 230-ps read access time, and an 8-bit carry-select adder with a 140-ps select path and a 380-ps ripple-carry path. Each unpackaged die was tested using an at-speed boundary scan test scheme. The register file and adder carry chain are also implemented in a special test chip for accurate performance characterization of these critical circuits.
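
The distinction between the select path and the ripple-carry path of a carry-select adder can be seen in a bit-level model. This is a minimal sketch: the split into two 4-bit blocks and the names `ripple_add` and `carry_select_add8` are assumptions made for illustration, since the abstract does not describe the adder's internal partitioning.

```python
def ripple_add(a, b, carry_in, width):
    """Bit-serial ripple-carry addition; returns (sum, carry_out)."""
    total = 0
    carry = carry_in
    for i in range(width):
        x = (a >> i) & 1
        y = (b >> i) & 1
        total |= (x ^ y ^ carry) << i
        carry = (x & y) | (carry & (x ^ y))
    return total, carry


def carry_select_add8(a, b, carry_in=0):
    """8-bit carry-select addition built from two 4-bit blocks."""
    mask = 0xF
    # Ripple-carry path: the low block computes its sum and carry-out.
    lo_sum, lo_carry = ripple_add(a & mask, b & mask, carry_in, 4)
    # The high block is computed for both possible carry-ins ahead of time.
    hi_a, hi_b = (a >> 4) & mask, (b >> 4) & mask
    hi_if_0 = ripple_add(hi_a, hi_b, 0, 4)
    hi_if_1 = ripple_add(hi_a, hi_b, 1, 4)
    # Select path: the low block's carry only steers a multiplexer.
    hi_sum, carry_out = hi_if_1 if lo_carry else hi_if_0
    return (hi_sum << 4) | lo_sum, carry_out
```

Because both candidate results for the high block are ready before the low carry arrives, the carry only has to pass through the final selection rather than ripple through the high bits.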

7.
Multiplier and divider circuits are usually required in the fields of analog signal processing and parallel-computing neural or fuzzy systems. In particular, this paper focuses on the hardware implementation of fuzzy controllers, where the divider circuit is usually the bottleneck. Multiplier/divider circuits can be implemented with a combination of A/D and D/A converters. An efficient design based on current-mode data converters is presented herein. Continuous-time algorithmic converters are chosen to reduce the control circuitry and to obtain a modular design based on a cascade of bit cells. Several circuit structures to implement these cells are presented and discussed. The selected structure offers a better speed/power trade-off than others previously reported in the literature while maintaining a low area occupation. The resulting multiplier/divider circuit offers low-voltage operation, provides the division result in both analog and digital formats, and is suitable for low- or medium-resolution applications (up to 9 bits) such as fuzzy controllers. The analysis is illustrated with Hspice simulations and experimental results from a CMOS multiplier/divider prototype with 5-bit resolution. Experimental results from a CMOS current-mode fuzzy controller chip that contains the proposed design are also included.

8.
This paper describes an investigation of potential advantages and pitfalls of applying an asynchronous design methodology to an advanced microprocessor architecture. A prototype complex instruction set length decoding and steering unit was implemented using self-timed circuits. [The Revolving Asynchronous Pentium® Processor Instruction Decoder (RAPPID) design implemented the complete Pentium II® 32-bit MMX instruction set.] The prototype chip was fabricated on a 0.25 μm CMOS process and tested successfully. Results show significant advantages - in particular, performance of 2.5-4.5 instructions per nanosecond - with manageable risks using this design technology. The prototype achieves three times the throughput and half the latency, dissipating only half the power and requiring about the same area as the fastest commercial 400 MHz clocked circuit fabricated on the same process.

9.
The parasitic bipolar leakage and the large subthreshold leakage due to high floating-body voltage reduce the noise margin and increase the delay of circuits in partially depleted silicon-on-insulator (PD/SOI) technology. Differential cascode voltage switch logic (DCVSL) has circuit topologies susceptible to these leakage currents. In this paper, we propose a new circuit style to effectively handle the leakage problems in PD/SOI DCVSL. The proposed low-swing DCVSL (LS-DCVSL) uses a small internal swing to prevent the body of the evaluation transistors from being charged to a high voltage and, hence, suppresses the leakages in DCVSL. Simulation results show that the proposed LS-DCVSL five-input XOR circuit is 33% faster than the DCVSL five-input XOR circuit. In addition, the proposed circuit does not experience noise margin reduction due to pass-gate leakage.

10.
Efficient self-timed adders are usually realized using dynamic differential cascode voltage switch logic. This allows end-of-computation to be detected easily, but it makes circuit design and testing very complex, requiring full-custom layouts and leading to a long time to market. This paper presents a new 56-bit high-speed self-timed adder realized with conventional AMS 0.35 μm CMOS standard cells. The proposed circuit uses overlapped execution circuits, which exploit the initialization time that always elapses between two consecutive addition operations. Compared to several self-timed adders in the literature, the proposed addition circuit shows clear advantages in terms of speed, silicon area occupancy and power dissipation.

11.
Asynchronous or self-timed systems that do not rely on a global clock to keep system components synchronized can offer significant advantages over traditional clocked circuits in a variety of applications. In order to ease the complexity of this style of design, however, suitable self-timed circuit primitives must be available to the system designer. This article describes a technique for building self-timed circuits and systems using a library of circuit primitives implemented using Actel field programmable gate arrays (FPGAs). The library modules use a two-phase transition signaling protocol for control signals and a bundled protocol for data signals. A first-in first-out (FIFO) buffer and a simple routing chip are presented as examples of building self-timed circuits using FPGAs. This work was supported in part by NSF award MIP-9111793.

12.
This work presents a reconfigurable mixed-signal system-on-chip (SoC), which integrates switched-capacitor-based field programmable analog arrays (FPAA), an analog-to-digital converter (ADC), a digital-to-analog converter (DAC), a digital down converter, a digital up converter, a 32-bit reduced instruction-set computer central processing unit (CPU) and other digital IPs on a single chip in 0.18 μm CMOS technology. The FPAA intellectual property can be reconfigured into different functional circuits, such as a gain amplifier, divider, or sine generator. This single-chip integrated mixed-signal system is a complete modern signal processing system, occupying a die area of 7 × 8 mm² and consuming 719 mW with a clock frequency of 150 MHz for the CPU and 200 MHz for the ADC/DAC. The SoC can help customers shorten design cycles, save board area, reduce system power consumption and lower system integration risk, giving it broad application prospects in wireless communication.

13.
A single-chip silicon-gate LSI device is described which interfaces a microprocessor to a capacitive keyboard. The LSI circuit replaces a large number of MSI devices and substitutes a unique digital key detection scheme for the more traditional differential amplifiers. On chip are an internal oscillator, a clock generator, key matrix scanning and detection circuits, a digital filter, input sensing circuits with hysteresis, multiple key rollover electronics, and handshaking logic to interface asynchronously with any standard 8-bit microprocessor such as the 8080. Off chip is only an RC network for the purpose of setting the oscillator frequency.

14.
A 1-V WLAN IEEE 802.11a CMOS transceiver integrates all building blocks on a single chip including a transformer-feedback VCO and a stacked divider for the frequency synthesizer and 8-bit IQ ADCs and 8-bit IQ DACs. Fabricated in a 0.18-μm CMOS process and operated at a single 1-V supply, the receiver and the transmitter consume 85.7 mW and 53.2 mW, including the frequency synthesizer, respectively. The total chip area with pads is 12.5 mm².

15.
This work presents a design flow for asynchronous, self-timed dual-rail circuits which introduces a timing assumption in the return-to-spacer phase. The design flow enables power proportionality and is demonstrated through the design of a 32-bit ripple-carry adder and a 32-bit comparator for internet of things applications. The designs are synthesized to a 65 nm cell library with state-of-the-art transistor sizing for subthreshold. Simulation results show improved performance and energy per computation across operating conditions compared with single-rail equivalents. The design flow allows extension of the power proportional philosophy to a wider range of circuits.
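
The dual-rail encoding and spacer state that this abstract refers to can be sketched as follows. The sketch assumes the common convention in which each bit is carried on a (true rail, false rail) pair and the all-zero pair is the spacer; the abstract does not specify the paper's encoding, and the names `encode`, `decode`, `is_valid` and `is_spacer` are hypothetical. The paper's timing assumption for the return-to-spacer phase is not modeled here.

```python
SPACER = (0, 0)  # both rails low: no data present


def encode(value, width):
    """Encode an integer as (true_rail, false_rail) pairs, LSB first."""
    return [((value >> i) & 1, 1 - ((value >> i) & 1)) for i in range(width)]


def is_valid(word):
    """Completion detection: every bit has exactly one rail high."""
    return all(t ^ f for t, f in word)


def is_spacer(word):
    """True when every bit pair has returned to the spacer state."""
    return all(pair == SPACER for pair in word)


def decode(word):
    """Recover the integer from a fully valid dual-rail word."""
    if not is_valid(word):
        raise ValueError("word is not a complete dual-rail codeword")
    return sum(t << i for i, (t, _) in enumerate(word))
```

A word in which only some bits have arrived is neither valid nor a spacer, which is what lets a receiver detect completion without a clock.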

16.
This paper presents a mixed-signal programmable chip for high-speed vision applications. It consists of an array of processing elements, arranged to operate in accordance with the principles of single instruction multiple data (SIMD) computing architectures. This chip, implemented in a 0.35-μm fully digital CMOS technology, contains ∼3.75 M transistors and exhibits peak performance figures of 330 GOPS (8-bit equivalent giga-operations per second), 3.6 GOPS/mm² and 82.5 GOPS/W. It includes structures for image acquisition and for image processing, meaning that it does not require a separate imager for operation. At the sensory side, integration and log-compression sensing circuits are embedded, thus allowing the chip to handle a large variety of illumination conditions. At the processing plane, analog and digital circuits are employed whose parameters can be programmed and their architecture reconfigured for the realization of software-coded processing algorithms. The chip provides, and accepts, 8-bit digitized data through a 32-bit bidirectional data bus which operates at 120 MB/s. Experimental results show that frame rates of 1000 frames per second (FPS) can be achieved under room illumination conditions; applications using exposures of about 50 μs have been recently reached by using special illumination setups. The chip can capture an image, run approximately 150 two-dimensional linear convolutions, and download the result in 8-bit digital format, in less than 1 ms. This feature, together with the possibility of executing sequences of user-definable instructions (stored on a full-custom 32-kb on-chip memory), and storing intermediate results (up to 8 grayscale images) makes the chip a true general-purpose sensory/processing device.

17.
This paper deals with a new approach to the design of high-performance asynchronous pipelined datapaths. A novel methodology to implement the self-timed stages of a datapath is demonstrated. It is based on the use of both static and dynamic CMOS modules. The former act as overlapped execution circuits and anticipate their computation with respect to the dynamic blocks. An appropriate four-phase protocol able to orchestrate the proposed architecture and a new efficient handshake circuit are described. Applied to a 32-bit addition stage, the method yields a performance gain of up to about 40% and a reduction in power dissipation of about 33%, with a reasonable area overhead compared with conventional designs.

18.
A 32×32-bit multiplier using multiple-valued current-mode circuits has been fabricated in 2-μm CMOS technology. For the multiplier based on the radix-4 signed-digit number system, 32×32-bit two's complement multiplication can be performed with only three-stage signed-digit full adders using a binary-tree addition scheme. The chip contains about 23600 transistors and the effective multiplier size is about 3.2×5.2 mm², which is half that of the corresponding binary CMOS multiplier. The multiply time is less than 59 ns. The performance is considered comparable to that of the fastest binary multiplier reported.
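
The property that makes signed-digit adders attractive in a binary-tree scheme is carry-free addition, which can be sketched for radix 4. The digit set -3..3 and the transfer rule below are the textbook choice and are assumptions here, since the abstract does not give the chip's digit set or circuit; `sd4_add` and `sd4_value` are hypothetical names.

```python
def sd4_value(digits):
    """Integer value of radix-4 signed digits, least significant first."""
    return sum(d * 4 ** i for i, d in enumerate(digits))


def sd4_add(x_digits, y_digits):
    """Carry-free addition of radix-4 signed-digit numbers (digits in -3..3).

    Each position emits a transfer t in {-1, 0, 1} and an interim digit w
    in -2..2 that depend only on that position's two operand digits, so the
    transfer never propagates further than one position.
    """
    n = max(len(x_digits), len(y_digits))
    x = list(x_digits) + [0] * (n - len(x_digits))
    y = list(y_digits) + [0] * (n - len(y_digits))
    result = []
    transfer_in = 0
    for i in range(n):
        z = x[i] + y[i]           # z is in -6..6
        if z >= 3:
            t, w = 1, z - 4       # w is in -1..2
        elif z <= -3:
            t, w = -1, z + 4      # w is in -2..1
        else:
            t, w = 0, z           # w is in -2..2
        result.append(w + transfer_in)  # stays within -3..3
        transfer_in = t
    result.append(transfer_in)
    return result
```

The loop is sequential only because this is software; since each transfer is a function of a single digit position, hardware can evaluate all positions at once, so the addition delay does not grow with word length.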

19.
Recent research has demonstrated that for certain types of applications like sampled audio systems, self-timed circuits can achieve very low power consumption, because unused circuit parts automatically turn into a stand-by mode. Additional savings may be obtained by combining the self-timed circuits with a mechanism that adaptively adjusts the supply voltage to the smallest possible, while maintaining the performance requirements. This paper describes such a mechanism, analyzes the possible power savings, and presents a demonstrator chip that has been fabricated and tested. The idea of voltage scaling has been used previously in synchronous circuits, and the contributions of the present paper are: 1) the combination of supply scaling and self-timed circuitry which has some unique advantages, and 2) the thorough analysis of the power savings that are possible using this technique.

20.
We present a scalable high-speed divide-by-N frequency divider using only basic digital CMOS circuits. The divider achieves high-speed operation using a novel parallel counter and a pipelined architecture. The parallel counter is based on a state look-ahead component in conjunction with an internal pipeline structure in order to simultaneously trigger all state value updates without a rippling effect. Pipeline latencies are avoided by a subtractor circuit that “swallows” any additional cycles. Furthermore, our frequency divider is easily scalable to large divider widths due to its modular component architecture. The fan-in and fan-out are independent of the divider width, thus making the structure attractive for regular VLSI implementation and continued technology scaling. We implemented our proposed divider using a 0.15-μm TSMC digital cell library and achieved a maximum operating frequency of 2 GHz, an area of 112 848 μm² (900 transistors), and a power consumption of 15.47 mW at 2 GHz for an 8-bit design, which offers 252 different frequency divisions.
