首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
2.
In most low-power VLSI designs, the supply voltage is usually reduced to lower the total power consumption. However, the device speed will be degraded as the supply voltage goes down. In this paper, we propose new algorithmic-level techniques to compensate the increased delays based on the multirate approach. We apply the technique of polyphase decomposition to design low-power transform coding architectures, in which the transform coefficients are computed through decimated low-speed input sequences. Since the operating frequency is M-times slower than the original design while the system throughput rate is still maintained, the speed penalty can be compensated at the architectural level. We start with the design of low-power multirate discrete cosine transform (DCT)/inverse discrete cosine transform (IDCT) VLSI architectures. Then the multirate low-power design is extended to the modulated lapped transform (MLT), extended lapped transform (ELT), and a unified low-power transform coding architecture. Finally, we perform finite-precision analysis for the multirate DCT architectures. The analytical results can help us to choose the optimal wordlength for each DCT channel under required signal-to-noise ratio (SNR) constraint, which can further reduce the power consumption at the circuit level. The proposed multirate architectures can also be applied to very high-speed block discrete transforms in which only low-speed operators are required  相似文献   

3.
HMOS-CMOS, a new high-performance bulk CMOS technology, is described. This technology builds on HMOS II, and features high resistivity p-substrate, diffused n-well and scaled n- and p-channel devices of 2-/spl mu/m channel length and 400-/spl Aring/ gate oxide thickness. The aggressive scaling of n and p devices results in 350-ps minimum gate delay and 0.04-pJ power delay product. HMOS-CMOS is a single poly technology suitable for microprocessor and static RAM applications. A 4K static RAM test vehicle is described featuring fully CMOS six-transistor memory cell, a chip size of 19600 mil/SUP 2/, 75 /spl mu/W standby power, data retention down to a V/SUB cc/ voltage of 1.5 V and a minimum chip select and address access time of 25 ns.  相似文献   

4.
This paper presents a new approach for energy reduction and speed improvement of multiport SRAMs. The key idea is to use current-mode for both read and write operations. To toggle a memory cell, a very small voltage swing is first created on the high-capacitive bit lines. This voltage is then translated into a differential current being injected into the cell, which in turn allows complementary potential to be developed on the cell nodes. As compared to the conventional write approach, SPICE simulations using a 0.35-μm CMOS process have shown 2.8 to 9.9× in energy savings and 1.02 to 6.36× reduction in delay, for memory sizes of 32 to 1 K words. We also present a current-mode sense-amplifier that operates in a similar fashion as the write circuit. The design and implementation of a pipelined 32×64 three-port register file utilizing the proposed technique is described. Measurements of the register file chip have confirmed the feasibility of the approach  相似文献   

5.
郑吉君 《信息技术》2011,(5):84-86,90
基于运动估计的全搜索算法,人们提出了有预处理的脉动阵列硬件结构,相较于无功耗处理的原始脉动阵列减少了12.3%。在此基础上,进一步根据逻辑产生门控时钟以及停止寄存器链对电路进行管理,取得了69.2%的功耗节省。  相似文献   

6.
NEDA: a low-power high-performance DCT architecture   总被引:4,自引:0,他引:4  
Conventional distributed arithmetic (DA) is popular in application-specific integrated circuit (ASIC) design, and it features on-chip ROM to achieve high speed and regularity. In this paper, a new DA architecture called NEDA is proposed, aimed at reducing the cost metrics of power and area while maintaining high speed and accuracy in digital signal processing (DSP) applications. Mathematical analysis proves that DA can implement inner product of vectors in the form of two's complement numbers using only additions, followed by a small number of shifts at the final stage. Comparative studies show that NEDA outperforms widely used approaches such as multiply/accumulate (MAC) and DA in many aspects. Being a high-speed architecture free of ROM, multiplication, and subtraction, NEDA can also expose the redundancy existing in the adder array consisting of entries of 0 and 1. A hardware compression scheme is introduced to generate a butterfly structure with minimum number of additions. NEDA-based architectures for 8 /spl times/ 8 discrete cosine transform (DCT) core are presented as an example. Savings exceeding 88% are achieved, when the compression scheme is applied along with NEDA. Finite word-length simulations demonstrate the viability and excellent performance of NEDA.  相似文献   

7.
In this paper we present circuit techniques for CMOS low-power high-performance multiplier design. Novel full adder circuits were simulated and fabricated using 0.8-μm CMOS (in BiCMOS) technology. The complementary pass-transistor logic-transmission gate (CPL-TG) full adder implementation provided an energy savings of 50% compared to the conventional CMOS full adder. CPL implementation of the Booth encoder provided 30% power savings at 15% speed improvement compared to the static CMOS implementation. Although the circuits were optimized for (16×16)-b multiplier using the Booth algorithm, a (6×6)-b implementation was used as a test vehicle in order to reduce simulation time. For the (6×6)-b case, implementation based on CPL-TG resulted in 18% power savings and 30% speed improvement over conventional CMOS  相似文献   

8.
We present a new approach to the design of high-performance low-power linear filters. We use p-channel synapse transistors as analog memory cells, and mixed-signal circuits for fast low-power arithmetic. To demonstrate the effectiveness of our approach, we have built a 16-tap 7-b 200-MHz mixed-signal finite-impulse response (FIR) filter that consumes 3 mW at 3.3 V. The filter uses synapse pFETs to store the analog tap coefficients, electron tunneling and hot-electron injection to modify the coefficient values, digital registers for the delay line, and multiplying digital-to-analog converters to multiply the digital delay-line values with the analog tap coefficients. The measured maximum clock speed is 225 MHz; the measured tap-multiplier resolution is 7 b at 200 MHz. The total die area is 0.13 mm2. We can readily scale our design to longer delay lines  相似文献   

9.
A novel CMOS integrated circuit for a batteryless transponder system is presented. Batteryless transponders require contactless transmission of both the information and power between a mobile data carrier and a stationary or handheld reader unit. The operating principle of this system gives a superior performance in reading distance due to separation of the powering and data transmission phases-compared to systems with continuous powering and damping modulation. This paper describes the function of the transponder IC and the circuit design techniques used for the various building blocks  相似文献   

10.
Vertical integration offers numerous advantages over conventional structures. By stacking multiple-material layers to form double gate transistors and by stacking multiple device layers to form multidevice-layer integration, vertical integration can emerge as the technology of choice for low-power and high-performance integration. In this paper, we demonstrate that the vertical integration can achieve better circuit performance and power dissipation due to improved device characteristics and reduced interconnect complexity and delay. The structures of vertically integrated double gate (DG) silicon-on-insulator (SOI) devices and circuits, and corresponding multidevice-layer (3-D) SOI circuits are presented; a general double-gate SOI model is provided for the study of symmetric and asymmetric SOI CMOS circuits; circuit speed, power dissipation of double-gate dynamic threshold (DGDT) SOI circuits are investigated and compared to single gate (SG) SOI circuits; potential 3-D SOI circuits are laid out. Chip area, layout complexity, process cost, and impact on circuit performance are studied. Results show that DGDT SOI CMOS circuits provide the best power-delay product, which makes them very attractive for low-voltage low-power applications. Multidevice-layer integration achieves performance improvement by shortening the interconnects. Results indicate that up to 40% of interconnect performance improvements can be expected for a 4-device-layer integration.  相似文献   

11.
A CMOS fingerprint sensor architecture with embedded cellular logic for image processing is presented. The system senses a fingerprint image with a capacitive technique and performs several image-processing algorithms, including thinning the ridges of the fingerprint structure and encoding it to its characteristic features. Image processing is achieved by application of hexagonal local operators implemented in pixel-parallel mixed neuron-MOS/CMOS logic circuits. The massive parallelism of the architecture leads to a very low power dissipation. Results of simulations and measurements on a demonstrator chip in 0.65-μm double-poly standard CMOS technology are shown. The approach is well suited for person-identification applications, especially in small and low-cost portable systems, such as smart cards  相似文献   

12.
In this paper, a wide tuning range, low power CMOS automatic gain control (AGC) with a simple architecture is proposed. The proposed AGC is composed of variable gain amplifier (VGA), comparator and charge pump, and the dB-linear gain is controlled by charge pump. The AGC was implemented in a 0.18um CMOS technology. The dynamic range of the VGA is more than 55dB, the bandwidth is 30MHz and the gain error lower than ±1.5dB over the full temperature and gain ranges. It is designed for GPS application and is fed from a single 1.8V power supply. The AGC power consumption is less than 5mW and area of the AGC is 700*450um2.  相似文献   

13.
A wide tuning range, low power CMOS automatic gain control (AGC) with a simple architecture is proposed. The proposed AGC is composed of a variable gain amplifier (VGA), a comparator and a charge pump, and the dB-linear gain is controlled by the charge pump. The AGC was implemented in a 0.18 μm CMOS technology. The dynamic range of the VGA is more than 55 dB, the bandwidth is 30 MHz, and the gain error is lower than ±1.5 dB over the full temperature and gain ranges. It is designed for GPS application and is fed from a single 1.8 V power supply.The AGC power consumption is less than 5 mW, and the area of the AGC is 700 × 450 μm~2.  相似文献   

14.
In recent years, power dissipation along with silicon area has become the key figure in chip design. The increasing demands on system performance require high-performance digital signal processing (DSP) systems to include dedicated number-crunching units as individually optimized building blocks. The various design methodologies in use stress one of the following figures: power dissipation, throughput, or silicon area. This paper presents a design methodology reducing any combination of cost drivers subject to a specified throughput. As a basic principle, the underlying optimization regards the existing interactions within the design space of a building block. Crucial in such optimization is the proper dimensioning of device sizes in contrast to the common use of minimal dimensions in low-power implementations. Taking the design space of an FIR filter as an example, the different steps of the design process are highlighted resulting in a low-power high-throughput filter implementation. It is part of an industrial read-write channel chip for hard disks with a worst case throughput of 1.6 GSamples/s at 23 mW in a 0.13-/spl mu/m CMOS technology. This filter requires less silicon area than other state-of-the-art filter implementations, and it disrupts the average trend of power dissipation by a factor of 6.  相似文献   

15.
Reported is a new complementary technique of full-swing BiCMOS circuit design which, though employs a p-n-p, allows the use of n-p-n-only drivers. The simulated results of this new circuit compare favorably among several representative BiCMOS circuits  相似文献   

16.
This paper presents a programmable digital finite-impulse response (FIR) filter for high-performance and low-power applications. The architecture is based on a computation sharing multiplier (CSHM) which specifically targets computation re-use in vector-scalar products and can be effectively used in the low-complexity programmable FIR filter design. Efficient circuit-level techniques, namely a new carry-select adder and conditional capture flip-flop (CCFF), are also used to further improve power and performance. A 10-tap programmable FIR filter was implemented and fabricated in CMOS 0.25-/spl mu/m technology based on the proposed architectural and circuit-level techniques. The chip's core contains approximately 130 K transistors and occupies 9.93 mm/sup 2/ area.  相似文献   

17.
This paper deals with the implementation of Full Adder chains by mixing different CMOS Full Adder topologies. The approach is based on cascading fast Transmission-Gate Full Adders interrupted by static gates having driving capability, such as inverters or Mirror Full Adders, thus exploiting the intrinsic low power consumption of such topologies. The obtained mixed-topology circuits are optimized in terms of delay by resorting to simple analytical models.Delay, power consumption and the Power-Delay Product (PDP) in both mixed-topology and traditional Full Adder chains were evaluated through post-layout Spectre simulations with a 0.35 μm, 0.18 μm and 90 nm CMOS technology considering different design targets, i.e., minimum power consumption, PDP, Energy-Delay Product (EDP) and delay. The results obtained show that the mixed-topology approach based on Mirror adders are capable of a very low power consumption (comparable to that of the low-power Transmission-Gate Full Adder) and a very high speed (comparable with or even greater than that of the very fast Dual-Rail Domino Full Adder). This also enables a high degree of design freedom, given that the same (mixed) topology can be used for a wide range of applications. This greater flexibility also affords a significant reduction in the design effort.  相似文献   

18.
A low-power, high-performance, 1024-point FFT processor   总被引:1,自引:0,他引:1  
This paper presents an energy-efficient, single-chip, 1024-point fast Fourier transform (FFT) processor. The 460000-transistor design has been fabricated in a standard 0.7 μm (Lpoly=0.6 μm) CMOS process and is fully functional on first-pass silicon. At a supply voltage of 1.1 V, it calculates a 1024-point complex FFT in 330 μs while consuming 9.5 mW, resulting in an adjusted energy efficiency more than 16 times greater than the previously most efficient known FFT processor. At 3.3 V, it operates at 173 MHz-which is a clock rate 2.6 times greater than the previously fastest rate  相似文献   

19.
In this paper, we present a noise-tolerant high-performance static circuit family suitable for low-voltage operation called skewed logic. Skewed logic circuits in comparison with Domino logic have better scalability and are more suitable for low voltage applications because of better noise margins. Skewed logic and its variations have been compared with Domino logic in terms of delay, power, and dynamic noise margin. A design methodology for skewed CMOS pipelined circuits has been developed. To demonstrate the applicability of the proposed logic style, 0.35 /spl mu/m 5.56 ns CMOS 16/spl times/16 bit multipliers have been designed using skewed logic circuits and fabricated through MOSIS. Measurement results show that the multiplier only consumed a power of 195 mW due to its low clock load.  相似文献   

20.
In this paper, we present a low-power high-performance digital predistorter (DPD) for the linearization of wideband RF power amplifiers (PAs). It is based on the novel FIR memory polynomial (FIR-MP) predistorter model, which significantly augments the performance of the conventional memory polynomial predistorter with the use of complex baseband digital FIR filter prior to the memory polynomial. The adjacent channel leakage ratio (ACLR) performance comparison between the conventional MP and the proposed FIR-MP is done based on simulations with multi-carrier modulated signals of 20 and 80 MHz bandwidths. The PA models used for the simulations are extracted from the measurements of a commercial \(1\,\hbox {W}\) GaAs HBT PA. At the ideal system-level simulations, the improvements in ACLR over the conventional MP are 7.2  and 15.6 dB, respectively, for 20 and 80 MHz signals. The choice of selection of various parameters of the predistorter along with the subsequent digital-to-analog converter (DAC) is presented. The impact of fixed-point representation is assessed using ACLR metrics, which shows that a wordlength of 14 bits is sufficient to obtain ACLR beyond \(45\,\hbox {dBc}\) with a margin of \(10\,\hbox {dB}\). The proposed predistorter is synthesized in \(28\,\hbox {nm}\) fully-depleted silicon-on-insulator (FDSOI) CMOS process. It is shown that with a fraction of the power and die area of that of the MP a huge improvement in ACLR is attained. With an overall power consumption of 8.2 and 88.8 mW, respectively, for 20 and 80 MHz signals, the FIR-MP DPD proves to be a suitable candidate for small-cell base station PA linearization.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号