Similar literature
Found 20 similar documents (search time: 860 ms)
1.
A 50-ns digital image signal processor (DISP), an image/video application-specific VLSI chip, is discussed. This chip integrates 538 K transistors and dissipates 1.4 W at a 40-MHz clock. It is based on a 24-b fixed-point architecture with a five-stage pipeline. The DISP features a real-time processing capability realized by an enhanced parallel architecture, video-oriented data processing functions, and an instruction cycle time that is typically 35 ns, and 50 ns at worst. This 50-ns cycle time allows the DISP to execute more than 60 million operations per second (MOPS). High-density 1.0-μm CMOS technology allows numerous on-chip features, including specified resources optimized for image processing. This allows a flexible hardware implementation of various algorithms for picture coding. Several circuit design techniques that are intended to attain a fast instruction cycle are reviewed, including distributed instruction decoding and a hierarchical clocking circuit. The LSI has been designed by the extensive use of a cell-based design method. The processor incorporates a sophisticated testing function compatible with a cell-based design environment.

2.
An MPEG2 MP@ML video encoder large-scale integrated circuit (LSI) has been developed, including an 81 MOPS controller and motion estimator. By using two adaptive algorithms, a wide motion-estimation search area (±288 pixels horizontal and ±96 pixels vertical) was achieved with a computational complexity of only 0.5% (20 GOPS) of that of the full-search block-matching algorithm. This expanded motion-estimation search area significantly improves picture quality when coding fast-motion sequences. The power consumption was reduced by using an efficient pipeline architecture and optimizing the circuitry, especially in the motion-estimation block and the data transfers for the external SDRAM. The 13.7×12.4 mm², 4.5-M transistor device using 0.4-μm CMOS technology dissipates 1.2 W at 3.3 V.

3.
A charge-coupled device (CCD)-based image processor that performs 2D filtering of a gray-level image with 20 programmable 8-b 7×7 spatial filters is described. The processor consists of an analog input buffer, 49 multipliers, and 49 8-b 20-stage local memories in a 29-mm² chip area. Better than 99.999% charge transfer efficiency and greater than 42-dB dynamic range have been achieved by the processor, which performs one billion arithmetic operations per second and dissipates less than 1 W when clocked at 10 MHz. The device is also suited for neural networks with local connections and replicated weights. Implementation of a specific neural network, the neocognitron, based on this CCD processor has been simulated. The effect of weight quantization imposed by use of this CCD device on the performance of the neocognitron is presented.

4.
Low-power and low-voltage embedded microcontrollers are increasingly required for portable applications. Power reduction can be addressed at the software level as well as at the architecture level by reducing the number of instructions executed for a given task. An 8-b RISC-like pipelined microcontroller family is presented that achieves one clock per instruction. It is compared to various architectures of existing 8-b microcontrollers. According to an efficiency model taking into account the architecture as well as the number of registers, the presented 8-b microcontroller cores provide four to ten times better performance than existing microcontrollers. On one hand, the operating frequency can be reduced to execute a given task in the same execution time. On the other hand, delivering 10 MIPS performance, more than 2000 MIPS/W can be achieved at 3 V.

5.
A low-noise multibit sigma-delta analog-to-digital converter (ADC) architecture suitable for operation at low oversampling ratios is presented. The ADC architecture uses an efficient high-resolution pipelined quantizer while avoiding loop stability degradation caused by pipeline latency. A 16-b implementation of the architecture, fabricated in a 0.6-μm CMOS process, cascades a second-order 5-b sigma-delta modulator with a four-stage 12-b pipelined ADC and operates at a low 8× oversampling ratio. Static and dynamic linearity of the integrated ADC are improved through the use of dynamic element matching techniques and the use of bootstrapped and clock-boosted input switches. The ADC operates at a 20 MHz clock rate and dissipates 550 mW with a 5 V/3 V analog/digital supply. It achieves an SNR of 89 dB over a 1.25-MHz signal bandwidth and a total harmonic distortion (THD) of -98 dB with a 100-kHz input signal.

6.
This paper examines the design of a 32-b GaAs Fast RISC microprocessor (F-RISC/I). F-RISC/I is a single-chip GaAs heterojunction MESFET (HMESFET) processor targeted for implementation on a multichip module (MCM) together with cache memories. The CPU architecture, circuit design, implementation, and testing are optimized for a seven-stage instruction pipeline implemented with GaAs super-buffered FET logic (SBFL). We have been able to verify novel GaAs SBFL standard cells and compare measured CPU performance with performance estimates based on circuit and device models. The prototype 32-b microprocessor has been implemented using an automated standard cell approach because of time constraints and fabricated using an experimental process by Rockwell International. The CPU chip integrates 92340 transistors on a 7×7 mm² die and dissipates 6.13 W at 180 MHz. Test results from a prototype fabrication run have demonstrated the operation of the ALU, the program counter, and the register file with delays below 6, 5, and 3.4 ns, respectively. The successful modeling and verification indicate that a 0.5 μm HMESFET implementation of F-RISC/I could achieve a peak performance of 350 MHz. The wiring delays account for 42% of the critical path delay.

7.
A VLSI circuit has been developed that combines dual-ported RAMs and three high-speed 8-b digital-to-analog converters (DACs). It is known as a palette/DAC. A 6-2 segmented DAC architecture improves differential linearity and monotonicity. The current-source cell uses a cascode device to improve the DAC's linearity. A reference current, set by an on-chip bandgap reference voltage generator, and its associated distribution scheme eliminate the negative effects of threshold mismatches between current source cells, supply line resistance, and noise. The maximum conversion rate is 70 MHz with typical DC differential nonlinearity of 0.48 LSB (least significant bit). The 253-mil² circuit is designed on a double-metal CMOS process and consumes 1.2 W of power.

8.
Fully differential current-mode circuit techniques are developed for the design of a pipelined current-mode analog-to-digital converter (IADC) in standard CMOS digital processes. In the proposed IADC, the 1-b-per-stage architecture based on the reference nonrestoring algorithm is adopted. Thus large component ratios can be avoided and the linearity errors caused by device mismatches can be minimized. As one of the key subcircuits in the IADC, an offset-canceled high-speed differential current comparator (CCMP) is proposed and analyzed. In the CCMP, the subtractions of offsets are performed in the current domain without floating capacitors. Moreover, the other key subcircuit, the current sample-and-hold amplifier (CSHA), is also developed to realize the pipeline architecture. An experimental chip for the proposed IADC has been fabricated in 0.8-μm n-well CMOS technology. Using a single 5-V power supply, the fabricated IADC can be operated at a 4.5-Ms/s conversion rate with a signal-to-noise-and-distortion ratio (SNDR) of 51 dB (effective 8.2-b) for the input signal at 453 kHz. For 8-b resolution, the fabricated IADC can be operated at a 4.5-Ms/s conversion rate with both differential nonlinearity (DNL) and integral nonlinearity (INL) below ±0.6 LSB. The power consumption and the active chip area are 16 mW/b and 0.73 mm²/b, respectively.
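
The pairing of a 51 dB SNDR with an effective resolution of 8.2 b in the abstract above can be checked with the usual rule of thumb for an ideal quantizer driven by a full-scale sine wave, ENOB = (SNDR - 1.76) / 6.02. The sketch below is only an illustration of that textbook relation; the function name is hypothetical and the abstract does not state which formula its authors used.

```python
def enob_from_sndr(sndr_db: float) -> float:
    """Effective number of bits from SNDR in dB.

    Assumes the textbook ideal-quantizer model with a full-scale
    sinusoidal input: SNDR = 6.02 * N + 1.76 dB.
    """
    return (sndr_db - 1.76) / 6.02


# SNDR value quoted in the abstract.
enob = enob_from_sndr(51.0)
print(f"ENOB for 51 dB SNDR: {enob:.1f} b")
```

Under this assumption the result rounds to the 8.2 b quoted in the abstract.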

9.
A family of 8- and 10-b analog/digital converters (ADCs) has been designed using a more efficient architecture. The 10-b ADC requires two 4-b (two 3-b for the 8-b converter) half-flash cycles and a self-corrected voltage estimator. While the speed is similar to that of conventional half-flash ADCs, power consumption and die size are lower due to reduced numbers of comparators and resistors. The flash steps can be reduced by 1 b each, for an overall reduction in comparator count by a factor of 2. This architecture can be used to reduce the comparator and resistor count of any existing half-flash ADCs, ultimately decreasing die area and power consumption. For the same process and resolution, this architecture reduces die size and power consumption by 50%.
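
The "factor of 2" comparator saving in the abstract above follows from how flash converters scale. As a rough sketch, assume an n-bit flash step needs 2^n - 1 comparators (the standard count, not stated in the abstract) and ignore the overhead of the voltage estimator. The function names are hypothetical.

```python
def flash_comparators(bits: int) -> int:
    """Comparators in one n-bit flash step, assuming the standard 2**n - 1."""
    if bits < 1:
        raise ValueError("bits must be at least 1")
    return 2 ** bits - 1


def reduction_factor(conventional_bits: int, reduced_bits: int) -> float:
    """Ratio of comparator counts between two flash-step widths."""
    return flash_comparators(conventional_bits) / flash_comparators(reduced_bits)


# 10-b converter: conventional 5-b steps versus the 4-b steps in the abstract.
print(flash_comparators(5), flash_comparators(4), reduction_factor(5, 4))
# 8-b converter: conventional 4-b steps versus the 3-b steps in the abstract.
print(flash_comparators(4), flash_comparators(3), reduction_factor(4, 3))
```

Under these assumptions, dropping one bit per flash step roughly halves the comparator count (31 versus 15, and 15 versus 7), which is consistent with the abstract's claim.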

10.
An 8-kb (128-word×64-b) CMOS associative memory with word- and bit-parallel operation is described. The highly parallel and pipelined architecture is optimized for high-speed associative operations. The data processing capability is one word/cycle, corresponding to 16 MIPS at a typical cycle time of 60 ns. The memory is fault tolerant under software control. A faulty word location in the memory can be made inaccessible by on-chip circuitry. The device is a complete single-chip associative memory with internally controlled addressing and associative data as output.

11.
A parallel-pipelined A/D converter with an area- and power-efficient architecture is described. By sharing amplifiers along the pipeline and also completely eliminating the amplifier from the last stage, an 8-b pipeline is realized using just three amplifiers (instead of seven amplifiers with a conventional pipeline architecture). By using two such pipelines in parallel, a 52 Msamples/s prototype A/D converter that is intended for a switched digital video application has been implemented in a 0.9-μm CMOS technology. The device occupies 15 mm² and dissipates 250 mW from a 5 V supply.

12.
Fast and small squarers are needed in many applications such as image compression. A new family of high-performance parallel squarers based on the divide-and-conquer method is reported. Our main result was realized for the base cases of the divide-and-conquer recursion by using optimized n-bit primitive squarers, where n is in the range of two to six. This method reduced the gate count and provided shorter critical paths. A chip implementing an 8-b squarer was designed, fabricated, and successfully tested, resulting in 24 million operations per second (MOPS) using a 2-μm CMOS fabrication technology. This squarer had two additional features: an increased number of squaring operations per unit circuit area and the potential for reduced power consumption per squaring operation.
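
A divide-and-conquer squarer can be illustrated in software with the identity (h·2^k + l)² = h²·2^(2k) + h·l·2^(k+1) + l², recursing on the two halves until a small base case is reached. The sketch below is only a behavioral model under assumptions: the split point, the default base-case width of 4 bits, and the use of `x * x` as a stand-in for the optimized primitive squarers are all hypothetical and are not taken from the paper.

```python
def square_dc(x: int, n: int, base_bits: int = 4) -> int:
    """Square an n-bit unsigned integer by divide and conquer.

    base_bits is an assumed width at which a primitive squarer takes over;
    the abstract only says such primitives cover two to six bits.
    """
    if not 0 <= x < (1 << n):
        raise ValueError("x does not fit in n bits")
    if n <= base_bits:
        # Stand-in for an optimized primitive squarer (hypothetical).
        return x * x
    k = n // 2                       # width of the low half
    high = x >> k                    # upper n - k bits
    low = x & ((1 << k) - 1)         # lower k bits
    high_sq = square_dc(high, n - k, base_bits)
    low_sq = square_dc(low, k, base_bits)
    cross = high * low               # cross term, doubled by the extra shift
    return (high_sq << (2 * k)) + (cross << (k + 1)) + low_sq


# Exhaustive check for the 8-b case mentioned in the abstract.
assert all(square_dc(v, 8) == v * v for v in range(256))
```

The cross term needs only one smaller multiplication because h·l appears twice, which is one reason a dedicated squarer can be cheaper than a general multiplier.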

13.
In this paper, we present: 1) design of a single-rail energy-efficient 64-b Han-Carlson ALU, operating at 482 ps in 1.5 V, 0.18-μm bulk CMOS; 2) direct port of this ALU to 0.18-μm partially depleted SOI process; 3) SOI-optimal redesign of the ALU using a novel deep-stack quaternary-tree architecture; 4) margining for max-delay pushout due to reverse body bias in SOI designs; and 5) performance scaling trends of the ALU designs in 0.13-μm generation. We show that a direct port of the Han-Carlson ALU to 0.18-μm SOI offers 14% performance improvement after margining. A redesign of the ALU, using an SOI-favored deep-stack architecture improves the margined speedup to 19%. A 10% margin was required for the SOI designs, to account for reverse body-bias-induced max-delay pushout. Preconditioning the intermediate stack nodes in the dynamic ALU designs reduced this margin to 2%. Scaling the ALUs to 0.13-μm generation reduces the overall SOI speedup for both architectures to 9% and 16%, respectively, confirming the trend that speedup offered by SOI technology decreases with scaling.

14.
The authors present a monolithic 20-b analog-to-digital converter (ADC) based on an oversampling feedback architecture. The converter consists of a time-continuous integrator at the input, a pulsewidth modulator in the forward branch of the loop (corresponding to a 10-b ADC), and a 1-b DAC (digital-to-analog converter) to generate the feedback voltage. The digital evaluation is carried out with a uniformly weighted rectangular window filter. The circuit is implemented in a standard 2-μm CMOS n-well process and requires 14 mm² of silicon, including the pads. Measurement results are presented that demonstrate the feasibility of this architecture for 20-b accuracy. The complete circuit has a power consumption of 6.7 mW.

15.
The token-ring controller (TRC) consists of five functional blocks: a dedicated 16-b microprocessor, which includes an 11 K-word×20-b protocol firmware ROM; finite-state machines for real-time handling of frames; an 896-word×16-b dual-port RAM for frame buffer FIFOs and working memory (FIFO/RAM); a host processor bus interface; and a three-channel DMA controller which can follow list-structure frame buffers. The TRC interprets and executes 16 types of commands and handles 23 types of media access control (MAC) frames. It can continuously receive more than 90% of incoming packets with 64-byte information length at 40 Mbit/s network speed. It is fabricated with double-metal-layer 1.2-μm CMOS technology and integrates 510 K MOSFETs in a 14.49-mm×14.62-mm chip area. The maximum power consumption is 0.945 W at 8-MHz operating frequency and 5-V±5% power supply low-power systems but also for high-performance applications

16.
A 28 mW/MHz at 80 MHz structured-custom RISC microprocessor design is described. This 32-b implementation of the PowerPC architecture is fabricated in a 3.3 V, 0.5 μm, 4-level metal CMOS technology, resulting in 1.6 million transistors in a 7.4 mm by 11.5 mm chip size. Dual 8-kilobyte instruction and data caches coupled to a high-performance 32/64-b system bus and separate execution units (float, integer, load/store, and system units) result in peak instruction rates of three instructions per clock cycle. Low-power design techniques are used throughout the entire design, including dynamically powered-down execution units. Typical power dissipation is kept under 2.2 W at 80 MHz. Three distinct levels of software-programmable, static, low-power operation for system power management are offered, resulting in standby power dissipation from 2 mW to 350 mW. CPU to bus clock ratios of 1×, 2×, 3×, and 4× are implemented to allow control of system power while maintaining processor performance. As a result, workstation-level performance is packed into a low-power, low-cost design ideal for notebooks and desktop computers.

17.
A CMOS folding and interpolating A/D conversion architecture fully compatible with standard digital CMOS technology is described. Fully differential, continuous-time, current-mode, open-loop analog circuitry is used to achieve high speed. Results from 125 Ms/s 8-b and 150 Ms/s 6-b prototypes implemented in a digital 1 μm n-well CMOS process are presented. The 8-b (6-b) converter occupies 4 mm² (2 mm²) and dissipates 225 mW (55 mW) from a single 5 V power supply.

18.
The computation of the product of two digital numbers by discrete convolution in a surface acoustic wave (SAW) convolver is described. The principal limitations of the method are discussed, as well as realistic performance numbers that can be achieved by using available SAW device technology. The multiplication is done by convolving the 2-b streams that represent two digital input operands in a SAW convolver at a rate of 100 MHz. The convolver output is the product in mixed binary representation. It consists of one analog value per digit with a resolution of 0.5%. These values are digitized at a rate of 200 MHz by an 8-b flash analog-to-digital converter and added up to form the digital result. It is shown that such a device has a very high computing power that can be adapted to special applications. One extreme is the multiplication of 256-b integer numbers, corresponding to a dynamic range of 1:10^77, at a rate of 0.2 MIPS. The other extreme is the computation of matrix products where computing speeds of 12000 MIPS can be achieved for 8-b operands, as required, for example, in image filtering applications.
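
Multiplication by discrete convolution, as described in the abstract above, can be sketched numerically: convolving the bit sequences of the two operands gives one value per digit position, and each value may exceed 1 because no carries have been propagated yet (a plausible reading of "mixed binary representation"). Weighting those digits by powers of two and summing yields the product. This is a software illustration only; the function names are hypothetical and nothing here models the SAW device or the ADC.

```python
def to_bits(value: int, width: int) -> list[int]:
    """Bits of an unsigned integer, least significant bit first."""
    if not 0 <= value < (1 << width):
        raise ValueError("value does not fit in width bits")
    return [(value >> i) & 1 for i in range(width)]


def convolve(a: list[int], b: list[int]) -> list[int]:
    """Plain discrete convolution of two finite sequences."""
    out = [0] * (len(a) + len(b) - 1)
    for i, a_i in enumerate(a):
        for j, b_j in enumerate(b):
            out[i + j] += a_i * b_j
    return out


def multiply_by_convolution(x: int, y: int, width: int) -> int:
    """Multiply two width-bit integers via convolution of their bit streams."""
    digits = convolve(to_bits(x, width), to_bits(y, width))
    # Each digit is a count between 0 and width; carries are resolved
    # here by the weighted sum rather than digit by digit.
    return sum(d << k for k, d in enumerate(digits))


digits = convolve(to_bits(255, 8), to_bits(255, 8))
assert max(digits) == 8                       # largest digit equals the operand width
assert multiply_by_convolution(200, 123, 8) == 200 * 123
```

The largest possible digit equals the operand width, so wider operands require finer resolution from whatever digitizes the convolver output.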

19.
The modulator IC is a mixed analog/digital transceiver component in a chip set that is designed for the hand-held terminals of the pan-European 900-MHz Groupe Special Mobile (GSM) digital cellular radio network. The concept of the radio-frequency environment in which the circuit is used is explained, focusing on the differences in existing systems. The architecture and different functions of the modulator circuit and details of the digital and analog processing in the transmission mode are discussed. The receiving mode, which is mostly based on analog processing, is highlighted. The device generates Gaussian minimum-shift-keying (GMSK) modulation and converts the received signal to 8-b words after filtering. The modulator IC uses digital waveform generation and a quadrature signal representation. This device is implemented in a 1.5-μm CMOS technology. The power consumption is less than 35 mW from a 5-V supply.

20.
A family of user-programmable peripherals, utilizing an integration strategy based on a programmable system device (PSD) concept, is described. Specifically, PSD is an efficient and highly configurable integration of high-density memory and LSI level logic blocks. The configurability is derived by providing programmable logic and programmable interconnect. PSDX is the first PSD family of programmable microcontroller peripherals; it integrates 256 kb to 1 Mb of EPROM, 16 kb of SRAM, a 28-input by 42-product term programmable logic device (PLD), and flexible I/O ports. This family is primarily targeted for embedded microcontroller applications. Using one PSD device it is possible to replace all the core peripherals in the system and, as a result, achieve a reduction in components, power dissipation, and overall system cost. The flexible architecture is achieved by providing 46 configuration options, which allows the PSD to interface with virtually any 8- or 16-b microcontroller. The integration is made possible by developing a special configurability and testability scheme. These parts are realized on a 1.2-μm CMOS EPROM process.

