Similar literature
 Found 20 similar documents (search time: 0 ms)
1.
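This article introduces the power-management principles used in multimedia processors, explains how these techniques can be applied to reduce power consumption, and discusses which external power-management devices and power ICs ensure that the processor's power-saving features are fully exploited.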

2.
This paper presents the results of a study of alternative adder architectures, a full-swing Bipolar Double Pass-Transistor adder, a new full-swing BiNMOS adder, a reduced-swing Bipolar Double Pass-Transistor adder and a reduced-swing Double Pass-Transistor BiNMOS adder, that outperform a standard CMOS adder up to three times in power-efficiency at supply voltages 1.5–3 V. The Bipolar Double Pass-Transistor adder is more power-efficient than a standard CMOS adder even at a fanout of 1. All remaining proposed adders have a lower crossover capacitance with a standard CMOS adder than the previously reported low-voltage adders. Circuits were designed and fabricated in 0.8 μm BiCMOS technology.

3.
In terms of speed, the Wallace-tree compressor (i.e. bit-level carry-save addition array) is widely recognised as one of the most effective schemes for implementing arithmetic computations in VLSI design. However, the scheme has been applied only in a rather restrictive way, i.e. for implementing fast multipliers and for generating fixed structures without considering the characteristics of the input signals. The authors address the problem of optimising arithmetic circuits to overcome those limitations. A polynomial-time algorithm is presented which generates a delay-optimal carry-save addition structure for an arithmetic circuit with uneven signal arrival profiles. This algorithm has been applied to the optimisation of high-speed digital filters, and savings of 5-30% have been achieved in the overall filter implementation in comparison to the standard carry-save implementation.
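As a rough illustration of the carry-save idea behind a Wallace-tree compressor, the following Python sketch reduces several non-negative integers with word-level 3:2 compression and performs a single carry-propagating addition at the end. It is not the delay-optimal algorithm from the abstract: signal arrival times are not modelled, and the function names `carry_save_step` and `carry_save_sum` are hypothetical.

```python
def carry_save_step(a, b, c):
    """3:2 compression: turn three operands into a sum word and a carry word."""
    partial_sum = a ^ b ^ c                      # bitwise sum, carries ignored
    carry = ((a & b) | (a & c) | (b & c)) << 1   # majority bits, shifted one place left
    return partial_sum, carry


def carry_save_sum(operands):
    """Add non-negative integers using carry-save steps and one final addition."""
    pending = list(operands)
    while len(pending) > 2:
        a, b, c = pending[:3]
        # Results go to the back of the queue, so operands are reduced level by level.
        pending = pending[3:] + list(carry_save_step(a, b, c))
    # Only this last addition propagates carries across the full word.
    return sum(pending)
```

Each step preserves the invariant a + b + c = partial_sum + carry, so no carry has to ripple across the word until the final addition.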

4.
In this paper, we present a novel scheme for performing fixed-point arithmetic efficiently on fine-grain, massively parallel, programmable architectures including both custom and FPGA-based systems. We achieve an O(n) speedup, where n is the operand precision, over the bit-serial methods of existing fine-grain systems such as the DAP, the MPP and the CM2, within the constraints of regular, near neighbor communication and only a small amount of on-chip memory. This is possible by means of digit pipelined algorithms which avoid broadcast and which operate in a fully systolic manner by pipelining at the digit level. A base 4, signed-digit, fully redundant number system and on-line techniques are used to limit carry propagation and minimize communication costs. Although our algorithms are digit-serial, we are able to match the performance of the bit-parallel methods, while retaining low communication complexity. Reconfigurable hardware systems built using field programmable gate arrays (FPGAs) can share in the speed benefits of these algorithms. By using the organization of logic blocks suggested in this paper, problems of placement and routing that exist in such systems can be avoided. Since the algorithms are amenable to pipelining, very high throughput can be obtained.
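To show how a redundant base-4 signed-digit system limits carry propagation, here is a minimal Python sketch of the textbook carry-free addition rule for digits in the range -3..3. It assumes the maximally redundant digit set and is not taken from the paper itself; the names `sd4_add` and `sd4_value` are hypothetical.

```python
def sd4_add(x, y):
    """Add two radix-4 signed-digit numbers (digits in -3..3, least significant first)."""
    n = max(len(x), len(y))
    x = list(x) + [0] * (n - len(x))
    y = list(y) + [0] * (n - len(y))
    result = []
    transfer_in = 0
    for xd, yd in zip(x, y):
        z = xd + yd                                   # position sum, in -6..6
        transfer_out = 1 if z >= 3 else (-1 if z <= -3 else 0)
        interim = z - 4 * transfer_out                # always in -2..2
        # interim + transfer_in stays in -3..3, so no second transfer is generated:
        # each result digit depends only on its own position and the one below it.
        result.append(interim + transfer_in)
        transfer_in = transfer_out
    result.append(transfer_in)
    return result


def sd4_value(digits):
    """Integer value of a radix-4 signed-digit list (least significant first)."""
    return sum(d * 4 ** i for i, d in enumerate(digits))
```

The loop runs sequentially here only for convenience; because a transfer never travels more than one position, all digit positions could be computed in parallel or pipelined digit by digit.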

5.
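《现代电子技术》 (Modern Electronics Technique), 2016, (10): 92-95
To ensure stable and reliable operation, embedded devices commonly use dual-redundant CPUs that jointly manage hardware resources, coordinate task scheduling, and process data from the CPUs' high-speed peripheral interfaces. This paper therefore describes how to generate a high-precision, high-stability clock synchronization signal on an FPGA equipped with an efficient digital clock manager. The signal ensures precisely synchronized communication between the CPUs, enabling efficient management of shared resources, sound task scheduling, and cross-checked data computation.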

6.
7.
Shin, K.; Kim, T. 《Electronics Letters》 2004, 40(7): 415-417
A new approach to the synthesis of arithmetic circuits to minimise leakage power consumption under circuit timing constraint is presented. This is believed to be the first work that addresses the minimisation of leakage power consumption in RTL synthesis of arithmetic circuits. The leakage optimisation is based on the use of dual-threshold voltage (V_t) technology. The proposed approach is performed in two phases: (i) a timing-driven synthesis and placement technique is applied to an arithmetic expression using FA/HA cells with high V_t (i.e. slower but with lower leakage power than low-V_t cells) to produce a synthesis and placement result with least leakage power consumption; (ii) a technique is applied that minimally replaces the high-V_t FA/HA cells in the result of (i) with low-V_t FA/HA cells (i.e. faster but with more leakage power than high-V_t cells) to meet the timing constraint of the circuit. Experiments using a set of benchmark designs have shown the approach is quite effective, producing designs with on average 34.6% less leakage power than the conventional method without increasing circuit delay.

8.
Hong, S. 《Electronics Letters》 2007, 43(19): 1017-1018
A DRAM architecture capable of providing dual-port interface is presented. The architecture utilises a novel global bitline scheme to obtain a very wide data bandwidth not possible using traditional DRAM architectures. Furthermore, the area penalty is minimised by using a conventional one-transistor one-capacitor cell coupled with special sensing units that have 84.6% more transistor count. The architecture allows simultaneous read and write access using a conventional two-metal DRAM fabrication process.

9.
This article introduces two new configurations for precision current sources and current mirrors. Both circuits use n-p-n bipolar devices, but one provides a current source while the other provides a current sink. This permits the use of current sources in IC designs implemented with processes that do not allow p-n-p devices. Analog building blocks such as voltage-to-current converters, active loads, and sink and source current mirrors can be constructed from the new circuits using only n-p-n devices. Furthermore, because these circuits achieve the required precision without the use of high-gain amplifiers, the bandwidths of the circuits are large compared to conventional configurations. The source currents are dependent on a single reference voltage and exhibit good temperature sensitivity.

10.
This paper deals with the implementation of Full Adder chains by mixing different CMOS Full Adder topologies. The approach is based on cascading fast Transmission-Gate Full Adders interrupted by static gates having driving capability, such as inverters or Mirror Full Adders, thus exploiting the intrinsic low power consumption of such topologies. The obtained mixed-topology circuits are optimized in terms of delay by resorting to simple analytical models. Delay, power consumption and the Power-Delay Product (PDP) in both mixed-topology and traditional Full Adder chains were evaluated through post-layout Spectre simulations in 0.35 μm, 0.18 μm and 90 nm CMOS technologies, considering different design targets, i.e., minimum power consumption, PDP, Energy-Delay Product (EDP) and delay. The results obtained show that the mixed-topology approach based on Mirror adders is capable of a very low power consumption (comparable to that of the low-power Transmission-Gate Full Adder) and a very high speed (comparable with or even greater than that of the very fast Dual-Rail Domino Full Adder). This also enables a high degree of design freedom, given that the same (mixed) topology can be used for a wide range of applications. This greater flexibility also affords a significant reduction in the design effort.

11.
12.
13.
《Microelectronics Journal》2007,38(4-5):482-488
This paper presents the design of high performance and low power arithmetic circuits using a new CMOS dynamic logic family, and analyzes its sensitivity against technology parameters for practical applications. The proposed dynamic logic family allows for a partial evaluation in a computational block before its input signals are valid, and quickly performs a final evaluation as soon as the inputs arrive. The proposed dynamic logic family is well suited to arithmetic circuits where the critical path is made of a large cascade of inverting gates. Furthermore, circuits based on the proposed concept perform better at high fanout and high switching frequencies due to both lower delay and lower dynamic power consumption. Experimental results for practical circuits demonstrate that the low-power feature of the proposed dynamic logic provides smaller propagation delay (3.5 times), lower energy consumption (55%), and a similar combined delay, power consumption and active area product (only 8% higher), while exhibiting lower sensitivity to power supply, temperature, capacitive load and process variations than the dynamic domino CMOS technologies.

14.
The split Schur algorithms of P. Delsarte and Y. Genin (1987) represent methods of computing reflection coefficients that are computationally more efficient, in terms of multiplications, than the conventional Schur algorithm by a constant factor. The authors investigate the use of fixed-point binary arithmetic, with quantization due to rounding, in the implementation of the symmetric and antisymmetric split Schur algorithms. It is shown, through a combination of analysis and simulation, that the errors in the reflection coefficient estimates due to quantization are large when the input signal is either a narrowband high-pass signal or a narrowband low-pass signal.

15.
VLSI-oriented multiple-valued current-mode MOS arithmetic circuits using radix-2 signed-digit number representations are proposed. A prototype adder chip is implemented with 10-μm CMOS technology to confirm the principle of operation. A multiplication scheme using four-input current-mode wired summations for realizing a high-speed small-size multiplier is presented. The 32×32-b multiplier is composed of 18800 transistors and requires fewer interconnections. The multiply time is estimated to be 45 ns by SPICE simulation in 2-μm CMOS technology. It is shown that the technology is also potentially effective for the reduction of the data-bus area in VLSI.

16.
H.264/AVC is the latest video coding standard adopting variable block size motion estimation (VBS-ME), quarter-pixel accuracy, motion vector prediction and multi-reference frames for motion estimation. These new features result in much higher computation requirements than previous coding standards. In this paper we propose a novel most significant bit (MSB) first bit-serial architecture for full-search block matching VBS-ME, and compare it with systolic implementations. Since the nature of MSB-first processing enables early termination of the sum of absolute difference (SAD) calculation, the average hardware performance can be enhanced. Five different designs, one and two dimensional systolic and tree implementations along with bit-serial, are compared in terms of performance, pixel memory bandwidth, occupied area and power consumption.
Philip H. W. Leong (Corresponding author)
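The early-termination property of MSB-first processing can be sketched in Python as follows. This is a simplified software model, not the bit-serial hardware described in the abstract: the absolute differences are computed up front and only their accumulation proceeds one bit plane at a time, and the function name `sad_msb_first` and its parameters are hypothetical.

```python
def sad_msb_first(block, candidate, best_so_far, bits=8):
    """Accumulate the SAD from the most significant bit plane downwards.

    Returns the SAD if it is smaller than best_so_far, otherwise None
    as soon as the partial sum shows the candidate cannot win.
    """
    diffs = [abs(a - b) for a, b in zip(block, candidate)]
    partial = 0
    for k in range(bits - 1, -1, -1):
        # Add the contribution of bit plane k of every absolute difference.
        partial += sum((d >> k) & 1 for d in diffs) << k
        # The partial sum is a lower bound on the final SAD, so stopping here is safe.
        if partial >= best_so_far:
            return None
    return partial
```

Because the remaining lower bit planes can only add to the sum, a candidate block whose partial SAD already reaches the current best match can be rejected without processing its remaining bits.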

17.
Recently introduced linearly independent arithmetic (LIA) transforms and their corresponding spectral coefficients are used to detect faults in digital circuits. The results show that for many classes of logical functions, the LIA logic transformations are advantageous in terms of the number of their coefficients that have to be checked to identify the faults when compared to the case of the well-known arithmetic transform.

18.
Precision monolithic circuit functions can be fabricated if proper circuit and processing techniques are incorporated. Compatible thin-film-on-silicon technology is coupled with annealing of individual resistors with a focused laser beam to provide a means for fabricating monolithic circuits in the 0.1 to 1.0 percent accuracy range. It is shown that special circuit techniques, which accommodate uniquenesses of available monolithic devices, must be used to obtain optimum circuit performance. The design of an instrumentation amplifier that incorporates emitter feedback and direct compensation of dc errors is described, together with test results that validate the techniques discussed.

19.
In this paper, an organization of an FFT processor is described that uses memory segmentation and interstage shuffling to reduce memory read/write time by 50 percent. Two schemes are presented, one using only one arithmetic unit and the other using two arithmetic units.

20.
This article describes some of our recent work in the development of computer architectures for efficient execution of artificial neural network algorithms. Our earlier system, the Ring Array Processor (RAP), was a multiprocessor based on commercial DSPs with a low-latency ring interconnection scheme. We have used the RAP to simulate variable precision arithmetic to guide us in the design of arithmetic units for high performance neurocomputers to be implemented with custom VLSI. The RAP system played a critical role in this study, enabling us to experiment with much larger networks than would otherwise be possible. Our study shows that back-propagation training algorithms only require moderate precision. Specifically, 16b weight values and 8b output values are sufficient to achieve training and classification results comparable to 32b floating point. Although these results were gathered for frame classification in continuous speech, we expect that they will extend to many other connectionist calculations. We have used these results as part of the design of a programmable single chip microprocessor, SPERT. The reduced precision arithmetic permits the use of multiple arithmetic units per processor. Also, reduced precision operands make more efficient use of valuable processor-memory bandwidth. For our moderate-precision fixed-point arithmetic applications, SPERT represents more than an order of magnitude reduction in cost over systems with equivalent performance that use commercial DSP chips.
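The kind of reduced-precision simulation mentioned above can be approximated with a small fixed-point quantizer. The Python sketch below is an assumption-laden illustration: the abstract gives only the word lengths (16b weights, 8b outputs), so the split between integer and fraction bits, the saturation behaviour, and the name `quantize` are all hypothetical.

```python
def quantize(value, total_bits, frac_bits):
    """Round value to a signed fixed-point grid and saturate at the word limits."""
    scale = 1 << frac_bits
    lowest = -(1 << (total_bits - 1))          # most negative integer code
    highest = (1 << (total_bits - 1)) - 1      # most positive integer code
    code = max(lowest, min(highest, round(value * scale)))
    return code / scale


# Hypothetical formats: 16-bit weights with 12 fraction bits,
# 8-bit outputs with 7 fraction bits.
weight = quantize(0.3, total_bits=16, frac_bits=12)
output = quantize(0.3, total_bits=8, frac_bits=7)
```

Passing weights and unit outputs through such a function during training is one simple way to estimate how much precision a network needs before committing to a hardware word length.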
