
1.

A pipelined 50-MHz CMOS 64-bit floating-point arithmetic processor

Benschneider B.J. Bowhill W.J. Copper E.M. Gavrielov M.N. Gronowski P.E. Maheshwari V.K. Peng V. Pickholtz J.D. Samudrala S. 《Solid-State Circuits, IEEE Journal of》1989,24(5):1317-1323

A 135K transistor, uniformly pipelined 50-MHz CMOS 64-bit floating-point arithmetic processor chip is described. The execution unit is capable of sustaining pipelined performance of one 32-bit or 64-bit result every 20 ns for all operations except double-precision multiply (40 ns) and divide. The chip employs an exponent difference prediction scheme and a unified leading-one and sticky-bit computation logic for the addition and subtraction operations. A hardware multiplier using a radix-8 modified Booth algorithm and a divider using a radix-2 SRT algorithm are employed.
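
The abstract names radix-8 modified Booth recoding but gives no circuit detail. As a rough illustration of the arithmetic idea only, the sketch below recodes a two's-complement multiplier into digits in the range -4 to 4, one digit per three bits. The function names and the signed-integer framing are assumptions made for this example and are not taken from the paper.

```python
def booth_radix8_digits(y, bits):
    """Recode a signed integer y (representable in `bits` bits of two's
    complement) into radix-8 Booth digits, least significant first.
    Each digit lies in [-4, 4]. Hypothetical helper for illustration."""
    n = -(-bits // 3) * 3  # round the width up to a multiple of 3

    def bit(i):
        # Bit i of y; Python's >> sign-extends negative integers.
        return 0 if i < 0 else (y >> i) & 1

    digits = []
    for i in range(0, n, 3):
        # Overlapping 4-bit window: bits i+2, i+1, i and the bit below.
        digits.append(-4 * bit(i + 2) + 2 * bit(i + 1) + bit(i) + bit(i - 1))
    return digits


def booth_multiply(x, y, bits):
    """Multiply x by y by summing one partial product per Booth digit."""
    return sum(d * x * 8 ** k for k, d in enumerate(booth_radix8_digits(y, bits)))
```

Compared with bit-by-bit multiplication, this recoding cuts the number of partial products to roughly one third, at the cost of needing the multiple 3x, which cannot be formed by shifting alone.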

2.

A 15-b pipelined CMOS floating-point A/D converter

Thompson D.U. Wooley B.A. 《Solid-State Circuits, IEEE Journal of》2001,36(2):299-303

A floating-point approach can be used to extend the dynamic range of analog-to-digital (A/D) converters in applications where large signals need not be encoded with a precision greater than that required for small signals. Owing to the nonuniform nature of the quantization in a floating-point A/D converter (FADC), it is possible to sacrifice a large peak signal-to-noise ratio to obtain savings in power dissipation and area while achieving a large dynamic range. A 15-b switched-capacitor pipelined FADC has been designed with a 10-b mantissa and an exponent that provides an additional 5 bits of dynamic range. The increased dynamic range is obtained with a three-stage pipelined variable gain amplifier, while the mantissa is determined by a uniform 10-b pipelined A/D converter. An experimental prototype of the converter has been integrated in a 0.5 μm CMOS technology. It achieves a dynamic range of 90 dB at a conversion rate of 20 MSamples/s with a total power dissipation of 380 mW.
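
A behavioral sketch may help show how a 10-b mantissa plus up to 5 bits of gain ranging yields roughly 15 bits of dynamic range. The model below is an assumption for illustration: it uses ideal power-of-two gains, a full scale of ±1.0, and made-up function names. It does not model the paper's three-stage amplifier or its switched-capacitor circuits.

```python
import math

MANTISSA_BITS = 10   # uniform converter resolution (from the abstract)
MAX_EXP = 5          # extra bits of dynamic range (from the abstract)
LSB = 2.0 / 2 ** MANTISSA_BITS  # mantissa step for an assumed +/-1.0 full scale


def fadc_encode(x):
    """Return (exponent, mantissa code) for an input x in [-1.0, 1.0)."""
    e = 0
    # Raise the gain 2**e while the amplified signal still fits full scale.
    while e < MAX_EXP and abs(x) * 2 ** (e + 1) < 1.0:
        e += 1
    code = math.floor(x * 2 ** e / LSB)
    # Clamp to the signed 10-bit code range.
    code = max(-(2 ** (MANTISSA_BITS - 1)), min(2 ** (MANTISSA_BITS - 1) - 1, code))
    return e, code


def fadc_decode(e, code):
    """Reconstruct the input at the centre of the quantization cell."""
    return (code + 0.5) * LSB / 2 ** e


def dynamic_range_db():
    """Ratio of full scale to the smallest step, in decibels."""
    return 20 * math.log10(2 ** (MANTISSA_BITS + MAX_EXP))
```

Under these assumptions the smallest step is 2^-5 of the mantissa LSB, so the ratio of full scale to the smallest step is 2^15, or about 90.3 dB, which is consistent with the 90 dB figure quoted in the abstract.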

3.

A floating-point cell library and a 100-Mflops image signal processor

Fujii H. Hori C. Takada T. Hatanaka N. Demura T. Ootomo G. 《Solid-State Circuits, IEEE Journal of》1992,27(7):1080-1088


4.

The microarchitecture of the synergistic processor for a cell processor (cited by: 1; self-citations: 0; citations by others: 1)

Flachs B. Asano S. Dhong S.H. Hofstee H.P. Gervais G. Roy Kim Le T. Peichun Liu Leenstra J. Liberty J. Michael B. Hwa-Joon Oh Mueller S.M. Takahashi O. Hatakeyama A. Watanabe Y. Yano N. Brokenshire D.A. Peyravian M. Vandung To Iwata E. 《Solid-State Circuits, IEEE Journal of》2006,41(1):63-70

This paper describes an 11 FO4 streaming data processor in the IBM 90-nm SOI-low-k process. The dual-issue, four-way SIMD processor emphasizes achievable performance per area and power. Software controls most aspects of data movement and instruction flow to improve memory system performance and core performance density. The design minimizes instruction latency while providing for fine grain clock control to reduce power.

5.

A digital processor for full calibration of pipelined ADCs

Mohammad Fardad Javad Frounchi Ghader Karimian 《Analog Integrated Circuits and Signal Processing》2012,70(3):347-356

In this paper, a digital processor is presented for full calibration of pipelined ADCs. The main idea is to find an inverse model of the ADC errors using a small number of measured codes. This approach does not change the internal parts of the ADC, and most known errors are compensated simultaneously by digital post-processing of the output bits. Several function approximation algorithms are tested and their performance is evaluated. To verify the algorithms, a 12-bit pipelined ADC based on a 1.5-bit-per-stage architecture is simulated in Simulink with 1%-2% non-ideal factors, a 20 MHz sinusoidal input, and a 100 MS/s sampling frequency. The selected algorithm has been implemented on a Virtex-4 LX25 FPGA from Xilinx. The designed processor improves the SNDR from 45 to 69 dB and increases the SFDR from 45.5 to 90 dB. The calibration processor also improves the integral nonlinearity of the ADC.

6.

Design of a floating-point butterfly unit with a compact structure

于龙洋 段文伟 李署坚 《电讯技术》2011,51(9):73-77

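This paper presents the design of a compact and efficient floating-point butterfly unit in which the utilization of the internal modules approaches 100%. The design uses a serial, fully pipelined structure that consumes 75% fewer hardware resources than a parallel structure. Using the decimation-in-time (DIT) fast Fourier transform (FFT) algorithm, a 1,024-point floating-point FFT processor built on this butterfly unit was implemented in VHDL. Simulation results in Quartus II confirm the correctness of the design, which has been successfully applied in the signal-processing section of an audio signal analyzer.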
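
For readers unfamiliar with the butterfly operation, the following is a minimal software sketch of a radix-2 decimation-in-time FFT built from a single butterfly function. It is a textbook recursive formulation in Python, chosen for clarity. It is not the paper's serial pipelined VHDL design, and the function names are assumptions.

```python
import cmath


def butterfly(a, b, w):
    """Radix-2 DIT butterfly: one complex multiply, one add, one subtract."""
    t = w * b
    return a + t, a - t


def fft_dit(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft_dit(x[0::2])  # transform of even-indexed samples
    odd = fft_dit(x[1::2])   # transform of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(-2j * cmath.pi * k / n)  # twiddle factor W_n^k
        out[k], out[k + n // 2] = butterfly(even[k], odd[k], w)
    return out
```

A 1,024-point transform performs 512 butterflies in each of its 10 stages, which is why a hardware design can reuse one well-utilized butterfly unit across the whole computation.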

7.

Power optimization for the datapath of a 32-bit reconfigurable pipelined DSP processor (cited by: 1; self-citations: 0; citations by others: 1)

Han Liang Chen Jie Chen Xiaodong 《电子科学学刊(英文版)》2005,22(6):650-657

As circuit scale continues to grow, power consumption receives far more attention than before, especially in large designs. This paper describes experience in optimizing the power consumption of the 16-bit datapath in a 32-bit reconfigurable pipelined digital signal processor (DSP). Holding the previous input values and preventing useless switching of the logic blocks on the datapath lowers power consumption considerably. In addition, relocating some logic blocks between pipeline stages and adding data-forwarding logic yields a better-balanced pipeline, which lowers the power consumed by conditional computation instructions at very low timing and area cost. Experimental results demonstrate the effectiveness of these power optimization techniques. Finally, some ideas for reducing circuit power consumption are proposed; they are effective and useful in practical designs, especially pipelined ones.

8.

Mixed-signal design of a fully parallel fuzzy processor

Baturone I. Barriga A. Sanchez-Solano S. Huertas J.L. 《Electronics letters》1998,34(5):437-438

The authors present a novel architecture for implementing general-purpose fuzzy chips which allows fully parallel rule processing employing a reduced number of mixed-signal computing blocks and minimum-sized digital memories. The resulting fuzzy processor can interact directly with continuous sensors and actuators and the subsequent digital processing system.

9.

Design and implementation of a tributary payload processor for SDH (cited by: 1; self-citations: 1; citations by others: 0)

黄海生《光通信技术》2008,32(3):53-55

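An implementation scheme for a tributary payload processor is discussed. A special time-division multiplexing technique gives the circuit a small scale, low power consumption, and high reliability. Hardware experiments confirm that the circuit's performance fully meets the relevant ITU-T standards. This design approach offers clear advantages for system integration.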

10.

A reconfigurable 4-GS/s power-efficient floating-point FFT processor design and implementation based on single-sided binary-tree decomposition

《Integration, the VLSI Journal》2019

This paper presents a high-throughput, size-configurable floating-point (FP) fast Fourier transform (FFT) processor that implements an 8-parallel multi-path delay feedback (MDF) architecture suitable for real-time radar imaging systems. In floating-point FFT design, achieving high throughput within restricted area and power budgets is a greater challenge than in fixed-point designs because FP operations are more complex to realize. To address these issues, a novel mixed-radix FFT algorithm featuring a single-sided binary-tree decomposition strategy is proposed to contain the multiplication complexity of any 2^k-point FFT. To this end, the parallel-processing twiddle factor generator and the dual addition-and-rounding fused FP arithmetic units are optimized to meet the demand for high computational accuracy within a low power budget. The proposed FP FFT processor has been designed in SMIC's 28 nm CMOS technology with an active area of 1.39 mm². The prototype design delivers a throughput of 4 GSample/s at 500 MHz with a peak power consumption of 84.2 mW. The proposed design approach thus improves power efficiency by approximately 14 times on average over previously reported FP FFT processors.

11.

New approaches toward the fully digital integrated management of a burn unit

Reina-Tosina J Roa LM Cáceres J Gómez-Cía T 《IEEE transactions on bio-medical engineering》2002,49(12):1470-1476


12.

A background calibration in pipelined ADCs

Hamid R. Mafi Amir M. Sodagar 《AEUE-International Journal of Electronics and Communications》2013,67(8):729-732

In this paper, a novel background calibration scheme is presented. The proposed scheme continuously measures and digitally compensates conversion errors caused by residue amplifier nonlinearity. It can be used to relax the analog circuit requirements for a high-precision residue amplifier, thereby decreasing the power consumption and/or increasing the sampling rate of pipelined ADCs. The proposed scheme employs a fifth-order polynomial to eliminate conversion errors. One unique feature of the scheme is that a single pseudorandom sequence, pn, is exploited. The simulation results show that, using the proposed calibration technique, the signal-to-noise-and-distortion ratio (SNDR) is improved from 40 to 66 dB and the spurious-free dynamic range (SFDR) is increased from 48 to 80 dB.

13.

A design of a fast pipelined modular multiplier based on a diminished-radix algorithm

Glenn Orton Lloyd Peppard Stafford Tavares 《Journal of Cryptology》1993,6(4):183-208

We present a new serial-parallel concurrent modular-multiplication algorithm and architecture suitable for standard RSA encryption. In the new scheme, multiplication is performed modulo a multiple of the RSA modulus n, which has a diminished-radix form 2^k-v, where k and v are positive integers and v < n. This design is the first concurrent modular multiplier to use a diminished-radix algorithm and to pipeline concurrent modular reduction to optimize the clock rate. For a modular multiplier of order ranging from 1 to 10 (number of multiplier bits per clock cycle), a faster clock rate and throughput are possible than with other known designs, including those of Brickell, Morita, Sedlak and Golze, and Miyaguchi. Throughput estimates for 512-bit RSA decryption range from 100 kbit/s in a serial mode to 650 kbit/s with a modular multiplier of order 10, at a clock rate of 20 MHz on 1.5 μm CMOS.
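
The appeal of a diminished-radix modulus m = 2^k - v is that 2^k is congruent to v modulo m, so the high part of a product can be folded back with one small multiplication instead of a full division. The sketch below shows only that folding identity in Python. It does not reproduce the paper's serial-parallel pipelined architecture, and the function names are assumptions.

```python
def reduce_diminished_radix(x, k, v):
    """Reduce a non-negative integer x modulo m = 2**k - v.

    Uses 2**k == v (mod m): writing x = hi * 2**k + lo gives
    x == hi * v + lo (mod m). Requires 0 < v < 2**k.
    """
    m = (1 << k) - v
    mask = (1 << k) - 1
    while x >> k:                       # fold until x fits in k bits
        x = (x >> k) * v + (x & mask)
    while x >= m:                       # final correction step(s)
        x -= m
    return x


def modmul(a, b, k, v):
    """Modular multiplication a * b mod (2**k - v) via folding."""
    return reduce_diminished_radix(a * b, k, v)
```

Each fold strictly decreases x because v is smaller than 2^k, so the loop terminates; the smaller v is, the fewer folds are needed.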

14.

A new structure of substage in pipelined analog-to-digital converters

JIA Hua-yu CHEN Gui-can ZHANG Hong 《中国邮电高校学报(英文版)》2009,16(1):86-90

This article presents a new (1+1)-bit/stage structure for pipelined analog-to-digital converters (ADCs). When the analog input signal of the structure exceeds the conversion range of the whole ADC, the signal can still be converted precisely, and the output residue voltage of the structure remains within the conversion range of the ADC. The structure is used in a 12-bit 40 MS/s pipelined ADC to test its function. The test results show that the structure functions correctly and can correct the transition error induced by offsets in the comparators' decision levels. The ADC, implemented in a Semiconductor Manufacturing International Corporation (SMIC) 0.18 μm CMOS process, consumes 210 mW and occupies a chip area of 3.2×3.7 mm².

15.

A novel illuminator design in a rapid thermal processor

Min Hung Lee Chee Wee Liu 《Semiconductor Manufacturing, IEEE Transactions on》2001,14(2):152-156

The instantaneous insertion of an opaque shutter between the lamp arrays and the wafer in a rapid thermal processor can significantly increase the ramp-down rate from 90 to 400°C/s during the cooling period. This shutter can prevent the residual heating of the lamp filament as well as the self-heating from the reflector due to the mirror image of the wafer. To compensate for the weak irradiation intensity close to the edge of the linear lamps, a multiplane reflector design is used to increase the uniformity of irradiation intensity in the direction along the linear lamps. The distance between the reflector plane and the linear lamp is designed to be smaller at the edge of the linear lamp than at the center. Together with two oblique reflectors at the ends of the linear lamps, a typical three-plane reflector design can increase the uniformity by 60% in a typical lamp configuration.

16.

Design of a floating-point DSP hard-core architecture for FPGAs

赵赫 黄志洪 余乐 杨海钢 许仕龙 郝亚男 《太赫兹科学与电子信息学报》2019,17(3):524-530

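A floating-point digital signal processor (DSP) hard-core architecture is proposed that remains compatible with fixed-point arithmetic while providing good support for floating-point arithmetic. At present, the major field-programmable gate array (FPGA) vendors implement floating-point arithmetic as soft cores: the floating-point algorithms are mapped onto the chip and realized with logic resources and DSP blocks. In contrast to this conventional approach, the proposed hard core performs floating-point operations using DSP blocks alone, without occupying other logic resources in the FPGA. The design takes load and delay effects fully into account and inserts multiple pipeline stages, significantly improving floating-point computation efficiency. The proposed floating-point DSP hard core was designed and completed in a Semiconductor Manufacturing International Corporation (SMIC) 28 nm process. Simulation results show that a single floating-point addition or multiplication achieves 0.4 Gflops.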

17.

A photonic front-end processor in a WDM ATM multicast switch

Chao H.J. Wu L. Zhang Z. Yang S.H. Wang L.M. Chai Y. Fan J.Y. Choa F.S. 《Lightwave Technology, Journal of》2000,18(3):273-285

Dense wavelength-division multiplexing (DWDM) technology has provided tremendous transmission capacity in optical fiber communications. However, switching and routing capacity is still far behind transmission capacity. This is because most of today's packet switches and routers are implemented using electronic technologies. Optical packet switches are a potential candidate to boost switching capacity to a level comparable with transmission capacity. In this paper, we present a photonic asynchronous transfer mode (ATM) front-end processor that has been implemented and is to be used in an optically transparent WDM ATM multicast (3M) switch. We have successfully demonstrated the front-end processor in two different experiments. One performs cell delineation based on ITU standards and overwrites VCI/VPI optically at 2.5 Gb/s. The other performs cell synchronization, where cells from different input ports running at 2.5 Gb/s are phase-aligned in the optical domain before they are routed in the switch fabric. The alignment resolution reaches 100 ps (or 1/4 bit). An integrated 1×2 Y-junction semiconductor optical amplifier (SOA) switch has been developed to facilitate the cell synchronizer.

18.

A block floating-point treatment to the LMS algorithm: efficient realization and a roundoff error analysis

Mitra A. Chakraborty M. Sakai H. 《Signal Processing, IEEE Transactions on》2005,53(12):4536-4544

An efficient scheme is presented for implementing the LMS-based transversal adaptive filter in block floating-point (BFP) format, which permits processing of data over a wide dynamic range, at temporal and hardware complexities significantly lower than those of a floating-point processor. Appropriate BFP formats for both the data and the filter coefficients are adopted, taking care that they remain invariant to interblock transition and weight updating operation, respectively. Care is also taken to prevent overflow during both the filtering and the weight updating processes, by using a dynamic scaling of the data and a slightly reduced range for the step size, with the latter having only a marginal effect on convergence speed. Extensions of the proposed scheme to the sign-sign LMS and the signed regressor LMS algorithms are taken up next, in order to reduce the processing time further. Finally, a roundoff error analysis of the proposed scheme under finite precision is carried out. It is shown that in the steady state, the quantization noise component in the output mean-square error depends on the step size both linearly and inversely. An optimum step size that minimizes this error is also derived.
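
As background on the number format itself, the sketch below shows a generic block floating-point representation in Python: every sample in a block shares one exponent, and each sample keeps only a fixed-point mantissa. The 16-bit mantissa width, the rounding choice, and the function names are assumptions for illustration. The sketch does not implement the paper's LMS filter, its dynamic scaling rule, or its step-size constraint.

```python
import math


def bfp_encode(block, mantissa_bits=16):
    """Encode a block of floats as (shared exponent, list of integer mantissas)."""
    peak = max(abs(v) for v in block)
    if peak == 0:
        return 0, [0] * len(block)
    # frexp gives peak = f * 2**exp with 0.5 <= f < 1, so every
    # sample divided by 2**exp has magnitude below 1.
    exp = math.frexp(peak)[1]
    scale = 2 ** (mantissa_bits - 1)
    mantissas = []
    for v in block:
        q = round(v / 2.0 ** exp * scale)
        mantissas.append(max(-scale, min(scale - 1, q)))  # clamp to signed range
    return exp, mantissas


def bfp_decode(exp, mantissas, mantissa_bits=16):
    """Reconstruct floats from a shared exponent and integer mantissas."""
    scale = 2 ** (mantissa_bits - 1)
    return [m * 2.0 ** exp / scale for m in mantissas]
```

Because the exponent is stored once per block, arithmetic within a block reduces to fixed-point operations, which is the source of the complexity savings relative to full floating point. The trade-off is that small samples in a block with a large peak lose relative precision.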

19.

Prenormalization rounding in IEEE floating-point operations using a flagged prefix adder

Burgess N. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2005,13(2):266-277

This paper demonstrates how IEEE 754 floating-point standard compliant rounding can be merged with carry-propagate addition in floating-point unit (FPU) designs by using a novel adaptation of the prefix adder. The paper considers add/subtract, multiply, and SRT divide operations and demonstrates that in every case a generic rounding architecture based on a prefix adder with a small amount of additional logic is sufficient to cover all the rounding modes. Critical path analysis shows that the proposed architecture is compatible with contemporary pipelined FPU design practice, while using significantly less logic.

20.

A method of processor selection for interrupt handling in a multiprocessor system

《Proceedings of the IEEE. Institute of Electrical and Electronics Engineers》1966,54(12):1812-1819

A method of assigning external interrupts to processors in a multiprocessor system is described. Features of a multilevel priority interrupt system are incorporated into a hardware component called the Interrupt Directory. The directory selects the most appropriate processor for servicing the interruption at the time the event occurs. The "appropriateness" for interruption is based on the priority level of a processor's current task, thus providing dynamic priority allocation of tasks. Queueing of interrupts is also provided. The arrangement described in this paper simplifies and increases the effectiveness of executive control programs. Implications of the Interrupt Directory on reliability and "fail-soft" operation are also discussed.
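
The selection rule described above can be mimicked in software. The Python sketch below is one plausible reading, under stated assumptions: a larger number means a higher priority, the interrupt goes to the processor running the lowest-priority task, ties go to the lowest index, and an interrupt that outranks no running task is queued. The class and method names are hypothetical, and the paper's hardware may differ on every one of these points.

```python
from collections import deque


class InterruptDirectory:
    """Toy model of priority-based processor selection (illustrative only)."""

    def __init__(self, task_priorities):
        # Priority of the task currently running on each processor.
        self.task_priorities = list(task_priorities)
        self.pending = deque()  # interrupts that could not be dispatched

    def dispatch(self, interrupt_priority):
        """Return the index of the processor chosen to service the interrupt,
        or None if the interrupt was queued instead."""
        target = min(range(len(self.task_priorities)),
                     key=lambda i: self.task_priorities[i])
        if interrupt_priority > self.task_priorities[target]:
            # The chosen processor now runs the interrupt handler.
            self.task_priorities[target] = interrupt_priority
            return target
        self.pending.append(interrupt_priority)
        return None
```

With task priorities [5, 2, 7], an interrupt of priority 4 would go to processor 1 in this model, while a following interrupt of priority 3 would be queued because every processor is then running more urgent work.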