
1.

Design and analysis of high-speed 8-bit ALU using 18 nm FinFET technology

Shylashree N. Venkatesh B. Saurab T. M. Srinivasan Tarun Nath Vijay 《Microsystem Technologies》2019,25(6):2349-2359

Every modern computing device contains an ALU. As software grows more complex and shifts steadily toward parallelism, processors benefit from high-speed hardware support for time-consuming operations such as multiplication. Small, compact devices such as IoT devices need to run software such as security software and be able to offload computation cost from the cloud. In this paper, a high-speed 8-bit ALU using 18 nm FinFET technology is proposed. The arithmetic and logic unit consists of fast compute units such as a Kogge-Stone fast adder and a Dadda multiplier, along with basic logic gates. Each compute unit is optimized for speed while keeping area consumption in check. The Dadda multiplier uses an 8 × 8 architecture rather than the conventional 4 × 4 approach, making this a true 8-bit ALU. Simulation and analysis are done using Cadence Virtuoso in the Analog Design Environment. The transistor count of the proposed design is 5298, the power consumption is 219 µW, and the maximum delay is 166.8 ps. The design is also expected to take at most one clock cycle for any computation.
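The abstract names a Kogge-Stone adder as one of the ALU's compute units. As a rough illustration of why that adder is fast, the sketch below models the parallel-prefix carry computation at the bit level in Python. It is a behavioural model only, not the paper's transistor-level FinFET design; the function name `kogge_stone_add` and its parameters are hypothetical.

```python
def kogge_stone_add(a: int, b: int, width: int = 8) -> tuple[int, int]:
    """Bit-level model of a Kogge-Stone parallel-prefix adder (carry-in = 0).

    Returns (sum modulo 2**width, carry_out).
    """
    mask = (1 << width) - 1
    a &= mask
    b &= mask

    g = a & b          # generate: bit i produces a carry by itself
    p = a ^ b          # propagate: bit i passes an incoming carry along
    half_sum = p       # kept for the final sum stage

    # log2(width) prefix stages; each stage doubles the span that every
    # (generate, propagate) pair covers.
    dist = 1
    while dist < width:
        new_g = g | (p & (g << dist))
        new_p = p & (p << dist)
        g, p = new_g & mask, new_p & mask
        dist <<= 1

    # After the prefix stages, bit i of g is the carry out of bit i.
    carries_in = (g << 1) & mask
    total = half_sum ^ carries_in
    carry_out = (g >> (width - 1)) & 1
    return total, carry_out
```

For an 8-bit operand the loop runs three stages (distances 1, 2, 4), which is the logarithmic carry depth that makes this structure attractive when speed matters more than area.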


2.

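Design of a 32-bit floating-point multiplier based on an improved 4-2 compressor structure

邵磊 李昆 张树丹 于宗光 徐睿 《微计算机信息》2007,23(9)

This paper presents the design of a 32-bit floating-point multiplier for high-performance DSPs. A tree of 4-2 compressors with modified Booth encoding raises speed and lowers power consumption. The multiplier has a regular structure that suits VLSI implementation and completes one 24-bit integer multiplication or one 32-bit floating-point multiplication in a single cycle. The whole design is described at the structural level in Verilog HDL and synthesized with a 0.25 µm cell library; one multiplication takes 24.30 ns.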
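The multiplier above is built from 4-2 compressors. As a minimal sketch, the textbook form of that cell can be modelled as two chained full adders. This is the conventional structure, not the improved variant the paper proposes, and the names `full_adder` and `compressor_4_2` are hypothetical.

```python
def full_adder(a: int, b: int, c: int) -> tuple[int, int]:
    """One-bit full adder: returns (sum, carry)."""
    s = a ^ b ^ c
    carry = (a & b) | (b & c) | (a & c)
    return s, carry


def compressor_4_2(x1: int, x2: int, x3: int, x4: int, cin: int) -> tuple[int, int, int]:
    """Textbook 4-2 compressor made of two full adders.

    Returns (sum, carry, cout) such that
    x1 + x2 + x3 + x4 + cin == sum + 2 * (carry + cout).
    cout depends only on x1..x3, so it never ripples through a row of cells.
    """
    s1, cout = full_adder(x1, x2, x3)
    s, carry = full_adder(s1, x4, cin)
    return s, carry, cout
```

Each cell reduces four partial-product bits of equal weight to two, so a tree of such cells halves the number of partial-product rows at every level.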

3.

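Design of a 32-bit floating-point multiplier based on an improved 4-2 compressor structure

邵磊 李昆 张树丹 于宗光 徐睿 《微计算机信息》2007,23(3X):224-225,199

This paper presents the design of a 32-bit floating-point multiplier for high-performance DSPs. A tree of 4-2 compressors with modified Booth encoding raises speed and lowers power consumption. The multiplier has a regular structure that suits VLSI implementation and completes one 24-bit integer multiplication or one 32-bit floating-point multiplication in a single cycle. The whole design is described at the structural level in Verilog HDL and synthesized with a 0.25 µm cell library; one multiplication takes 24.30 ns.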

4.

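Research on key techniques in asynchronous processor design

高玲 祝翔 李鸥 《微计算机信息》2006,83(8):224-226

Asynchronous processors solve the clock skew problem of conventional synchronous processors and offer advantages such as low power consumption and high parallelism. This paper analyzes the key techniques and implementation methods for designing asynchronous processors, compares current implementation approaches, and identifies research directions and priorities. It also looks ahead to applications of asynchronous processor technology in media processing. Although asynchronous processors are not yet widely used in practice, they have high research value.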

5.

A real-time vision system using an integrated memory array processor prototype

Yoshihiro Fujita Nobuyuki Yamashita Shin'ichiro Okazaki 《Machine Vision and Applications》1994,7(4):220-228

This paper describes a real-time vision system (RVS) architecture and performance and its use of an integrated memory array processor (IMAP) prototype. This prototype integrates eight 8-bit processors and a 144-kbit SRAM on a single chip. The RVS was developed with 64 IMAP prototypes connected in series in a 512-processor system configuration. A host workstation can access the memory on the IMAP prototypes directly through a random access port. Images are input and output at high speed through serial access ports. The RVS performance is shown in real-time road-image processing and in a neural network simulation, as well as in low-level image processing algorithms such as filtering, histograms, discrete cosine transform (DCT), and rotation. The RVS image processing is shown to be much faster than the video rate.

6.

Interactive 1-bit feedback segmentation using transductive inference

Ding-Jie Chen Hwann-Tzong Chen Long-Wen Chang 《Machine Vision and Applications》2018,29(4):617-631

This paper presents an effective algorithm, interactive 1-bit feedback segmentation using transductive inference (FSTI), that interactively reasons out image segmentation. In each round of interaction, FSTI queries the user about one superpixel to acquire 1-bit user feedback that defines the label of that superpixel. The labeled superpixels collected so far are used to refine the segmentation and generate the next query. The key insight is to treat interactive segmentation as a transductive inference problem and then suppress unnecessary queries via an intrinsic graph structure derived from transductive inference. Experiments conducted on five publicly available datasets show that selecting query superpixels with respect to the intrinsic graph structure helps improve segmentation accuracy. In addition, an efficient boundary refinement is presented to improve segmentation quality by revising the misaligned boundaries of superpixels. It is evident that the proposed FSTI algorithm provides a superior solution to the interactive image segmentation problem.

7.

A 4-GS/s 8-bit two-channel time-interleaved folding and interpolating ADC

JIANG Fan WU DanYu ZHOU Lei WU Jin JIN Zhi LIU XinYu 《Science China Information Sciences》2014,57(1):1-6

Ultra-high-speed, moderate-resolution ADCs with low latency are demanded in many applications. A 4-GS/s 8-bit ADC is implemented in 0.35 μm SiGe BiCMOS technology. It is based on a two-channel time-interleaved architecture, and each sub-ADC employs a two-stage cascaded folding and interpolating topology, which guarantees the low-latency property. Calibration circuits are introduced to compensate for the mismatch between the two sub-ADCs. The whole chip area is about 4.0 × 4.0 mm². The ADC exhibits DNL of 0.26/0.34 LSB and INL of 0.96/0.92 LSB. The ENOB is 7.1 bits and the SFDR is about 56 dB at 10.1 MHz input. The SNDR is above 42 dB over the first and second Nyquist zones. The SFDR is above 45 dB over the first and second Nyquist zones. The ERBW is about 1.4 GHz.
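The ENOB and SNDR figures quoted above are linked by the standard relation ENOB = (SNDR − 1.76 dB) / 6.02 dB for a full-scale sine input. The helper functions below are a small illustrative sketch of that conversion; their names are hypothetical and they are not taken from the paper.

```python
def enob_from_sndr(sndr_db: float) -> float:
    """Effective number of bits for a full-scale sine input, from SNDR in dB."""
    return (sndr_db - 1.76) / 6.02


def sndr_from_enob(enob_bits: float) -> float:
    """Inverse relation: SNDR in dB that corresponds to a given ENOB."""
    return 6.02 * enob_bits + 1.76
```

Under this relation, the reported 7.1-bit ENOB corresponds to an SNDR of roughly 44.5 dB, and the 42 dB floor quoted across both Nyquist zones corresponds to roughly 6.7 effective bits.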

8.

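Data transmission and interconnection technology for synthetic aperture radar imaging processors

张卫杰 黄寅 汤俊 彭应宁 《微计算机信息》2006,22(17):153-155

Data transmission and interconnection is one of the key technologies in the design of real-time imaging processing systems for synthetic aperture radar (SAR). This paper groups current data transmission and interconnection technologies into four classes (network-based, bus-based, crossbar-switch-based, and dedicated), analyzes their performance, and discusses future directions for interconnection technology. On this basis, and in light of SAR signal processing requirements, a general-purpose serial packet-switched data transmission technique built on a data frame structure is proposed. Its physical and link layers are designed, and the technique is implemented and tested on a field-programmable gate array (FPGA). Data transmission performance metrics are analyzed for different SAR system topologies; the results show that the technique supports high-speed data transmission and inter-module interconnection in SAR systems.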

9.

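ISA modeling and assisted software optimization techniques for VLIW processors

严迎建 叶建森 刘军伟 徐劲松 《计算机工程与设计》2009,30(11)

In the course of designing a VLIW-based application-specific processor for block ciphers, instruction set architecture modeling techniques for VLIW processors were studied. An instruction-accurate instruction set simulator was designed; with an added module that tracks pipeline hazards and stalls, it provides cycle-accurate statistics on program execution and pipeline stalls. Combining the instruction set simulator, an assembler, and a debugger, an assisted program optimization environment for VLIW processors was built. The simulator and debugger are used to evaluate a program's instruction-level parallelism and resource usage, helping developers optimize VLIW programs and thereby reaching the final goal of exploiting the processor's instruction-level parallelism through hardware/software co-development.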

10.

Designing an asynchronous processor using Petri nets

Semenov A. Koelmans A.M. Lloyd L. Yakovlev A. 《Micro, IEEE》1997,17(2):54-64

Using a simple example, we demonstrate how to design and analyze asynchronous systems from labeled Petri net specifications, later refining, transforming, and translating them for implementation.

11.

The function processor: A data-driven processor array for irregular computations

Jesper Vasell Jonas Vasell 《Future Generation Computer Systems》1992,8(4):321-335


12.

LBM based flow simulation using GPU computing processor

Frédéric Kuznik Christian Obrecht Gilles Rusaouen Jean-Jacques Roux 《Computers & Mathematics with Applications》2010,59(7):2380-2392

Graphics Processing Units (GPUs), originally developed for computer games, now provide computational power for scientific applications. In this paper, we develop a general-purpose Lattice Boltzmann code that runs entirely on a single GPU. The results show that: (1) single-precision floating-point arithmetic is sufficient for LBM computation in comparison to double precision; (2) the implementation of LBM on GPUs achieves up to about one billion lattice updates per second using single-precision floating point; (3) GPUs provide an inexpensive alternative to large clusters for fluid dynamics prediction.

13.

High-performance, low-cost analog data processing and control using a 16-bit single-board microcomputer

M. Huang M. Driels 《Journal of Field Robotics》1984,1(2):205-219

This article describes the implementation of a microcomputer-based control system that combines a low-cost 16-bit single-board microcomputer with a general-purpose data acquisition board. This combination is intended to form a suitable basis for a control system for the type of complex sensor systems currently associated with modern industrial robots. The technique of interfacing a 16-bit microcomputer to a peripheral environment that has essentially an 8-bit architecture is presented. The result is a high-performance, low-cost analog data processing and control system. A novel timing device with both software- and hardware-configurable features has been developed so that an accurate interval timer for use with sampled-data systems is also available. To demonstrate the potential use of the system, an illustrative example of real-time adaptive process control is described.

14.

A pattern matching macro processor

Masataka Sassa 《Software》1979,9(6):439-456

A general-purpose pattern matching macro processor is described. Macro patterns can be defined using regular expressions. Macro calls are treated by balancing pattern matching at the token unit level, allowing options, alternatives and repetition. Thus, text in a language with a nested structure can be dealt with. In a macro body, Algol-style macro-time operations are allowed, which improves writing and reading. Our macro processor can also be used as a tool for language conversion since it incorporates a feature to declare language-dependent constructs such as comments, string notations and parentheses pairs. Although our macro processor is not biased towards any particular language, it has successfully converted an Algol 68-style text into a Fortran text. Problems of language conversion using macros are briefly discussed based upon the experience obtained through this macro processor.

15.

A general-purpose fuzzy inference processor

Eichfeld H. Klimke M. Menke M. Nolles J. Kunemund T. 《Micro, IEEE》1995,15(3):12-17

The SAE 81C99 processor offers 4 different operation modes, 8 programmable fuzzy algorithms, and up to 256 inputs, 64 outputs, and 16,384 rules. The 1.0-μm CMOS chip serves as a stand-alone device or as an on-chip module for 8- or 16-bit microcontrollers. At a 20-MHz crystal frequency and a maximum inference speed of 10 million rules/s, it supports very complex systems and millisecond (and faster) processes such as automotive electronics and pattern recognition.

16.

A general purpose, expandable processor for real-time computer graphics

Jeffrey F. Eastman David R. Wooten 《Computers & Graphics》1975,1(1):73-77

A class of critical computer requirements for real-time scan T.V. computer graphics is examined in relation to commercially available CPU architectures. Finding general purpose processors not suited, a new processor is proposed which is designed around the concept of ‘instruction set partitioning.’ In this design, special hardware-implemented algorithms may be included in the machine instruction set, and these processors allowed to operate asynchronously from each other. The design is projected to generate a complete new frame of a color T.V. picture every 0.1-0.8 s depending on image complexity. Due to its inherent generality, the CPU may be similarly expanded to encompass a wide variety of other specialized, or real-time tasks with minimal additional hardware. The 32-bit parallel processor has a design cycle time of 100 ns and is in the price class of a minicomputer.

17.

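IP reuse techniques based on the Nios II embedded soft-core processor

邓环环 尹智勇 彭建朝 《微计算机信息》2007,23(8):6-7

This paper describes Altera's recently released Nios II embedded soft-core processor and the development process for systems built on it, and presents an application example based on the Nios II embedded processor, drawn from the design of a vehicle travel data recorder. It also gives a detailed introduction to IP reuse methods in modern digital system design.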

18.

Design of microprogrammed floating processor using superslice Am2903

Gunneswara Rao 《Microprocessors and Microsystems》1983,7(4):155-162

This article discusses the basic principles of designing a floating-point processor built from the bit-slice Am2903, which can be integrated into a 16-bit central processor as a subsystem. Floating-point processor microprogramming and the microflows of an instruction are also discussed.

19.

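Reprogramming programmable logic devices using an embedded processor

吴进 《微计算机信息》2008,24(5):59-61

"In-system reprogrammability" refers to the ability to reprogram a programmable logic device after it has been soldered onto a printed circuit board. This paper shows how an embedded processor can use the Jam language to reprogram programmable logic devices that have this capability, covering software and hardware considerations as well as memory usage. The method solves the problem that, during prototyping and manufacturing, practical constraints can make it impossible to reprogram a programmable logic device through a download cable.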

20.

A programmable analog neural network processor

Fisher W.A. Fujimoto R.J. Smithson R.C. 《Neural Networks, IEEE Transactions on》1991,2(2):222-229

An analog neural network breadboard consisting of 256 neurons and 2048 programmable synaptic weights of 5 bits each is constructed and tested. The heart of the processor is an array of custom-programmable synapse (resistor) chips on a reconfigurable neuron board. The analog bandwidth of the system is 90 kHz. The breadboard is used to demonstrate the application of neural network learning to the problem of real-time adaptive mirror control. The processor controls 21 actuators of an adaptive mirror with a step-response setting time of 5 ms. The demonstration verified that it is possible to modify the control law of the high-speed analog loop using neural network training without stopping the control loop.