1.
All modern computational devices contain an arithmetic logic unit (ALU). As software grows more complex and shifts steadily towards parallelism, high-speed processors benefit from hardware support for time-consuming operations such as multiplication. Small, compact devices such as IoT devices need to run software such as security software and to offload computation cost from the cloud. In this paper, a high-speed 8-bit ALU using 18 nm FinFET technology is proposed. The ALU consists of fast compute units, such as a Kogge-Stone adder and a Dadda multiplier, along with basic logic gates. Each compute unit is optimized for speed while keeping area consumption in check. The Dadda multiplier uses an 8 × 8 architecture rather than the conventional 4 × 4 approach, making this a true 8-bit ALU. Simulation and analysis are done using Cadence Virtuoso in the Analog Design Environment. The transistor count of the proposed design is 5298, the power consumption is 219 µW, and the maximum delay is 166.8 ps. The design is also expected to take at most one clock cycle for any computation.
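The Kogge-Stone adder named in this abstract is a parallel-prefix adder: it computes every carry in about log2(width) combining steps instead of rippling bit by bit. The following is a minimal software sketch of that idea, not the paper's circuit; the function name `kogge_stone_add` and the default `width` of 8 are assumptions chosen for illustration.

```python
def kogge_stone_add(a: int, b: int, width: int = 8) -> tuple[int, int]:
    """Add two unsigned `width`-bit integers; return (sum, carry_out).

    Hypothetical helper for illustration; models the prefix network
    with bitwise operations on whole words.
    """
    mask = (1 << width) - 1
    a &= mask
    b &= mask

    g = a & b        # generate: bit i creates a carry by itself
    p = a ^ b        # propagate: bit i passes an incoming carry on
    half_sum = p     # original propagate bits, needed for the final sum

    dist = 1
    while dist < width:
        # Merge each position with the group ending `dist` bits below it.
        # The right-hand sides use the old g and p values.
        g, p = (g | (p & (g << dist))) & mask, (p & (p << dist)) & mask
        dist *= 2

    # After the prefix steps, bit i of g says whether bits 0..i produce a carry,
    # so the carry into bit i is bit i-1 of g.
    carries = (g << 1) & mask
    total = half_sum ^ carries
    carry_out = (g >> (width - 1)) & 1
    return total, carry_out
```

For an 8-bit operand the loop runs three times (distances 1, 2, and 4), which is the logarithmic depth that makes this adder attractive when delay matters more than wiring area.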
2.
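This paper presents the design of a 32-bit floating-point multiplier for high-performance DSPs. A tree of 4-2 compressors with modified Booth encoding increases speed and reduces power consumption. The multiplier has a regular structure suited to VLSI implementation, and completes one 24-bit integer multiplication or one 32-bit floating-point multiplication in a single cycle. The whole design is described at the structural level in Verilog HDL and synthesized with a 0.25 µm cell library. One multiplication takes 24.30 ns.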
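Modified (radix-4) Booth encoding, as mentioned in this abstract, recodes the multiplier into digits in {-2, -1, 0, 1, 2}, which halves the number of partial products. The sketch below illustrates only the recoding step and sums the partial products with ordinary addition; the paper's 4-2 compressor tree is not modelled. The names `booth_radix4_digits` and `booth_multiply` are hypothetical.

```python
def booth_radix4_digits(multiplier: int, width: int) -> list[int]:
    """Recode a signed `width`-bit multiplier into radix-4 Booth digits.

    Digit j is b[2j-1] + b[2j] - 2*b[2j+1], with b[-1] = 0.
    Python's >> sign-extends negative integers, so two's-complement
    bits are read directly from the int.
    """
    digits = []
    prev = 0  # the implicit bit b[-1]
    for i in range(0, width, 2):
        b0 = (multiplier >> i) & 1
        b1 = (multiplier >> (i + 1)) & 1
        digits.append(prev + b0 - 2 * b1)
        prev = b1
    return digits


def booth_multiply(multiplicand: int, multiplier: int, width: int = 24) -> int:
    """Multiply using one shifted partial product per Booth digit."""
    partial_products = [
        (digit * multiplicand) << (2 * j)  # digit j has weight 4**j
        for j, digit in enumerate(booth_radix4_digits(multiplier, width))
    ]
    # A hardware design would reduce these in a compressor tree;
    # plain summation stands in for that here.
    return sum(partial_products)
```

With `width=24`, the recoding yields 12 partial products rather than 24, which is what makes the reduction tree shallower.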
4.
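Asynchronous processors solve the clock-skew problem of conventional synchronous processors and offer advantages such as low power consumption and high parallelism. This paper analyzes the key techniques and implementation methods for designing asynchronous processors, compares current implementation approaches, and identifies research directions and priorities. It also looks ahead to applications of asynchronous processor technology in media processing. Although asynchronous processors are not yet widely used in practice, they have high research value.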
5.
This paper describes the architecture and performance of a real-time vision system (RVS) and its use of an integrated memory array processor (IMAP) prototype. This prototype integrates eight 8-bit processors and a 144-kbit SRAM on a single chip. The RVS was developed with 64 IMAP prototypes connected in series in a 512-processor system configuration. A host workstation can access the memory on the IMAP prototypes directly through a random access port. Images are input and output at high speed through serial access ports. The RVS performance is shown in real-time road-image processing and in a neural network simulation, as well as in low-level image processing algorithms, such as filtering, histograms, discrete cosine transform (DCT), and rotation. The RVS image processing is shown to be much faster than the video rate.
6.
This paper presents an effective algorithm, interactive 1-bit feedback segmentation using transductive inference (FSTI), that interactively reasons out image segmentation. In each round of interaction, FSTI queries the user about one superpixel to acquire 1-bit feedback that defines the label of that superpixel. The labeled superpixels collected so far are used to refine the segmentation and generate the next query. The key insight is to treat interactive segmentation as a transductive inference problem, and then to suppress unnecessary queries via an intrinsic graph structure derived from transductive inference. Experiments conducted on five publicly available datasets show that selecting query superpixels according to the intrinsic graph structure helps to improve segmentation accuracy. In addition, an efficient boundary refinement is presented to improve segmentation quality by revising the misaligned boundaries of superpixels. It is evident that the proposed FSTI algorithm provides a superior solution to the interactive image segmentation problem.
7.
Ultra-high-speed, moderate-resolution ADCs with low latency are demanded in many applications. A 4-GS/s 8-bit ADC is implemented in 0.35 μm SiGe BiCMOS technology. It is based on a two-channel time-interleaved architecture, and each sub-ADC employs a two-stage cascaded folding and interpolating topology, which guarantees the low-latency property. Calibration circuits are introduced to compensate for the mismatch between the two sub-ADCs. The whole chip area is about 4.0 × 4.0 mm². The ADC exhibits DNL of 0.26/0.34 LSB and INL of 0.96/0.92 LSB. The ENOB is 7.1 bits and the SFDR is about 56 dB at 10.1 MHz input. The SNDR is above 42 dB over the first and second Nyquist zones. The SFDR is above 45 dB over the first and second Nyquist zones. The ERBW is about 1.4 GHz.
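The ENOB and SNDR figures in this abstract can be related through the standard conversion for a full-scale sinusoidal input. The abstract does not state which definition the authors used, so the relation below is an assumption; under it, the 42 dB SNDR floor corresponds to roughly 6.7 effective bits, and the reported 7.1-bit ENOB corresponds to an SNDR of roughly 44.5 dB.

```latex
\mathrm{ENOB} = \frac{\mathrm{SNDR}_{\mathrm{dB}} - 1.76}{6.02},
\qquad
\frac{42 - 1.76}{6.02} \approx 6.7\ \text{bits},
\qquad
6.02 \times 7.1 + 1.76 \approx 44.5\ \text{dB}
```

The two reported numbers are therefore mutually consistent if the 7.1-bit ENOB is the low-frequency value and 42 dB is the worst case across both Nyquist zones.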
8.
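Data transmission and interconnection is one of the key technologies in designing real-time imaging processing systems for synthetic aperture radar (SAR). This paper classifies current data transmission and interconnection technologies into four categories (network-based, bus-based, crossbar-switch-based, and dedicated), analyzes their performance, and discusses future directions for interconnection technology. On this basis, and in light of the requirements of SAR signal processing, it proposes a general-purpose serial packet-switched data transmission technique based on a data frame structure, designs its physical and link layers, and implements and tests it on a field-programmable gate array (FPGA). Data transmission performance metrics are analyzed for different SAR system topologies; the results show that the technique supports high-speed data transmission and inter-module interconnection in SAR systems.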
9.
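In the course of designing a VLIW-based application-specific processor for block ciphers, instruction set architecture modeling techniques for VLIW processors were studied. An instruction-accurate instruction set simulator was designed; with an added module that tracks pipeline hazards and stalls, it provides cycle-accurate program execution statistics and pipeline stall statistics. Combining the instruction set simulator with an assembler and a debugger, an environment that assists program optimization for VLIW processors was built. The simulator and debugger are used to evaluate a program's instruction-level parallelism and resource usage, helping developers optimize VLIW programs and thereby exploiting the processor's instruction-level parallelism through hardware/software co-development.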
10.
Using a simple example, we demonstrate how to design and analyze asynchronous systems from labeled Petri net specifications, later refining, transforming, and translating them for implementation.
12.
Graphics Processing Units (GPUs), originally developed for computer games, now provide computational power for scientific applications. In this paper, we develop a general-purpose Lattice Boltzmann code that runs entirely on a single GPU. The results show that: (1) single-precision floating-point arithmetic is sufficient for LBM computation in comparison to double precision; (2) the implementation of LBM on GPUs allows us to achieve up to about one billion lattice updates per second using single-precision floating point; (3) GPUs provide an inexpensive alternative to large clusters for fluid dynamics prediction.
13.
In this article, the implementation of a microcomputer-based control system using a low-cost 16-bit single-board microcomputer combined with a general-purpose data acquisition board is described. This combination is intended to form a suitable basis for a control system for the type of complex sensor systems currently associated with modern industrial robots. The technique of interfacing a 16-bit microcomputer to a peripheral environment that has essentially an 8-bit architecture is presented. The result is a high-performance, low-cost analog data processing and control system. A novel timing device with both software- and hardware-configurable features has been developed so that an accurate interval timer for use with sampled-data systems is also available. To demonstrate the potential use of the system, an illustrative example of real-time adaptive process control is described.
14.
A general-purpose pattern-matching macro processor is described. Macro patterns can be defined using regular expressions. Macro calls are handled by balanced pattern matching at the token level, allowing options, alternatives, and repetition. Thus, text in a language with a nested structure can be dealt with. In a macro body, Algol-style macro-time operations are allowed, which makes macros easier to write and read. Our macro processor can also be used as a tool for language conversion, since it incorporates a feature to declare language-dependent constructs such as comments, string notations, and parenthesis pairs. Although our macro processor is not biased towards any particular language, it has successfully converted an Algol 68-style text into a Fortran text. Problems of language conversion using macros are briefly discussed based upon the experience obtained through this macro processor.
15.
The SAE 81C99 processor offers 4 different operation modes, 8 programmable fuzzy algorithms, and up to 256 inputs, 64 outputs, and 16,384 rules. The 1.0-μm CMOS chip serves as a stand-alone device or as an on-chip module for 8- or 16-bit microcontrollers. At a 20-MHz crystal frequency and a maximum inference speed of 10 million rules/s, it supports very complex systems and millisecond (and faster) processes such as automotive electronics and pattern recognition.
16.
A class of critical computer requirements for real-time scan T.V. computer graphics is examined in relation to commercially available CPU architectures. Because general-purpose processors are found to be unsuitable, a new processor is proposed that is designed around the concept of 'instruction set partitioning.' In this design, special hardware-implemented algorithms may be included in the machine instruction set, and these processors are allowed to operate asynchronously from each other. The design is projected to generate a complete new frame of a color T.V. picture every 0.1-0.8 s, depending on image complexity. Due to its inherent generality, the CPU may be similarly expanded to encompass a wide variety of other specialized or real-time tasks with minimal additional hardware. The 32-bit parallel processor has a design cycle time of 100 ns and is in the price class of a minicomputer.
17.
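This paper describes Nios II, the latest embedded soft-core processor from Altera, and the development process for its application systems, and presents an application example based on the Nios II embedded processor drawn from the design of a vehicle travel data recorder. The IP reuse methodology in modern digital system design is also introduced in detail.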
18.
In this article, the basic principles of designing a floating-point processor built from the bit-slice Am2903, which can be integrated into a 16-bit central processor as a subsystem, are discussed. Floating-point processor microprogramming and the microflows of an instruction are also discussed.
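19.
"In-system reprogrammability" refers to the ability to reprogram a programmable logic device after it has been soldered onto a printed circuit board. This paper describes how to reprogram programmable logic devices that have in-system reprogrammability through an embedded processor using the Jam language, covering software and hardware considerations as well as memory usage. The method solves the problem of being unable, because of practical constraints, to reprogram programmable logic devices through a download cable during product prototyping and manufacturing.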
20.
An analog neural network breadboard consisting of 256 neurons and 2048 programmable synaptic weights of 5 bits each is constructed and tested. The heart of the processor is an array of custom programmable synapse (resistor) chips on a reconfigurable neuron board. The analog bandwidth of the system is 90 kHz. The breadboard is used to demonstrate the application of neural network learning to the problem of real-time adaptive mirror control. The processor controls 21 actuators of an adaptive mirror with a step-response settling time of 5 ms. The demonstration verified that it is possible to modify the control law of the high-speed analog loop using neural network training without stopping the control loop.