共查询到15条相似文献,搜索用时 703 毫秒
1.
介绍了一种适用于Viterbi解码器的异步ACS(加法器比较器选择器)的设计.它采用异步握手信号取代了同步电路中的整体时钟.给出了一种异步实现结构的异步加法单元、异步比较单元和异步选择单元电路.采用全定制设计方法设计了一个异步4 -bit ACS,并通过0 .6μm CMOS工艺进行投片验证.经过测试,芯片在工作电压5V,工作频率20MHz时的功耗为75. 5mW.由于采用异步控制,芯片在“睡眠”状态待机时不消耗动态功耗.芯片的平均响应时间为19 .18ns,仅为最差响应时间23 .37ns的82%.通过与相同工艺下的同步4 -bit ACS在功耗和性能方面仿真结果的比较,可见异步ACS较同步ACS具有优势. 相似文献
2.
异步集成电路标准单元的设计与实现 总被引:1,自引:1,他引:0
设计异步集成电路时,常用的异步标准单元的分类、电路设计方法和电路结构.详细介绍了C单元和异步数据通路的设计与实现,提出了一种异步实现结构的异步加法单元、异步比较单元和异步选择单元电路.利用设计的异步标准单元构成了一个适用于Viterbi解码器的异步ACS(加法器一比较器一选择器),并通过0.6μmCMOS工艺进行投片验证.当芯片工作电压为5V,工作频率为20MHz时的功耗为75.5mW.芯片的平均响应时问为19.18DS,仅为最差响应时间23.37ns的82%.从而验证了异步标准单元的正确性和异步电路在性能方面较同步电路存在的优势. 相似文献
3.
介绍一种新型异步 ACS(加法器 -比较器 -选择器 )的设计。一种异步实现结构的异步比较器 ,并通过异步加法单元、比较单元和选择单元的异步互连 ,构成了异步 ACS。在异步 ACS的性能分析时采用了一种基于多延迟模型的新方法 ,建立了异步加法器和比较器的多延迟模型 ,通过逻辑仿真 ,得到异步 ACS的平均响应时间为 3 .66ns,最长响应时间为 8.1 ns。由此可见 ,异步 ACS在性能方面较同步 ACS存在优势。 相似文献
4.
本文设计了异步LDPC解码器运算通路,利用异步电路减少信号到达时间不一致引起的毛刺和时钟引起的功耗.利用输入数据的统计特性设计了运算通路中的主要运算单元,减少了冗余运算.本文还实现了同步运算通路和基于门控时钟的运算通路作为比较.三种设计采用相近的架构,在0.18μm CMOS工艺下实现相同的功能.仿真结果表明,提出的异步设计功耗最小,相比于同步设计和基于门控时钟设计,分别节省了42.0%和32.6%的功耗.虽然性能稍逊于同步设计,但优于门控时钟设计.其中,同步设计的延时是1.09ns,基于门控时钟的设计延时是1.61ns,而异步设计则是1.20ns. 相似文献
5.
本文利用输入数据的统计特性,设计了两种低功耗异步比较器——异步行波比较器和提前终止异步比较器.异步行波比较器从第一个不相等的数位开始停止运算,但要把结果传到最低位,消耗部分功耗.提前终止异步比较器通过修改真值表,基于新的比较单元电路和终止判断电路,在第一个不相等的数位停止运算并立即输出比较结果,节省不必要的功耗.新设计的异步比较器和用于对比的同步比较器(BCL比较器和门控时钟比较器)均用SMIC0.18μm工艺实现.仿真结果表明,提前终止异步比较器功耗最低,与同步BCL比较器和门控时钟比较器相比,在随机数据和来自LDPC解码器的数据下,分别节省了87.1%、84.5%和37.5%、28.6%的功耗. 相似文献
6.
7.
8.
9.
10.
11.
Myeong‐Hoon Oh Young Woo Kim Sanghoon Kwak Chi‐Hoon Shin Sung‐Nam Kim 《ETRI Journal》2013,35(3):480-490
As technology evolves into the deep submicron level, synchronous circuit designs based on a single global clock have incurred problems in such areas as timing closure and power consumption. An asynchronous circuit design methodology is one of the strong candidates to solve such problems. To verify the feasibility and efficiency of a large‐scale asynchronous circuit, we design a fully clockless 32‐bit processor. We model the processor using an asynchronous HDL and synthesize it using a tool specialized for asynchronous circuits with a top‐down design approach. In this paper, two microarchitectures, basic and enhanced, are explored. The results from a pre‐layout simulation utilizing 0.13‐μm CMOS technology show that the performance and power consumption of the enhanced microarchitecture are respectively improved by 109% and 30% with respect to the basic architecture. Furthermore, the measured power efficiency is about 238 μW/MHz and is comparable to that of a synchronous counterpart. 相似文献
12.
AMULET2e: an asynchronous embedded controller 总被引:5,自引:0,他引:5
Furber S.B. Garside J.D. Riocreux P. Temple S. Day P. Jianwei Liu Paver N.C. 《Proceedings of the IEEE. Institute of Electrical and Electronics Engineers》1999,87(2):243-256
AMULET2e is an embedded system chip incorporating a 32-bit ARM-compatible asynchronous processor core, a 4-Kb pipelined cache, a flexible memory interface with dynamic bus sizing, and assorted programmable control functions. Many on-chip performance-enhancing and power-saving features are switchable, enabling detailed experimental analysis of their effectiveness. AMULET2e silicon demonstrates competitive performance and power efficiency, ease of system design, and it includes innovative features that exploit its asynchronous operation to advantage in applications that require low standby power and/or freedom from the electromagnetic interference generated by system clocks 相似文献
13.
This paper presents a novel variable-latency multiplier architecture, suitable for implementation as a self-timed multiplier core or as a fully synchronous multicycle multiplier core. The architecture combines a second-order Booth algorithm with a split carry save array pipelined organization, incorporating multiple row skipping and completion-predicting carry-select dual adder. The paper reports the architecture and logic design, CMOS circuit design and performance evaluation. In 0.35 μm CMOS, the expected sustainable cycle time for a 32-bit synchronous implementation is 2.25 ns. Instruction level simulations estimate 54% single-cycle and 46% two-cycle operations in SPEC95 execution. Using the same CMOS process, the 32-bit asynchronous implementation is expected to reach an average 1.76 ns throughput and 3.48 ns latency in SPEC95 execution 相似文献
14.
Benschneider B.J. Bowhill W.J. Copper E.M. Gavrielov M.N. Gronowski P.E. Maheshwari V.K. Peng V. Pickholtz J.D. Samudrala S. 《Solid-State Circuits, IEEE Journal of》1989,24(5):1317-1323
A 135K transistor, uniformly pipelined 50-MHz CMOS 64-bit floating-point arithmetic processor chip is described. The execution unit is capable of sustaining pipelined performance of one 32-bit or 64-bit result every 20 ns for all operations except double-precision multiply (40 ns) and divide. The chip employs an exponent difference prediction scheme and a unified leading-one and sticky-bit computation logic for the addition and subtraction operations. A hardware multiplier using a radix-8 modified Booth algorithm and a divider using a radix-2 SRT algorithm are employed.<> 相似文献
15.
Hwa-Joon Oh Mueller S.M. Jacobi C. Tran K.D. Cottier S.R. Michael B.W. Nishikawa H. Totsuka Y. Namatame T. Yano N. Machida T. Dhong S.H. 《Solid-State Circuits, IEEE Journal of》2006,41(4):759-771
The floating-point unit (FPU) in the synergistic processor element (SPE) of a CELL processor is a fully pipelined 4-way single-instruction multiple-data (SIMD) unit designed to accelerate media and data streaming with 128-bit operands. It supports 32-bit single-precision floating-point and 16-bit integer operands with two different latencies, six-cycle and seven-cycle, with 11 FO4 delay per stage. The FPU optimizes the performance of critical single-precision multiply-add operations. Since exact rounding, exceptions, and de-norm number handling are not important to multimedia applications, IEEE correctness on the single-precision floating-point numbers is sacrificed for performance and simple design. It employs fine-grained clock gating for power saving. The design has 768K transistors in 1.3 mm/sup 2/, fabricated SOI in 90-nm technology. Correct operations have been observed up to 5.6 GHz with 1.4 V and 56/spl deg/C, delivering 44.8 GFlops. Architecture, logic, circuits, and integration are codesigned to meet the performance, power, and area goals. 相似文献