1.

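胡正伟 仲顺安 《计算机工程》2007,33(22):23-25

To let multiplications in different number formats share hardware resources, this paper proposes a design method for a multi-function array multiplier that supports IEEE 754 64-bit double-precision and 32-bit single-precision floating point, 32-bit integers, and 16-bit fixed point. Carry-lookahead addition and pipelining are used to improve multiplier performance. A multiplication unit compatible with the TMS320C6701 multiply instructions was designed, and simulation results verify the correctness of the design.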

2.

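Accuracy Optimization of Memristive Neural Networks Addressing Voltage Drop (EI / 北大核心 / CSCD)

王超 查晓婧 夏银水 《计算机辅助设计与图形学学报》2023,(4):633-639

Because the analog nature of memristor crossbar arrays lets them perform multiply-accumulate operations efficiently, they are widely used to build hardware accelerators for neuromorphic computing systems. However, nanowire resistance causes a voltage drop across the resistive network formed by the memristors and nanowires, which attenuates the array's output signals and degrades neural network accuracy. This paper analyzes how the voltage drop relates to memristor state and position, output current, and output position, mitigates the voltage drop through sparse mapping, and applies output compensation to further improve output accuracy. Simulation results show that the method effectively addresses the problems caused by voltage drop: the memristive neural network reaches a recognition rate of 95.8% on the MNIST handwritten digit dataset, 33.5% higher than before optimization.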

3.

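Integral and Modular Two's-Complement Cellular Array Multipliers

Kai Hwang 林定基 《计算机研究与发展》1980,(4)

Based on the Baugh-Wooley algorithm, this paper proposes a new family of two kinds of LSI cascaded logic arrays for two's-complement multiplication. The integral approach is faster and attractive for large-scale integration, but its size is currently limited by single-chip and packaging technology. The modular approach is better suited to building arbitrarily large array multipliers, at a slight cost in speed. The proposed additive multiply modules can be programmed externally by hardwiring to multiply two binary numbers, which may be in two's-complement or unsigned format. Constructing the proposed modular multiplication networks requires no peripheral logic such as Wallace trees or complementers. Speed analysis, hardware complexity, packaging, and usage requirements of the array multipliers are also discussed.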
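
As a bit-level illustration of the Baugh-Wooley idea named in this abstract, the following Python sketch uses the widely taught formulation in which every partial product involving exactly one sign bit is complemented and two correction constants are added. It is an assumption that this matches the arrays in the paper; the function name `baugh_wooley_multiply` and the parameter `n` are illustrative only.

```python
def baugh_wooley_multiply(a: int, b: int, n: int) -> int:
    """Multiply two n-bit two's-complement integers (n >= 2).

    Sketch of the complemented-partial-product form of Baugh-Wooley:
    only non-negative partial products are summed, so a plain adder
    array suffices.
    """
    mask = (1 << n) - 1
    a_bits = [((a & mask) >> i) & 1 for i in range(n)]
    b_bits = [((b & mask) >> i) & 1 for i in range(n)]

    total = 0
    for i in range(n):
        for j in range(n):
            bit = a_bits[i] & b_bits[j]
            # Partial products that involve exactly one sign bit
            # carry negative weight, so they are complemented.
            if (i == n - 1) != (j == n - 1):
                bit ^= 1
            total += bit << (i + j)

    # Correction constants that compensate for the complemented terms.
    total += (1 << n) + (1 << (2 * n - 1))
    total &= (1 << (2 * n)) - 1

    # Reinterpret the 2n-bit result as a signed integer.
    if total >> (2 * n - 1):
        total -= 1 << (2 * n)
    return total
```

For example, `baugh_wooley_multiply(-3, 5, 4)` evaluates the 4-bit product with the same adder-only structure an array multiplier would use.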

4.

A Fast Carry Adder Based on Inter-Group Carry Prediction

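丁宜栋 刘昌明 方湘艳 《计算机工程》2011,37(23):288-290

To speed up large-number addition in cryptographic systems, a fast carry adder based on inter-group carry prediction is proposed and implemented. The large operands are divided into groups. Each group uses an improved carry-lookahead technique to reduce intra-group carry delay, while carry prediction between groups computes the sums for the different possible carry states, and the final result is selected according to the carry state produced by each group. Performance analysis shows that the adder performs 1,024-bit addition relatively quickly.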
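
The grouping scheme in this abstract resembles a carry-select adder. The Python sketch below models that behavior in software only: each group precomputes its sum for carry-in 0 and carry-in 1, and the actual carry selects one. The group size of 64 bits and the function name `grouped_carry_select_add` are assumptions for illustration; the paper's intra-group lookahead logic is not modeled.

```python
def grouped_carry_select_add(x: int, y: int, width: int = 1024, group: int = 64):
    """Add two width-bit unsigned integers group by group.

    Returns (sum modulo 2**width, carry_out). Assumes width is a
    multiple of group.
    """
    mask = (1 << group) - 1
    result, carry = 0, 0
    for shift in range(0, width, group):
        xa = (x >> shift) & mask
        ya = (y >> shift) & mask
        # In hardware both candidates would be computed in parallel.
        sum_if_carry0 = xa + ya
        sum_if_carry1 = xa + ya + 1
        # The incoming carry only selects a precomputed candidate.
        chosen = sum_if_carry1 if carry else sum_if_carry0
        result |= (chosen & mask) << shift
        carry = chosen >> group
    return result, carry
```

The point of the structure is that the critical path between groups is a selection rather than a full addition.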

5.

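Stateful Logic Computing for Three-Dimensional Memristive Arrays

胡钇宏 马德胜 许诺 王文清 黄成龙 方粮 《计算机工程与科学》2023,(3):381-389

Stateful logic circuits based on memristive memory arrays are an effective way to break the "von Neumann bottleneck" and realize in-memory computing. However, current research on in-memory stateful logic circuits mostly uses two-dimensional memristive memory arrays as the base platform, and there is little discussion of implementing stateful logic in the more complex three-dimensional arrays. Compared with planar 2D arrays, 3D memristive memory arrays offer higher storage density and richer device connectivity, allowing more flexible configurations for building stateful logic gates. A dedicated discussion of the configuration and cascading of stateful logic gates in 3D memory arrays is therefore needed. Focusing on planar-stacked 3D memristive memory arrays, this paper studies the implementation of complex stateful logic computation from two aspects: the realization of basic stateful logic gates, and a synthesis and mapping method that supports cascading. First, the device connectivity in planar-stacked 3D memristive memory arrays is analyzed and summarized, and from it the configuration requirements for stateful logic gates implementing two-input Boolean logic are derived. Second, a composite stateful logic gate is proposed in which the logic input and logic output share the same memristor, so that a complex logic function (for example, one defined as ONOR) is realized in a single step, saving steps and devices in complex stateful logic computation. Finally, an automated synthesis and mapping method for complex stateful logic computation in 3D memristive memory arrays is given. Tests on the LGsynth91 benchmarks show that, compared with the best current mapping results for 2D arrays, the proposed 3D-array-based synthesis and mapping method realizes inter-layer logic ...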

6.

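Design of an Image Classification Circuit Based on Memristor Edge Computing

罗佳 冉欢欢 何凯霖 丁晓峰 《控制与决策》2022,37(9):2353-2359

To meet the low-power, light-compute requirements of edge intelligent devices, a low-power image recognition circuit is designed using the memristor, a new compute-in-memory device, as the basic circuit element. The circuit connects multiple memristive convolutional layers in series with a memristive fully connected network to obtain high recognition accuracy. To reduce the imbalance between the row and column dimensions of the memristor crossbar array needed for memristive convolution, and to lower the power of the input-voltage inverting circuitry, the input voltage inverters are placed after the memristor crossbar array. The designed circuit reduces the number of crossbar rows required for the memristive convolution from 2M+1 to M+1 and reduces the number of inverters needed per convolution kernel to 1, greatly reducing the size and power consumption of the memristive convolutional network. Using a mathematical approximation, the BN and dropout layer computations are merged into the CNN layer, reducing the number of network layers and the circuit's power consumption. Experiments on the CIFAR-10 dataset show that the circuit classifies images effectively, with fast inference (136 ns) and low power (less than 3.5 μW per neuron).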

7.

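Design of a Quadruple-Precision Floating-Point Fused Multiply-Add Unit Based on SIMD Units

何军 黄永勤 朱英 《计算机科学》2013,40(12):15-18,51

Reducing the hardware cost and latency of quadruple-precision floating-point operations is an important problem. To reduce the hardware cost of a quadruple-precision multiply-add unit, a new quadruple-precision floating-point fused multiply-add unit (QPFMA) is designed and implemented on top of a double-precision floating-point SIMD FMA unit supporting 64-bit × 4 operation. It supports four kinds of floating-point multiply-add operations as well as multiplication, addition/subtraction, and comparison, with a latency of 7 cycles. By decomposing the quadruple-precision 113-bit × 113-bit mantissa multiplier into four 57-bit × 57-bit multipliers that share the 53-bit × 53-bit multipliers of the double-precision SIMD FMA unit, the hardware cost of the QPFMA is significantly reduced. Logic synthesis in a 65 nm process shows that the QPFMA reaches 1.1 GHz, with an area that is 42.71% of a conventional QPFMA design and comparable to just one double-precision floating-point multiply-add unit. Compared with existing QPFMA designs at comparable process and frequency, its latency is 3 cycles lower and its gate count is 65.96% lower.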
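
The decomposition of a 113-bit × 113-bit product into four products of at most 57 × 57 bits can be checked at the arithmetic level. The Python sketch below is not the paper's datapath: it assumes each operand is split into a 57-bit low half and a 56-bit high half, and the names `mul57` and `mul113_via_four_57` are hypothetical.

```python
def mul57(x: int, y: int) -> int:
    """Stand-in for one 57-bit x 57-bit hardware multiplier."""
    assert 0 <= x < (1 << 57) and 0 <= y < (1 << 57)
    return x * y


def mul113_via_four_57(a: int, b: int) -> int:
    """Multiply two 113-bit unsigned mantissas using four mul57 calls."""
    assert 0 <= a < (1 << 113) and 0 <= b < (1 << 113)
    low_mask = (1 << 57) - 1
    a_hi, a_lo = a >> 57, a & low_mask  # 56-bit high part, 57-bit low part
    b_hi, b_lo = b >> 57, b & low_mask

    # a*b = a_hi*b_hi*2^114 + (a_hi*b_lo + a_lo*b_hi)*2^57 + a_lo*b_lo
    return (
        (mul57(a_hi, b_hi) << 114)
        + ((mul57(a_hi, b_lo) + mul57(a_lo, b_hi)) << 57)
        + mul57(a_lo, b_lo)
    )
```

Each of the four sub-products fits a 57-bit multiplier, which is the property that lets narrower multiplier hardware be reused.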

8.

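Memristor Crossbar Array and Its Application in Image Processing (total citations: 2; self-citations: 0; citations by others: 2)

胡小方 段书凯 王丽丹 廖晓峰 《中国科学:信息科学》2011,(4):500-512

A memristor is a nonlinear resistor with memory, whose resistance changes with the amount of charge or magnetic flux that has passed through it. As the fourth fundamental circuit element, the memristor has great application potential in many fields and is expected to drive a transformation of circuit theory. Using numerical simulation and circuit modeling, this paper analyzes the theoretical basis and characteristics of memristors and proposes a memristor crossbar array for image storage that can store and output black-and-white, grayscale, and color images. A series of ...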

9.

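Research on a High-Speed Multiplier Algorithm Based on Quadtrees

刘磊 严晓浪 孟建熠 葛海通 《计算机应用研究》2010,27(10):3727-3730

An automatic synthesis and optimization algorithm for high-speed multipliers based on a quadtree structure is proposed to increase multiplier speed. First, the high-order product bits, which have larger delay, are built directly by quadtree recursion, replacing the conventional partial-product carry chain and shortening the critical path delay. Branches are then folded and merged so that adjacent multiplication results share parts of the quadtree, lowering hardware cost. The algorithm also supports automatic synthesis under different area constraints. Multipliers generated with this algorithm are 10% faster than multipliers based on the Booth algorithm and Wallace trees.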

10.

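Implementation of Serial Binary Carry Addition Based on DNA Automata

李汪根 丁永生 《计算机科学》2006,33(7):167-170

A method for implementing serial binary carry addition based on DNA automata is proposed. One-bit binary carry addition is carried out automatically in a single test tube by a predesigned DNA automaton model. For n-bit binary carry addition, n similar test tubes are arranged into a serial network from the least significant to the most significant bit. The carry produced by the lower-bit addition is transferred to the higher-bit test tube, where it forms the input symbol string of the higher-bit automaton, which then completes the higher-bit addition. This mode of operation resembles the addition system in electronic computers and provides a novel way for DNA computers to perform arithmetic.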
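
The serial network described here has the same dataflow as an electronic ripple-carry adder. The Python sketch below models only that dataflow, with one `full_adder` call standing in for one test tube. It says nothing about the DNA automaton encoding itself, and both function names are illustrative.

```python
def full_adder(a: int, b: int, carry_in: int):
    """One-bit addition stage: returns (sum_bit, carry_out)."""
    sum_bit = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return sum_bit, carry_out


def ripple_add(x_bits, y_bits):
    """Add two equal-length bit lists given least significant bit first.

    The carry from each stage is passed to the next higher stage,
    mirroring the low-to-high chain of test tubes.
    """
    carry = 0
    out = []
    for a, b in zip(x_bits, y_bits):
        s, carry = full_adder(a, b, carry)
        out.append(s)
    return out, carry
```

With 4-bit inputs, `ripple_add([1, 0, 1, 0], [1, 1, 0, 0])` adds 5 and 3 stage by stage.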

11.

Design and implementation of low power and high speed multiplier using quaternary carry look-ahead adder

《Microprocessors and Microsystems》2020

The need for embedded, portable Digital Signal Processing (DSP) systems has been increasing as a result of the rapid growth of semiconductor technology. The multiplier is a crucial part of almost every DSP application, so low-power, high-speed multipliers are needed for high-speed DSP. The array multiplier is one of the fast multipliers because it has a regular structure and can be designed very easily. It multiplies unsigned numbers using full adders and half adders, but it depends on previously computed partial sums to produce the final output, which increases the delay. In previous work, Complementary Metal Oxide Semiconductor (CMOS) Carry Look-ahead Adders (CLA) and CMOS power-gating-based CLAs were used to maximize multiplier speed and to improve power dissipation with minimum delay. CMOS logic is based on the radix-2 (binary) number system, in which the carry is a major issue in arithmetic operations. A higher-radix number system such as Quaternary Signed Digit (QSD) can be used to perform arithmetic operations without carry. The proposed system is an array multiplier with a QSD-based Carry Look-Ahead Adder (CLA) to improve performance. Quaternary devices generally require simpler circuits than binary logic devices to process the same amount of data, so quaternary logic is applied in the CLA to improve adder speed and throughput. In the array multiplier architecture, QSD-based carry look-ahead adders are used instead of full adders, which enables low power consumption and fast multiplication. The Tanner EDA tool is used to simulate the proposed multiplier circuit in 180 nm technology. The proposed QSD CLA multiplier is compared with the power-gating CLA and CLA multipliers with respect to area, Power Delay Product (PDP), and average power.

12.

Optimization of speeded-up robust feature algorithm for hardware implementation

ShanShan Cai LeiBo Liu ShouYi Yin RenYan Zhou WeiLong Zhang ShaoJun Wei 《中国科学:信息科学(英文版)》2014,57(4):1-15

13.

Analog memristive memory with applications in audio signal processing

DUAN ShuKai HU XiaoFang WANG LiDan LI ChuanDong 《中国科学:信息科学(英文版)》2014,(4):239-253

Since the development of the HP memristor, much attention has been paid to studies of memristive devices and applications, particularly memristor-based nonvolatile semiconductor memory. Owing to its unique properties, theoretically, one could restart a memristor-based computer immediately without the need for reloading the data. Further, current memories are mainly binary and can store only ones and zeros, whereas memristors have multilevel states, which means a single memristor unit can replace many binary transistors and realize higher-density memory. It is believed that memristors can also implement analog storage besides binary and multilevel information memory. In this paper, an implementation scheme for analog memristive memory is considered. A charge-controlled memristor model is derived and the corresponding SPICE model is constructed. Special write and read operations are demonstrated through numerical analysis and circuit simulations. In addition, an audio analog record/play system using a memristor crossbar array is designed. This system can provide great storage capacity (long recording time) and high audio quality with a simple small circuit structure. A series of computer simulations and analyses verify the effectiveness of the proposed scheme.

14.

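Efficient Design and Implementation of the AND Operation in Memristor Stateful Logic

张娜 吴俊杰 黄达 刘福东 周海芳 《计算机研究与发展》2012,(Z1):73-78

As the fourth fundamental circuit element after the resistor, capacitor, and inductor, the memristor has attracted wide attention from academia and industry since its discovery in 2008. Its resistance-memory effect and nanoscale fabrication have led it to be regarded as a candidate for building future memories of greater capacity and density, gradually replacing existing storage devices such as flash. Beyond storage, an article published by HP in Nature in 2010 showed that memristors can also perform arbitrary logic operations through stateful logic based on material implication. This paper studies another stateful logic operation, the AND operation, and proposes a more efficient implementation that requires no additional memristors, reduces the complexity of the driving voltages, reduces error, and makes the operation simpler and more efficient. The proposed method is verified by SPICE simulation.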
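
For context on implication-based stateful logic, the Python sketch below shows the standard Boolean construction of AND from material implication (IMP) and a FALSE (clear) operation. This is the baseline that needs extra working memristors, not the more efficient method this paper proposes. The variables `s` and `t` stand for hypothetical working memristors, and the model is purely logical with no device physics.

```python
def imp(p: int, q: int) -> int:
    """Material implication on bits: p IMP q = (NOT p) OR q."""
    return (1 - p) | q


def and_via_imp(p: int, q: int) -> int:
    """AND built from IMP and FALSE, using two working cells s and t."""
    s = 0           # FALSE: clear working cell s
    s = imp(p, s)   # s = NOT p
    s = imp(q, s)   # s = (NOT q) OR (NOT p) = NAND(p, q)
    t = 0           # FALSE: clear working cell t
    t = imp(s, t)   # t = NOT s = AND(p, q)
    return t
```

The construction takes three implication steps and two extra cells, which is the kind of overhead a dedicated AND operation aims to avoid.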

15.

Design of low power multi-ternary digit multiplier in CNTFET technology

《Microprocessors and Microsystems》2020

This work introduces a method for implementing energy-efficient arithmetic units, such as a ternary full adder, ripple carry adder, single-trit multiplier, and multi-trit multiplier, using carbon nanotube field effect transistors (CNTFETs). A unique feature of the CNTFET, namely that its threshold voltage can be varied by changing the CNT diameter, makes it a suitable alternative for ternary logic designs. In the proposed circuits, decoder functionality is realized by threshold detector circuits, each tuned to a specific logical threshold voltage. The multiplier circuit is designed by combining capacitive logic and the minority function. To test the practicability of the proposed circuits when cascaded, multi-digit adder and multiplier circuits are constructed. The proposed multi-digit multiplier structure is based on the classical Wallace multiplier and includes various optimized versions of the adder and multiplier circuits. Extensive simulation has been done to examine the proposed designs under different test conditions. The 3-trit multiplier formed by combining the proposed adder and multiplier circuits shows a 16-fold reduction in power consumption and in energy consumption compared with the previous multiplier design.

16.

Algorithm and Implementation of Parallel Multiplication in a Mixed Number System

Luo Yinfang 《计算机科学技术学报》1988,3(3):203-213

This paper presents a high-speed multiplication algorithm for the mixed number system of the ordinary binary number and the symmetric redundant binary number. It is implemented with the multivalued logic theory, and 3-valued and 2-valued circuits are used. The 3-valued circuit proposed in this paper is an emitter-coupled logic circuit with high speed, simplicity and powerful functions. A 3-valued ECL threshold gate can simultaneously produce six types of one-variable operations. The array multiplier, designed with the algorithm and the circuits, is fast and simple, and is suitable for building LSI. It can be used in a high-speed computer just as an ordinary binary multiplier.

17.

Memristive crossbar array with applications in image processing

HU XiaoFang DUAN ShuKai WANG LiDan & LIAO XiaoFeng 《中国科学:信息科学(英文版)》2012,(2):461-472

A memristor is a kind of nonlinear resistor with memory capacity. Its resistance changes with the amount of charge or flux passing through it. As the fourth fundamental circuit element, it has huge potential applications in many fields, and has been expected to drive a revolution in circuit theory. Through numerical simulations and circuitry modeling, the basic theory and properties of memristors are analyzed, and a memristor-based crossbar array is then proposed. The array can realize storage and output for binary, grayscale and color images. A series of computer simulations demonstrates the effectiveness of the proposed scheme. Owing to the advantage of the memristive crossbar array in parallel information processing, the proposed method is expected to be used in high-speed image processing.

18.

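Design and Implementation of a General-Purpose Configurable Multiplier over GF(2^m)

卫学陶 戴紫彬 陈韬 《计算机工程与应用》2007,43(12):91-93

A finite-field multiplier architecture for elliptic curve cryptosystems is proposed. Building on existing digit-serial multipliers, it uses a locally parallel bit-parallel structure that effectively eliminates the modular reduction circuit, so the multiplier works with any irreducible polynomial. By using a data interface to control the format of the input data and embedding a large multiplier, the structure of the finite-field multiplier can be configured to perform polynomial-basis finite-field multiplication. The architecture can effectively meet the different security requirements of elliptic curve cryptosystems.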
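
To make polynomial-basis multiplication over GF(2^m) with a caller-supplied irreducible polynomial concrete, here is a plain bit-serial, most-significant-bit-first Python sketch. It is a software reference model only, not the digit-serial/bit-parallel architecture of the paper. The function name `gf2m_multiply` and the convention that `poly` includes the x^m term are assumptions.

```python
def gf2m_multiply(a: int, b: int, poly: int, m: int) -> int:
    """Multiply a and b in GF(2^m), polynomial basis.

    a, b: field elements as integers below 2**m (bit i = coefficient of x^i).
    poly: irreducible polynomial including its x^m term, e.g. 0x11B for
          x^8 + x^4 + x^3 + x + 1.
    """
    result = 0
    for i in range(m - 1, -1, -1):   # scan b from its highest bit down
        result <<= 1                 # multiply the accumulator by x
        if (result >> m) & 1:
            result ^= poly           # reduce modulo the field polynomial
        if (b >> i) & 1:
            result ^= a              # add (XOR) a when this bit of b is set
    return result
```

Because `poly` is a parameter, the same routine serves any field size and any irreducible polynomial. For instance, `gf2m_multiply(0x57, 0x83, 0x11B, 8)` works in the GF(2^8) field used by AES.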

19.

VLSI design of APT-VDF using novel variable block sized ternary adder and multiplier

《Microprocessors and Microsystems》2020

Nowadays, variable digital filters (VDFs) play an essential role in communication and signal processing. The desired frequency response of any prototype filter can be obtained by developing an All Pass Transformation (APT) based variable digital filter (APT-VDF) that maintains exhaustive control over the cut-off frequency. The performance of the APT-VDF is limited by its speed and area utilization. Because the fundamental arithmetic operations in the APT-VDF are addition and multiplication, this paper modifies the pipelined APT-VDF by developing a new Variable Block Sized Ternary Adder (VBS-TA) and a modified ternary multiplier for fast realization of the filter structure. Ternary logic transmits more data per interconnection wire, and hence ternary-logic-based arithmetic requires fewer components and interconnections. The proposed VBS-TA speeds up addition by skipping carry propagation with the help of ternary compound gates, and it can also be used to speed up the multiplier circuit in the APT-VDF. Furthermore, the ternary multiplier is modified by introducing a divide-and-conquer approach in the partial product generation stage. Simulation results show that the proposed APT-VDF outperforms existing VDFs in terms of delay, power, and area utilization. It consumes only 0.289 W of power with a latency of 9.24 ns, and it achieves an operating frequency of 210.87 MHz, much better than the existing VDFs.