首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到18条相似文献,搜索用时 93 毫秒
1.
浮点乘法器中IEEE舍入的实现   总被引:1,自引:0,他引:1  
描述了浮点乘法器中舍入的基本方法,介绍了一种实现舍入的系统的设计方法和硬件模型,并对它进行了分析,在这种系统设计方法的基础上,提出了一种直接预测和选择的舍入方案。  相似文献   

2.
为了实现不同数制的乘法共享硬件资源,提出了一种可以实现基于IEEE754标准的64位双精度浮点与32位单精度浮点、32位整数和16位定点的多功能阵列乘法器的设计方法。采用超前进位加法和流水线技术实现乘法器性能的提高。设计了与TMS320C6701乘法指令兼容的乘法单元,仿真结果验证了设计方案的正确性。  相似文献   

3.
彭元喜  杨洪杰  谢刚 《计算机应用》2010,30(11):3121-3125
为了满足高性能X-DSP浮点乘法器的性能、功耗、面积要求,研究分析了X型DSP总体结构和浮点乘法器指令特点,采用Booth 2编码算法和4∶2压缩树形结构,使用4级流水线结构设计实现了一款高性能低功耗浮点乘法器。使用逻辑综合工具Design Compiler,采用第三方公司0.13μm CMOS工艺库,对所设计的乘法器进行了综合,其结果为工作频率500MHz,面积67529.36μm2,功耗22.3424mW。  相似文献   

4.
浮点数求和与点积计算在科学计算,信号处理,图像处理等领域中广泛应用.对浮点和与点积计算的硬件结构进行了研究.在只有一次舍入误差的前提下,提出一种通用的浮点数求和算法和结构,利用重对阶方法,解决了多个粘贴位和尾数过抵消所产生的精度损失问题.然后将这种算法移植到浮点点积计算中.为了增加结构的通用性,将提出的结构和常用的SIMD计算单元进行结合.根据提出的算法,设计实现了FADD4和FDP4的硬件结构,和使用离散的加法器和乘法器来实现求和与点积的方法相比,计算速度分别提高了20.4%和42.1%.  相似文献   

5.
快速浮点加法器设计研究   总被引:4,自引:2,他引:2  
浮点加法器处于浮点处理器的关键路径,为提高浮点加法器的速度,对浮点加法器的关键部分进行了研究:采用了预测执行,并行运算技术。引用混合加法器,前导“1”检测采用快速的LOPV电路实现,混合加法器由输出选择电路对“ lulp”操作进行合并,提高了运算速度,这些技术在双精度FPU和24位浮点DSP中应用得到了理想的效果。  相似文献   

6.
一种快速的浮点乘法器结构   总被引:2,自引:0,他引:2  
一种支持IEEE754浮点标准的全流水结构的浮点乘法器被提出.在该浮点乘法器中,提出一种新型的双路浮点乘法结构.这种结构相比于全规模乘法器,在不增加面积的前提下,缩短乘法树关键路径延迟13.6%,提高了乘法器的执行频率.这种乘法器有3个周期的延迟,每个周期能接收一条单精度或双精度浮点乘法指令.使用FPGA进行验证,并使用标准单元实现.采用0.18μm的静态CMOS工艺,执行频率为384MHz,面积为732902.25μm^2.在相同工艺条件下,将这种结构与其他乘法器结构进行比较,结果表明这种结构是有效的.  相似文献   

7.
基于FPGA的高速流水线浮点乘法器设计   总被引:1,自引:0,他引:1  
设计了一种支持IEEE754浮点标准的32位高速流水线结构浮点乘法器.该乘法器采用新型的基4布思算法,改进的4:2压缩结构和部分积求和电路,完成Carry Save形式的部分积压缩,再由Catry Look-ahead加法器求得乘积.时序仿真结果表明该乘法器可稳定运行在80M的频率上,并已成功运用在浮点FFT处理器中.  相似文献   

8.
高效结构的多输入浮点乘法器在FPGA上的实现   总被引:1,自引:0,他引:1  
传统的多输入浮点乘法运算是通过级联二输入浮点乘法器来实现的,这种结构不可避免地使运算时延和所需逻辑资源成倍增加,从而难以满足高速数字信号处理的需求。本文提出了一种适合于在FPGA上实现的浮点数据格式和可以在三级流水线内完成的一种高效的多输入浮点乘法器结构,并给出了在Xilinx公司Virtex系列芯片上的测试数据。  相似文献   

9.
在数字通信、图像处理等应用领域中需要用到大量的矩阵乘法运算,并且它的计算性能是影响系统性能的关键因素.设计了一个全流水结构的并行双精度浮点矩阵乘法器以提高计算性能,并在Xilinx Virtex-5 LX155现场可编程门阵列(FPGA)上完成了方案的实现.乘法器中处理单元(PE)按阵列形式排列,在一个FPGA芯片上可集成10个PE单元实现并行计算.为了提高工作频率,PE单元采用流水线结构,并运用C-slow时序重排技术解决了环路流水线上"数据相关冲突"的问题.仿真结果表明,该乘法器的峰值计算性能可达到5 000 MFLOPS.此外,对不同维数的矩阵乘法进行了实验,其结果也证实了该设计达到了较高的计算性能.  相似文献   

10.
本文介绍一种用于高性能DSP的32位浮点乘法器设计,通过采用改进Booth编码的树状4-2压缩器结构,提高了速度,降低了功耗,该乘法器结构规则且适合于VLSI实现,单个周期内完成一次24位整数乘或者32位浮点乘。整个设计采用Verilog HDL语言结构级描述,用0.25um单元库进行逻辑综合.完成一次乘法运算时间为24.30ns.  相似文献   

11.
In the conventional floating point multipliers, the rounding stage is usually constructed by using a high speed adder for the increment operation, increasing the overall execution time and occupying a large amount of chip area. Furthermore, it may accompany additional execution time and hardware components for renormalization which may occur by an overflow from the rounding operation. A floating-point multiplier performing addition and IEEE rounding in parallel is designed by optimizing the operational flow based on the characteristics of floating point multiplication operation. A hardware model for the floating point multiplier is proposed and its operational model is algebraically analyzed in this research. The floating point multiplier proposed does not require any additional execution time nor any high speed adder for rounding operation. In addition, the renormalization step is not required because the rounding step is performed prior to the normalization operation. Thus, performance improvement and cost-effective design can be achieved by this approach.  相似文献   

12.
Single-precision floatingpoint computations may yield an arbitrary false result due to cancellation and rounding errors. This is true even for very simple, structured arithmetic expressions such as Horner's scheme for polynomial evaluation. A simple procedure will be presented for fast calculation of the value of an arithmetic expression to least significant bit accuracy in single precision computation. For this purpose in addition to the floating-point arithmetic only a precise scalar product (cf. [2]) is required. If the initial floatingpoint approximation is not too bad, the computing time of the new algorithm is approximately the same as for usual floating-point computation. If not, the essential progress of the presented algorithm is that the inaccurate approximation is recognized and corrected. The algorithm achieves high accuracy, i.e. between the left and the right bound of the result there is at most one more floating-point number. A rigorous estimation of all rounding errors introduced by floating-point arithmetic is given for general triangular linear systems. The theorem is applied to the evaluation of arithmetic expressions.  相似文献   

13.
Hardware Support for Interval Arithmetic   总被引:1,自引:0,他引:1  
A hardware unit for interval arithmetic (including division by an interval that contains zero) is described in this paper. After a brief introduction an instruction set for interval arithmetic is defined which is attractive from the mathematical point of view. These instructions consist of the basic arithmetic operations and comparisons for intervals including the relevant lattice operations. To enable high speed, the case selections for interval multiplication (9 cases) and interval division (14 cases) are done in hardware. The lower bound of the result is computed with rounding downwards and the upper bound with rounding upwards by parallel units simultaneously. The rounding mode must be an integral part of the arithmetic operation. Also the basic comparisons for intervals together with the corresponding lattice operations and the result selection in more complicated cases of multiplication and division are done in hardware. There they are executed by parallel units simultaneously. The circuits described in this paper show that with modest additional hardware costs interval arithmetic can be made almost as fast as simple floating-point arithmetic.  相似文献   

14.
本文介绍一种用于高性能DSP的32位浮点乘法器设计,通过采用改进Booth编码的树状4-2压缩器结构,提高了速度,降低了功耗,该乘法器结构规则且适合于VLSI实现,单个周期内完成一次24位整数乘或者32位浮点乘。整个设计采用Verilog HDL语言结构级描述,用0.25um单元库进行逻辑综合.完成一次乘法运算时间为24.30ns.  相似文献   

15.
In this note we study scaling rules and roundoff noise variances in a fixed-point implementation of the Kalman predictor for an ARMA time series observed noise free. The Kalman predictor is realized in a fast form that uses the so-called fast Kalman gain algorithm. The algorithm for the gain is fixed point. Scaling rules and expressions for rounding error variances are derived. The numerical results show that the fixed-point realization performs very close to the floating point realization for relatively low-order ARMA time series that are not too narrow band. The predictor has been implemented in 16-bit fixed-point arithmetic on an INTEL 8086 microprocessor, and in 16-bit floating-point arithmetic on an INTEL 8080. Fixed-point code was written in Assembly language and floating-point code was written in Fortran. Experimental results were obtained by running the fixed- and floating-point filters on identical data sets. All experiments were carried out on an INTEL MIDS 230 development system.  相似文献   

16.
Choosing an internal floating-point representation for a binary computer with given word-length is influenced by two factors: the size of the range of admissible numbers and the precision of the respective floating-point arithmetic. In this paper “precision” is defined by a statistical model of rounding errors. According to this definition base 4 floating-point arithmetic on an average produces smaller rounding errors than all other floating-point arithmetics with a base 2k, provided that the ranges of numbers have equal size.  相似文献   

17.
A method for accurately determining whether two given line segments intersect is presented. This method uses the standard floating-point arithmetic that conforms to IEEE 754 standard. If three or four ending points of the two given line segments are on a same vertical or horizontal line, the intersection testing result is obtained directly. Otherwise, the ending points and their connections are mapped onto a 3×3 grid, and the intersection testing falls into one of the five testing classes. The intersection testing method is based on our method for floating-point dot product summation, whose error bound is 1ulp. Our method does not have the limitation in the method of Gavrilova and Rokne (2000) that the product of two floating-point numbers is calculated by a twice higher precision floating-point arithmetic than that of the multipliers. Furthermore, this method requires less than one-fifth of the running time used by the method of Gavrilova and Rokne (2000), and our new method for calculating the sign of a sum of n floating-point numbers requires less than one-fifteenth of the running time used by ESSA.  相似文献   

18.
科学计算程序语言的浮点数机制研究   总被引:1,自引:0,他引:1  
王力 《计算机科学》2008,35(4):285-287
浮点数运算存在精度方面、比较方面以及舍入误差等方面的问题,而这些问题直接影响到科学计算的准确性、可靠性和安全性等等.目前有关浮点数的中文资料很少,很多教科书上在谈到浮点数时都是浅尝辄止,本文以C语言浮点数机制为研究基础,对浮点数的格式、精度与应用等方面问题进行了实证研究,获得了一些有用的结果.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号