Similar literature
1.
This paper considers some aspects of the implementation of interval arithmetic built on IEEE floating-point systems. Interval operations and functions on arguments involving the special elements supported by the IEEE floating-point formats, Not-a-Numbers (NaNs) and signed zero, are discussed. A simple model of interval exceptions and their handling in IEEE non-trapping mode is proposed, and interval operations on arguments involving NaNs are defined. Because it is based on the floating-point exceptions and their handling, the proposed model provides consistency between interval and IEEE arithmetic.
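
The NaN-aware interval operations mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the paper's model: the `NAI` ("not an interval") marker, the function names, and the rule that any NaN endpoint yields `NAI` are assumptions made for illustration. Outward rounding is approximated by widening each bound by one ulp with `math.nextafter` (Python 3.9 or later).

```python
import math

# Hypothetical "not an interval" marker; the paper's own convention may differ.
NAI = (math.nan, math.nan)

def is_nai(x):
    """True if either endpoint of the interval is a NaN."""
    return math.isnan(x[0]) or math.isnan(x[1])

def interval_add(a, b):
    """Add two intervals (lo, hi), widening the result outward by one ulp."""
    if is_nai(a) or is_nai(b):
        return NAI                      # NaN inputs propagate, no exception raised
    lo = a[0] + b[0]
    hi = a[1] + b[1]
    if math.isnan(lo) or math.isnan(hi):
        return NAI                      # e.g. -inf + inf is an invalid operation
    # Step each bound to the next float outward so the exact sum stays enclosed.
    return (math.nextafter(lo, -math.inf), math.nextafter(hi, math.inf))
```

Widening by a full ulp is coarser than true directed rounding, but it keeps the sketch portable; an implementation with access to the hardware rounding modes would round the lower bound down and the upper bound up instead.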

2.
The IEEE 754 and 854 standards regulate the behaviour of real floating-point arithmetic, as implemented in most current hardware and software systems. Although a myriad of libraries for complex floating-point arithmetic is available and in use, there is no general consensus on their implementation. The International C Standard describes in its Annex G guidelines for the implementation of complex arithmetic, in order to achieve similar behaviour of complex floating-point arithmetic across C-language compliant implementations. In Section 2 we summarize its recommendations and outline the problems inherent to this approach. In Section 3 we describe how the lack of reliability, when computing certain complex-valued expressions, can be overcome. Throughout the discussion the rounding mode is assumed to be round-to-nearest, as in Annex G.

3.
A method for accurately determining whether two given line segments intersect is presented. This method uses standard floating-point arithmetic that conforms to the IEEE 754 standard. If three or four endpoints of the two given line segments lie on the same vertical or horizontal line, the intersection testing result is obtained directly. Otherwise, the endpoints and their connections are mapped onto a 3×3 grid, and the intersection test falls into one of five testing classes. The intersection testing method is based on our method for floating-point dot product summation, whose error bound is 1 ulp. Our method does not have the limitation of the method of Gavrilova and Rokne (2000), in which the product of two floating-point numbers must be calculated in floating-point arithmetic with twice the precision of the multipliers. Furthermore, this method requires less than one-fifth of the running time used by the method of Gavrilova and Rokne (2000), and our new method for calculating the sign of a sum of n floating-point numbers requires less than one-fifteenth of the running time used by ESSA.

4.
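Huang Zhaowei (黄兆伟), Wang Lianming (王连明) 《计算机应用研究》 (Application Research of Computers) 2020,37(9):2762-2765,2771
To address the low throughput and low resource utilization of current FPGA floating-point arithmetic units designed to the IEEE 754 floating-point standard, a parallel floating-point vector multiplication unit with configurable precision and a configurable number of arithmetic units is proposed. Making the exponent and mantissa widths of the floating-point units configurable improves system resource utilization, and combining pipelining with a parallel structure improves data throughput. With an EP4CE115 FPGA as the test platform and 10 FP14 arithmetic units configured, the system occupies about 4.2% of the logic resources and reaches a peak throughput of 4.5 GFLOPS. The results show that the proposed floating-point vector multiplication unit effectively improves FPGA resource utilization and arithmetic throughput, is highly portable and general, and is suitable for accelerating vector multiplication on FPGAs.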

5.
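Tong Jing (童静), Wu Ke (吴柯), Wang Huaixing (王怀兴) 《微机发展》 (Microcomputer Development) 2005,15(2):18-20,24
Neuron C is a programming language designed specifically for Neuron chips. It extends ANSI C and is a powerful tool for developing LonWorks applications. Neuron C does not directly support the floating-point arithmetic and comparison operations of ANSI C, but it provides a floating-point function library that allows the use of floating-point numbers conforming to the IEEE 754 standard. The paper describes in detail the definition of floating-point data types in Neuron C, the methods for generating floating-point constants, and the use of the floating-point function library. An example LonWorks network demonstrates the use of floating-point data.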

6.
《Computer》1980,13(1):68-79
This guide to an IEEE draft standard provides practical algorithms for floating-point arithmetic operations and suggests the hardware/software mix for handling exceptions.

7.
Kahan  W. Zuras  D. 《Computer》2005,38(5):91-94
IEEE 754, a standard for binary floating-point arithmetic, has revolutionized the portability and reliability of programs that use binary floating-point arithmetic. Floating point is almost universally implemented with special-purpose hardware that tucks into a small corner of the CPU chip and runs in the hundreds of Mflops to Gflops range. Single-stepping through today's floating-point software to debug it often turns out to be futile. The concept of a NaN, standing for "not a number", evolved from an "indefinite" in Seymour Cray's CDC 6600. IEEE 754, by default, requires an untrapped "invalid operation" to signal itself by raising a flag and to deliver a NaN just when any other result, be it finite or infinite, would cause worse confusion. The NaN lets a program retain control unless the program or programmer directs its cancellation upon an invalid operation. Thus, a program conducting a search can return to the realm being searched after an accidental foray beyond a boundary whose existence and location were previously unknown. A signaling NaN (sNaN) differs from the quiet NaNs by trapping any attempt to perform arithmetic upon it; a trap handler must then interpret this sNaN.
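
The search scenario in this abstract can be sketched in Python, whose `float` is an IEEE 754 binary64 value on common platforms. The function `f` and the helper name are hypothetical, chosen only so that one candidate strays into an invalid operation. Python exposes neither the IEEE status flags nor signaling NaNs, so the sketch shows quiet NaN propagation only.

```python
import math

def f(x):
    # For huge |x|, x * x overflows to inf, and inf / inf is an invalid
    # operation that quietly delivers a NaN instead of stopping the program.
    sq = x * x
    return sq / (sq + 1.0)

def argmax_skipping_nan(candidates):
    """Return the candidate with the largest f(x), ignoring NaN results."""
    best_x, best_y = None, -math.inf
    for x in candidates:
        y = f(x)
        if math.isnan(y):       # strayed beyond the valid domain; keep searching
            continue
        if y > best_y:
            best_x, best_y = x, y
    return best_x

# A NaN compares unequal to everything, including itself.
nan_is_unordered = math.nan != math.nan
```

Calling `argmax_skipping_nan([1.0, 3.0, 1e200, 2.0])` evaluates all four candidates; the value at `1e200` is a NaN and is skipped, so the search continues with the remaining points rather than aborting.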

8.
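Xi Weidi (席伟俤), Li Weigang (李伟刚) 《测控技术》 (Measurement & Control Technology) 2017,36(11):115-118
Computational accuracy and run-time efficiency are both indispensable properties of aero-engine FADEC system control software. To improve the accuracy and efficiency of floating-point computation in this software, the paper analyzes the IEEE 754 floating-point format, the representation of floating-point numbers, and the precision of the four basic floating-point arithmetic operations. Data from a practical FADEC control software project verify the correctness of the precision analysis. On this basis, design guidelines are proposed for floating-point algorithm design in FADEC system control software; they help improve the reliability and safety of control software and can be extended to control applications in other industries.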
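
The format analysis summarized above can be made concrete with a short Python sketch that splits an IEEE 754 binary32 value into its fields. The helper names are hypothetical, and the abstract does not say which format the FADEC software uses; binary32 is assumed here purely for illustration.

```python
import struct

def decode_binary32(x):
    """Round x to IEEE 754 binary32 and split it into its three fields."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF      # stored with a bias of 127
    fraction = bits & 0x7FFFFF          # 23 explicitly stored significand bits
    return sign, exponent, fraction

def to_binary32(x):
    """Return x rounded to binary32, widened back to a Python float."""
    return struct.unpack(">f", struct.pack(">f", x))[0]
```

For example, `decode_binary32(1.0)` gives sign 0, biased exponent 127, and fraction 0, while `to_binary32(0.1)` differs from the binary64 value `0.1` because the two formats round the decimal constant to different nearby binary fractions.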

9.
Single-precision floating-point computations may yield an arbitrarily wrong result due to cancellation and rounding errors. This is true even for very simple, structured arithmetic expressions such as Horner's scheme for polynomial evaluation. A simple procedure is presented for fast calculation of the value of an arithmetic expression to least-significant-bit accuracy in single-precision computation. For this purpose, only a precise scalar product (cf. [2]) is required in addition to the floating-point arithmetic. If the initial floating-point approximation is not too bad, the computing time of the new algorithm is approximately the same as for usual floating-point computation. If it is bad, the essential progress of the presented algorithm is that the inaccurate approximation is recognized and corrected. The algorithm achieves high accuracy, i.e. between the left and the right bound of the result there is at most one more floating-point number. A rigorous estimation of all rounding errors introduced by floating-point arithmetic is given for general triangular linear systems. The theorem is applied to the evaluation of arithmetic expressions.
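
The cancellation problem this abstract starts from can be reproduced with a small Python sketch that runs Horner's scheme twice, once in floating point and once in exact rational arithmetic. This illustrates the problem only, not the paper's correction algorithm; the test polynomial is an assumption, and Python floats are binary64 rather than single precision.

```python
from fractions import Fraction

def horner(coeffs, x):
    """Evaluate a polynomial whose coefficients are given in descending order."""
    acc = 0
    for c in coeffs:
        acc = acc * x + c
    return acc

COEFFS = [1, -7, 21, -35, 35, -21, 7, -1]   # (x - 1)**7 in expanded form
x = 1.0 + 2.0 ** -20                        # exactly representable in binary64

approx = horner(COEFFS, x)                  # every step is rounded
exact = horner(COEFFS, Fraction(x))         # no rounding at all
```

Here `exact` is precisely 2^-140, far below the rounding error committed at each Horner step, so `approx` need not share even a single correct digit with it. Comparing the two values shows why a plain floating-point evaluation cannot be trusted near a multiple root.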

10.
Some processors designed for consumer applications, such as graphics processing units (GPUs) and the CELL processor, promise outstanding floating-point performance for scientific applications at commodity prices. However, IEEE single precision is the most precise floating-point data type these processors directly support in hardware. Pairs of native floating-point numbers can be used to represent a base result and a residual term to increase accuracy, but the resulting order-of-magnitude slowdown dramatically reduces the price/performance advantage of these systems. By adding a few simple microarchitectural features, acceptable accuracy can be obtained with relatively little performance penalty. To reduce the cost of native-pair arithmetic, a residual register is used to hold information that would normally have been discarded after each floating-point computation. The residual register dramatically simplifies the code, providing both lower latency and better instruction-level parallelism.
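
The base-plus-residual idea in this abstract can be sketched with Knuth's TwoSum, a standard error-free transformation that recovers the rounding error of one addition in software. It is a swapped-in textbook technique, not the proposed residual register, and it runs here on Python's binary64 floats rather than on single-precision hardware.

```python
from fractions import Fraction

def two_sum(a, b):
    """Return (s, err) with s = fl(a + b) and s + err equal to a + b exactly."""
    s = a + b
    b_virtual = s - a                   # the part of b that made it into s
    err = (a - (s - b_virtual)) + (b - b_virtual)
    return s, err

def pair_is_exact(a, b):
    """Check with exact rationals that the pair loses no information."""
    s, err = two_sum(a, b)
    return Fraction(s) + Fraction(err) == Fraction(a) + Fraction(b)
```

The six floating-point operations in `two_sum` hint at the software overhead the abstract describes; a register that captured the discarded residual directly would make the extra operations unnecessary.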

11.
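To address the slow speed of software floating-point implementations, which cannot meet the real-time requirements of embedded processors, and their limited range of operations, a floating-point processor based on the RISC-V instruction set is proposed. It performs addition, subtraction, multiplication, division, square root, multiply-accumulate, and comparison operations, and fully conforms to the IEEE 754-2008 standard. The floating-point processor was functionally verified in the VCS simulation environment, and every module met the correctness requirements. The floating-point processor was integrated with the open-source Hummingbird E203 (蜂鸟E203) processor core, logic synthesis was completed with the SMIC 0.18 process library, and the design was tested on an FPGA. The results show that the floating-point processor has a gate count of only 24 200 and a throughput of 150 MFLOPS; compared with designs in the published literature, the hardware area is reduced by 7% and 1.5%, respectively. The synthesized operating frequency reaches 100 MHz.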

12.
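Word-level model checking based on decision diagrams can fully verify arithmetic circuits, but it takes a long time to find counterexamples to the system specification in a defective design. SAT-based bounded model checking finds counterexamples faster, but it does not support system specifications that contain arithmetic formulas and is therefore hard to apply to arithmetic circuits. A SAT-based word-level model checking method is proposed. It extends CNF to E-CNF, which can mix Boolean and arithmetic formulas, to represent the design and the system specification, and it extends the bounded model checking tool and the SAT solver to the word level so that they can generate and process E-CNF, respectively. Verification of the floating-point division unit of the Godson-2 (龙芯2号) microprocessor used both *PHDD-based and SAT-based word-level model checking. The data show that SAT-based word-level model checking quickly finds design defects in arithmetic circuits. The two methods complement each other, fully verifying the design while significantly shortening the design cycle.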

13.
The WEDSP32C high-performance, programmable digital signal processor supports 32-bit floating-point arithmetic and is upwardly compatible with its predecessor, the WEDSP32. Because it is implemented in 0.75-μm (effective channel length) CMOS technology, the second-generation device achieves high functional density with low power consumption. The DSP32C offers the following features: 25-Mflop operation; 16-Mb/s serial-input and serial-output ports; a 160-bit, parallel I/O port for control and data transfer; interrupt facilities; single-instruction μ-law and A-law data conversions; single-instruction conversions between integers and floating-point data; a byte-addressable, on-chip memory that is extendable off chip; direct memory access to and from internal and external memory via parallel and serial I/O ports; 16 Mbytes of address space; and IEEE Std. 754 floating-point format conversion. The authors describe the DSP32C's instruction set, architecture, and application development tools. The latter include an assembler, a simulator, an optimizing C compiler, and special-purpose hardware.

14.
Dr. G. Bohlender 《Computing》1980,24(2-3):149-160
In numerical computations, mainly real and complex numbers, intervals, and matrices and vectors with such components occur. It is well known that the arithmetic operations on real numbers, complex numbers, etc. can be carried over to real floating-point numbers, complex floating-point numbers, etc. using roundings. This procedure results in agreeable arithmetic, order, and compatibility properties for an abundance of numerical data types and the accompanying arithmetic operations. Most programming languages, however, provide only real floating-point numbers; all the other data types and operations have to be simulated, e.g. in the form of arrays and procedure calls, which often causes loss of accuracy and of arithmetic properties. Furthermore, the complicated notation makes programs difficult to read. Therefore, this article presents an extension of PASCAL that serves as an example of how these numerical data types can be embedded into the syntax of a programming language.

15.
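To meet the demand for processing large volumes of data in modern digital signal processing, a 64-bit coprocessor dedicated to floating-point complex vector operations is designed using an ARM946 and the logic resources and IP library of a Xilinx field-programmable gate array chip. The relevant floating-point operations are optimized, and the design is tested on a hardware simulation platform. The results show that the coprocessor substantially improves the performance of floating-point complex vector operations.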

16.
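The paper describes the full-custom design of a pipelined adder that conforms to the IEEE 754 single-precision floating-point standard. The design is based on the SMIC 1.8 V 0.18 μm 1P6M CMOS process and will be used in the floating-point unit of a high-performance 32-bit CPU. Building on a study of fast algorithm structures, the design uses full-custom circuit and layout methods to increase the speed of the floating-point adder, reduce chip power consumption, and, by reducing chip area, lower the cost of volume production.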

17.
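Research on the architecture and core algorithms of a parallel floating-point adder   Total citations: 1, self-citations: 0, citations by others: 1
Given the important role of floating-point operations in graphics processing, and following the principles of speed and area optimization, the article studies double-precision floating-point addition, the most complex operation in the FAU structure, from two angles. First, structurally, three mutually parallel main paths are used to design a three-stage floating-point pipeline that processes in parallel as far as possible, which greatly increases speed and saves chip resources. Second, the key operations that limit the speed of floating-point addition, mantissa addition and shifting, are given a novel design and implementation, and the design is compared with traditional designs through parameter comparison and a comprehensive analysis of its advancement and speed.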

18.
Recursive procedures used for sequential calculations of polynomial basis coefficients in discrete orthogonal moments produce unreliable results for high moment orders as a result of error accumulation. This paper demonstrates accurate reconstruction of arbitrary-size images using full-order (orders as large as the image size) Tchebichef and Krawtchouk moments by calculating polynomial coefficients directly from their definition formulas in hypergeometric functions and by creating lookup tables of these coefficients off-line. An arbitrary precision calculator is used to achieve greater numerical range and precision than is possible with software using standard 64-bit IEEE floating-point arithmetic. This reconstruction scheme is content and noise independent.

19.
The high integration density of current nanometer technologies allows the implementation of complex floating-point applications in a single FPGA. In this work the intrinsic complexity of floating-point operators is addressed, targeting configurable devices and making the design decisions that provide the most suitable trade-offs between performance and standard compliance. A set of floating-point libraries composed of adder/subtracter, multiplier, divider, square root, exponential, logarithm, and power function is presented. Each library has been designed taking into account the special characteristics of current FPGAs, and for this purpose we have adapted the (software-oriented) IEEE floating-point standard to a custom FPGA-oriented format. Extended experimental results validate the design decisions made and prove the usefulness of reducing the format complexity.

20.
Choosing an internal floating-point representation for a binary computer with a given word length is influenced by two factors: the size of the range of admissible numbers and the precision of the respective floating-point arithmetic. In this paper "precision" is defined by a statistical model of rounding errors. According to this definition, base 4 floating-point arithmetic on average produces smaller rounding errors than all other floating-point arithmetics with a base 2^k, provided that the ranges of numbers have equal size.
