Similar Documents
20 similar documents found (search time: 390 ms)
1.
In this paper, we present a new algorithm for computing the exact solutions of linear systems with integer coefficients using numerical methods. It terminates with the correct answer in well-conditioned cases and quickly aborts in ill-conditioned cases. Success on a given system requires that it be sufficiently well conditioned for the underlying numerical linear algebra method to compute a solution of sufficient accuracy. Our method finds an initial approximate solution with a numerical method, amplifies the approximation by a scalar, and adjusts the amplified solution and the corresponding residual to integers so that they can be computed and stored exactly without large-integer arithmetic. These steps are repeated to refine the solution until sufficient accuracy is achieved, and the rational solution is then reconstructed. This approximate-amplify-adjust scheme lets us compute the solution without high-precision software floating-point operations anywhere in the procedure, and without large-integer arithmetic except at the final rational reconstruction step. We analyse the theoretical cost and report experimental results.
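As a rough illustration of the amplify-and-adjust refinement described above, the following sketch (function names and the amplification factor are my own, and the final rational reconstruction step is omitted; the exact dyadic approximation is returned instead) solves a small integer system with a numeric solver and an exact integer residual update:

```python
import numpy as np
from fractions import Fraction

def refine_dyadic_solution(A_int, b_int, steps=20, alpha=2**20):
    """Amplify-round-refine loop: after k steps the exact dyadic vector
    sum(yhat_i / alpha**(i+1)) satisfies A x = b up to ||residual|| / alpha**k.
    A_int and b_int hold Python integers, so the residual update is exact."""
    A_f = np.array(A_int, dtype=float)
    n = len(b_int)
    x = [Fraction(0)] * n
    r = [int(v) for v in b_int]                    # exact integer residual
    denom = 1
    for _ in range(steps):
        y = np.linalg.solve(A_f, np.array(r, dtype=float))   # numeric solve
        yhat = [int(round(alpha * v)) for v in y]             # amplify + adjust
        denom *= alpha
        x = [xi + Fraction(yi, denom) for xi, yi in zip(x, yhat)]
        # exact integer update: r <- alpha*r - A*yhat (no floating point)
        r = [alpha * ri - sum(a * yj for a, yj in zip(row, yhat))
             for ri, row in zip(r, A_int)]
    return x          # exact dyadic approximation; rational reconstruction omitted

A = [[3, 1], [1, 2]]
b = [5, 5]
print([float(v) for v in refine_dyadic_solution(A, b, steps=4)])   # [1.0, 2.0]
```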

2.
L. Kobbelt 《Computing》1994,52(4):355-369
We present a new algorithm which computes dot products of arbitrary length with minimal rounding error, independent of the number of addends. The algorithm has O(n) time and O(1) memory complexity and does not need extensions of the arithmetic kernel, i.e., it uses only the usual floating-point operations. A slight modification yields an algorithm which computes the dot product to machine precision. Owing to its simplicity, the algorithm can easily be implemented in hardware.
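The abstract does not reproduce Kobbelt's algorithm itself; as a stand-in, here is a standard compensated dot product built from error-free transformations (TwoSum and Dekker's TwoProduct), which also runs in O(n) time with O(1) extra memory using only ordinary floating-point operations:

```python
def two_sum(a, b):
    """Error-free transformation: a + b = s + e exactly (Knuth)."""
    s = a + b
    bb = s - a
    e = (a - (s - bb)) + (b - bb)
    return s, e

def two_prod(a, b):
    """Error-free transformation: a * b = p + e exactly (Dekker split)."""
    p = a * b
    split = 134217729.0                      # 2**27 + 1 for IEEE binary64
    ca = split * a; ah = ca - (ca - a); al = a - ah
    cb = split * b; bh = cb - (cb - b); bl = b - bh
    e = ((ah * bh - p) + ah * bl + al * bh) + al * bl
    return p, e

def dot2(x, y):
    """Compensated dot product: as accurate as if accumulated in twice the
    working precision, with one scalar of extra state."""
    s, c = 0.0, 0.0
    for a, b in zip(x, y):
        p, ep = two_prod(a, b)
        s, es = two_sum(s, p)
        c += es + ep
    return s + c

print(dot2([1e16, 1.0, -1e16], [1.0, 1.0, 1.0]))   # 1.0; a naive loop returns 0.0
```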

3.
We present a system, ESOLID, that performs exact boundary evaluation of low-degree curved solids in reasonable amounts of time. ESOLID performs accurate Boolean operations using exact representations and exact computations throughout. The demands of exact computation require a different set of algorithms and efficiency improvements than those found in a traditional inexact floating-point-based modeler. We describe the system architecture, the representations, and issues in implementing the algorithms. We also describe a number of techniques that increase the efficiency of the system, based on lazy evaluation, the use of floating-point filters, arbitrary-precision floating-point arithmetic with error bounds, and lower-dimensional formulation of subproblems. ESOLID has been used for boundary evaluation of many complex solids, including both synthetic datasets and parts of a Bradley Fighting Vehicle designed with the BRL-CAD solid modeling system. It is shown that ESOLID can correctly evaluate the boundary of solids that are very hard to compute with a fixed-precision floating-point modeler. In terms of performance, it is about an order of magnitude slower than a floating-point boundary evaluation system in most cases.

4.
Modern graphics processing units (GPUs) provide impressive computing resources, which can be accessed conveniently through the CUDA programming interface. We describe how GPUs can be used to considerably speed up molecular dynamics (MD) simulations for system sizes of up to about 1 million particles. Particular emphasis is put on numerical long-time stability in terms of energy and momentum conservation, and caveats on the limited floating-point precision are issued. Strict energy conservation over 10^8 MD steps is obtained by double-single emulation of the floating-point arithmetic in the accuracy-critical parts of the algorithm. For the slow dynamics of a supercooled binary Lennard-Jones mixture, we demonstrate that the use of single floating-point precision may result in quantitatively and even physically wrong results. For simulations of a Lennard-Jones fluid, the described implementation shows speedup factors of up to 80 compared with a serial implementation for the CPU, and a single GPU was found to be comparable to a parallelised MD simulation using 64 distributed cores.
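A minimal, non-CUDA sketch of the double-single idea mentioned above, using numpy.float32 pairs to emulate roughly 48 bits of significand (the pair-addition formula is the standard software construction, not code from the paper):

```python
import numpy as np
f32 = np.float32

def ds_add(a_hi, a_lo, b_hi, b_lo):
    """Add two double-single numbers, each stored as a float32 pair hi+lo.
    Every intermediate operation is forced to float32."""
    t1 = f32(a_hi + b_hi)
    e  = f32(t1 - a_hi)
    t2 = f32(f32(f32(f32(b_hi - e) + f32(a_hi - f32(t1 - e))) + a_lo) + b_lo)
    hi = f32(t1 + t2)
    lo = f32(t2 - f32(hi - t1))
    return hi, lo

big = f32(16777216.0)                          # 2**24: the float32 integer limit
print(float(f32(big + f32(1.0))))              # 16777216.0 -- the 1.0 is lost in float32
hi, lo = ds_add(big, f32(0.0), f32(1.0), f32(0.0))
print(float(hi) + float(lo))                   # 16777217.0 -- recovered by the low word
```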

5.
Single-precision floating-point computations may yield an arbitrarily false result due to cancellation and rounding errors. This is true even for very simple, structured arithmetic expressions such as Horner's scheme for polynomial evaluation. A simple procedure is presented for fast calculation of the value of an arithmetic expression to least-significant-bit accuracy in single-precision computation. For this purpose, in addition to the floating-point arithmetic only a precise scalar product (cf. [2]) is required. If the initial floating-point approximation is not too bad, the computing time of the new algorithm is approximately the same as for the usual floating-point computation. If not, the essential advantage of the presented algorithm is that the inaccurate approximation is recognized and corrected. The algorithm achieves high accuracy, i.e. between the left and the right bound of the result there is at most one more floating-point number. A rigorous estimation of all rounding errors introduced by floating-point arithmetic is given for general triangular linear systems. The theorem is applied to the evaluation of arithmetic expressions.

6.
Consider the computation of deciding relative orientations of objects undergoing multiple translations and rotations. Such an orientation test involves the computation of expressions based on arithmetic operations, square roots and trigonometric functions. The computation of signs of such expressions using double precision floating-point arithmetic in modern computers may result in errors. In this article we demonstrate the existence of examples where double precision is not sufficient to compute the correct sign of an expression. We consider (i) simple expressions involving only the four basic arithmetic operations, (ii) expressions involving the square-root function and (iii) expressions representing orientation tests in two- and three-dimensions involving objects undergoing arbitrary rotations by angles given in radians, thereby requiring the computation of trigonometric functions. We develop a system that uses requisite high precision for computing the correct sign of such expressions. The system uses our floating-point filter called L-filter and the bigfloat extended precision package in LEDA (Library of Efficient Data Types and Algorithms).  相似文献   
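A small sketch of the filter-then-fallback pattern for a 2D orientation test; the error-bound constant is deliberately conservative and purely illustrative, and the exact stage uses Python rationals rather than LEDA bigfloats:

```python
from fractions import Fraction

def orient2d(ax, ay, bx, by, cx, cy):
    """Sign of the determinant deciding whether C lies left of, on, or right of
    the directed line AB. A cheap floating-point filter settles most cases;
    near-degenerate ones fall back to exact rational arithmetic."""
    det = (bx - ax) * (cy - ay) - (by - ay) * (cx - ax)
    mag = abs((bx - ax) * (cy - ay)) + abs((by - ay) * (cx - ax))
    errbound = 8.0 * 2.0**-53 * mag          # conservative bound, illustrative only
    if det > errbound:
        return 1
    if det < -errbound:
        return -1
    # filter inconclusive: redo the computation exactly with rationals
    exact = (Fraction(bx) - Fraction(ax)) * (Fraction(cy) - Fraction(ay)) \
          - (Fraction(by) - Fraction(ay)) * (Fraction(cx) - Fraction(ax))
    return (exact > 0) - (exact < 0)

print(orient2d(0.0, 0.0, 1.0, 0.0, 0.5, 1e-300))   # tiny but strictly positive area -> 1
```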

7.
Real-time animation of three-dimensional structures on refreshed-tube interactive graphics processors usually requires floating-point arithmetic hardware for matrix manipulation and object transformation by the resultant matrix. An algorithm is described which uses fixed-point hardware to effect the required transformations at speeds surpassing those possible with floating-point hardware, with no effective loss in accuracy. The algorithm involves ‘mixed-point arithmetic’: limited-precision floating-point multiplication, conversion to an integer-fraction doubleword, and fixed-point addition.
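A toy illustration of doing the per-vertex transformation work in fixed point after converting the rotation coefficients once (the 16.16 format is chosen arbitrarily; this is not the paper's exact mixed-point scheme):

```python
import math

FRAC_BITS = 16                        # 16.16 fixed point
ONE = 1 << FRAC_BITS

def to_fix(x):     return int(round(x * ONE))
def fix_mul(a, b): return (a * b) >> FRAC_BITS
def to_float(a):   return a / ONE

def rotate_fixed(px, py, angle_rad):
    """Rotate (px, py) using only integer arithmetic once the sine and cosine
    have been converted to fixed point."""
    c, s = to_fix(math.cos(angle_rad)), to_fix(math.sin(angle_rad))
    x, y = to_fix(px), to_fix(py)
    rx = fix_mul(x, c) - fix_mul(y, s)
    ry = fix_mul(x, s) + fix_mul(y, c)
    return to_float(rx), to_float(ry)

print(rotate_fixed(1.0, 0.0, math.pi / 2))   # approximately (0.0, 1.0)
```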

8.
Charles Farnum 《Software》1988,18(7):701-709
Predictability is a basic requirement for compilers of floating-point code: it must be possible to determine the exact floating-point operations that will be executed for a particular source-level construction. Experience shows that many compilers fail to provide predictability, either from an inadequate understanding of its importance or from an attempt to produce locally better code. Predictability can be attained through careful attention to code generation and a knowledge of the common pitfalls. Most language standards do not completely define the precision of floating-point operations, so a good compiler must also make a good choice in assigning the precisions of subexpression computations. Choosing the widest precision that will be used in the expression usually gives the best trade-off between efficiency and accuracy. Finally, certain optimizations are particularly useful for floating point and should be included in a compiler aimed at scientific computation. But predictability is more important than efficiency; obtaining incorrect answers fast helps no one.

9.
In many scientific simulation codes, the bulk of the floating-point arithmetic required is done by a small number of compact computational kernels. In this paper, we explore the potential use of configurable computers to instantiate the hardware required for such kernels and, thus, improve their performance. We present algorithms and analysis for two such kernels: fast, problem-specific multipliers and the efficient evaluation of Taylor series. A novel aspect of the algorithm for Taylor series evaluation is that it takes advantage of the variable precision arithmetic available to a configurable computer. Experimental results obtained on a Xilinx field-programmable gate array (FPGA) are presented for the proposed algorithms.
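For the Taylor-series kernel, the software reference computation is simply a truncated series evaluated with Horner's rule; a plain Python version (ignoring the FPGA and variable-precision aspects of the paper) might look like:

```python
import math

def taylor_exp(x, terms=16):
    """Truncated Taylor series of exp(x), sum_{k<terms} x**k / k!,
    evaluated with Horner's rule from the highest-order coefficient down."""
    coeffs = [1.0]
    for k in range(1, terms):
        coeffs.append(coeffs[-1] / k)        # 1/k! built incrementally
    acc = 0.0
    for c in reversed(coeffs):
        acc = acc * x + c
    return acc

print(taylor_exp(0.5), math.exp(0.5))        # agree to near machine precision for small |x|
```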

10.
To meet stringent requirements on speed, power consumption, and other constraints, digital signal processing algorithms are often implemented on FPGAs. Because of the nature of the hardware, floating-point operations are usually converted to fixed-point operations, but fixed-point conversion makes the design flow complex and lengthy and creates a tension between dynamic range and precision. Drawing on the advantages of floating-point numbers, this paper improves the basic fixed-point arithmetic rules for FPGA implementations and effectively resolves this tension. The methods and steps for implementing the basic operation modules (shift, addition/subtraction, multiplication, and division) are described in detail, and an FIR digital filter is used as a design example. Simulation results show that the improved fixed-point arithmetic has smaller error, higher precision, and a wider data range than ordinary fixed-point arithmetic, and effectively prevents overflow.
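As a baseline for the range/precision trade-off the paper addresses, a conventional fixed-point FIR filter (Q1.15 quantization, wide integer accumulator) can be sketched as follows; this is the ordinary scheme the paper improves on, not its proposed representation:

```python
FRAC_BITS = 15                        # Q1.15 coefficients and samples
SCALE = 1 << FRAC_BITS

def quantize(x):
    return int(round(x * SCALE))

def fir_fixed(samples, coeffs):
    """Direct-form FIR with Q1.15 inputs and coefficients; the accumulator is a
    plain Python int, standing in for the wide MAC register of an FPGA DSP
    slice, and the result is rescaled once per output sample."""
    qs = [quantize(v) for v in samples]
    qc = [quantize(c) for c in coeffs]
    out = []
    for n in range(len(qs)):
        acc = 0
        for k, c in enumerate(qc):
            if n - k >= 0:
                acc += c * qs[n - k]          # Q2.30 products, summed exactly
        out.append(acc / (SCALE * SCALE))     # back to a real-valued sample
    return out

print(fir_fixed([0.5, 0.25, -0.125, 0.0], [0.25, 0.5, 0.25]))
```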

11.
A parallel implementation of the Chebyshev method is presented for the B-spline surface interpolation problem. The algorithm finds the control points of a uniform bicubic B-spline surface that interpolates m × n data points on an m × n mesh-connected processor array in constant time; hence it is optimal. Owing to its numerical stability, the algorithm can successfully be used in finite-precision floating-point arithmetic.

12.
We present efficient representations and algorithms for exact boundary computation on low-degree sculptured CSG solids using exact arithmetic. Most previous work using exact arithmetic has been restricted to polyhedral models. In this paper, we generalize it to higher-order objects whose boundaries are composed of rational parametric surfaces. We present the underlying representations needed in our approach and provide an overview of the boundary computation algorithm, which is described in more detail in a follow-up paper.

13.
Double-precision floating-point parallel computation can no longer meet the accuracy requirements of high-performance computing, yet no high-performance solution for beyond-double-precision parallel computation is currently available. Based on the MPI parallel programming interface, this paper proposes an implementation of parallel computation with extended double-precision floating point, and verifies its correctness and performance with the precision-sensitive BBP algorithm for computing pi.
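The precision sensitivity of the BBP series is easy to see in a serial sketch (no MPI, and exact rationals standing in for the paper's extended-double format):

```python
from fractions import Fraction
from decimal import Decimal, getcontext

def bbp_term(k):
    """k-th term of the Bailey-Borwein-Plouffe series for pi, as an exact rational."""
    return (Fraction(4, 8*k + 1) - Fraction(2, 8*k + 4)
            - Fraction(1, 8*k + 5) - Fraction(1, 8*k + 6)) / Fraction(16)**k

def pi_double(terms):
    # every term rounded to binary64 before accumulation: accuracy stalls near 1e-16
    return sum(float(bbp_term(k)) for k in range(terms))

def pi_exact(terms, digits=40):
    # exact rational accumulation, a stand-in for extended-double arithmetic
    s = sum(bbp_term(k) for k in range(terms))
    getcontext().prec = digits
    return Decimal(s.numerator) / Decimal(s.denominator)

print(pi_double(40))       # 3.141592653589793..., limited by double precision
print(pi_exact(40))        # about 40 correct digits, limited only by terms and digits
```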

14.
Choosing an internal floating-point representation for a binary computer with a given word length is influenced by two factors: the size of the range of admissible numbers and the precision of the resulting floating-point arithmetic. In this paper “precision” is defined by a statistical model of rounding errors. According to this definition, base-4 floating-point arithmetic on average produces smaller rounding errors than all other floating-point arithmetics with base 2^k, provided that the number ranges are of equal size.

15.
On modern multi-core, many-core, and heterogeneous architectures, floating-point computations, especially reductions, may become non-deterministic and, therefore, non-reproducible mainly due to the non-associativity of floating-point operations. We introduce an approach to compute the correctly rounded sums of large floating-point vectors accurately and efficiently, achieving deterministic results by construction. Our multi-level algorithm consists of two main stages: first, a filtering stage that relies on fast vectorized floating-point expansion; second, an accumulation stage based on superaccumulators in a high-radix carry-save representation. We present implementations on recent Intel desktop and server processors, Intel Xeon Phi co-processors, and both AMD and NVIDIA GPUs. We show that numerical reproducibility and bit-perfect accuracy can be achieved at no additional cost for large sums that have dynamic ranges of up to 90 orders of magnitude by leveraging arithmetic units that are left underused by standard reduction algorithms.
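A toy, serial stand-in for the superaccumulator idea: every finite binary64 value is an integer multiple of 2^-1074, so an exact fixed-point accumulator makes the sum independent of the accumulation order (the paper's vectorized filtering stage and carry-save representation are omitted):

```python
import math
from fractions import Fraction

def exact_sum(values):
    """Accumulate every value exactly as a big integer counted in units of
    2**-1126, then round once at the end; reordering cannot change the result."""
    acc = 0
    for x in values:
        m, e = math.frexp(x)                    # x = m * 2**e with 0.5 <= |m| < 1
        acc += int(m * 2**53) * 2**(e + 1073)   # x in units of 2**-1126, exactly
    return float(Fraction(acc, 2**1126))        # single correctly-rounded conversion

data = [1e100, 1.0, -1e100, 1e-30]
print(exact_sum(data), exact_sum(reversed(data)))   # identical results: 1.0
print(sum(data), sum(reversed(data)))               # plain sums may disagree
```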

16.
Some processors designed for consumer applications, such as graphics processing units (GPUs) and the Cell processor, promise outstanding floating-point performance for scientific applications at commodity prices. However, IEEE single precision is the most precise floating-point data type these processors directly support in hardware. Pairs of native floating-point numbers can be used to represent a base result and a residual term to increase accuracy, but the resulting order-of-magnitude slowdown dramatically reduces the price/performance advantage of these systems. By adding a few simple microarchitectural features, acceptable accuracy can be obtained with relatively little performance penalty. To reduce the cost of native-pair arithmetic, a residual register is used to hold information that would normally be discarded after each floating-point computation. The residual register dramatically simplifies the code, providing both lower latency and better instruction-level parallelism.

17.
In this paper, we present a new color buffer compression algorithm for floating-point buffers. It can operate in either an approximate (lossy) mode or an exact (lossless) mode. The approximate mode is error-bounded, and the amount of accumulated error introduced is controlled via a few parameters. The core of the algorithm lies in an efficient representation and color space transform, followed by a hierarchical quadtree decomposition, and then hierarchical prediction and Golomb–Rice encoding. We believe this is the first lossy compression algorithm for floating-point buffers, and our results indicate significantly reduced color buffer bandwidth and negligible visible artifacts.
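Only the final entropy-coding stage is easy to sketch in isolation; a Golomb-Rice coder for zigzag-mapped prediction residuals, with an arbitrarily chosen parameter k, could look like:

```python
def zigzag(n):
    """Map signed residuals to non-negative codes: 0,-1,1,-2,2 -> 0,1,2,3,4."""
    return 2 * n if n >= 0 else -2 * n - 1

def rice_encode(n, k):
    """Golomb-Rice code with parameter k: unary quotient, then a k-bit remainder."""
    q, r = n >> k, n & ((1 << k) - 1)
    return "1" * q + "0" + format(r, "0{}b".format(k))

def encode_residuals(residuals, k=2):
    return "".join(rice_encode(zigzag(r), k) for r in residuals)

print(encode_residuals([0, 1, -1, 3, -4]))   # small residuals produce short codes
```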

18.
We present a bit-precise decision procedure for the theory of floating-point arithmetic. The core of our approach is a non-trivial, lattice-theoretic generalisation of the conflict-driven clause learning (CDCL) algorithm in modern SAT solvers to lattice-based abstractions. We use floating-point intervals to reason about the ranges of variables, which allows us to directly handle arithmetic and is more efficient than encoding a formula as a bit-vector as in current floating-point solvers. Interval reasoning alone is incomplete, and we obtain completeness by developing a conflict analysis algorithm that reasons natively about intervals. We have implemented this method in the MathSAT5 SMT solver and evaluated it on assertion-checking problems that bound the values of program variables. Our new technique is faster than a bit-vector encoding approach on 80% of the benchmarks, and is faster by one order of magnitude or more on 60% of the benchmarks. The generalisation of CDCL we propose is widely applicable and can be used to derive abstraction-based SMT solvers for other theories.
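A tiny slice of the interval layer (not the conflict analysis): sound interval addition and multiplication can be obtained without switching rounding modes by widening each bound one ulp outward, for example with math.nextafter (available since Python 3.9):

```python
import math

def down(x): return math.nextafter(x, -math.inf)
def up(x):   return math.nextafter(x,  math.inf)

def iadd(a, b):
    """Interval addition with outward rounding: up to one ulp wider than
    necessary, which keeps the enclosure sound."""
    return (down(a[0] + b[0]), up(a[1] + b[1]))

def imul(a, b):
    ps = [a[0]*b[0], a[0]*b[1], a[1]*b[0], a[1]*b[1]]
    return (down(min(ps)), up(max(ps)))

x = (0.1, 0.2)        # an interval bound on a program variable
y = (-0.3, 0.4)
print(iadd(x, y))     # every real sum of values from x and y lies inside this interval
print(imul(x, y))
```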

19.
Conventional plotting programs adopt techniques such as adaptive sampling to approximate, but not to guarantee, correctness and completeness in graphing functions. Moreover, implicitly defined mathematical relations pose an even greater challenge, as they either cannot be plotted directly or are otherwise likely to be misrepresented. In this paper, we address these problems by investigating interval constraint plotting as an alternative approach that plots a hull of the specified curve. We present some empirical evidence that this hull property can be achieved by the proposed algorithm. Practical experience shows that the hull obtained is the narrowest possible whenever the precision of the underlying floating-point arithmetic is adequate. We describe IASolver, a Java applet (http://www.cs.brandeis.edu/~tim) that serves as a testbed for this idea.

20.
Real-time computation of moment invariants on a PC
Moments and moment invariants are important features for recognizing and inspecting industrial parts, and the values of geometric moments must be computed in real time. This paper describes the efficient computation of the two-dimensional geometric moments of a grey-level image. Although many fast moment-computation algorithms exist, they cannot run in real time on a PC without special hardware: even though they reduce the computational complexity, they still require a large number of floating-point operations. To achieve real-time computation on a PC, the proposed algorithm divides the image into blocks of equal size, computes the moments of each block with fixed-point arithmetic, and then combines them into the moments of the whole image with floating-point arithmetic. This scheme computes the moments exactly rather than approximately; however, overflow within each block is hard to avoid without a transformation, so a modified Hatamian filter is used in the efficient computation of the per-block moments. Experimental results show that the proposed algorithm greatly reduces the number of floating-point operations and substantially speeds up the computation of image moments, making it applicable to the real-time recognition and inspection of complex industrial parts.
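A simplified sketch of the block-wise scheme for moments up to first order: exact integer moments per block, combined into global moments with floating point (the modified Hatamian filter and higher-order moments are omitted):

```python
import numpy as np

def moments_blockwise(image, block=32):
    """Raw moments m00, m10, m01 of a grey-level image, computed per block in
    integer arithmetic and combined with floating point; for order <= 1 the
    shift to global coordinates is just the block offset times m00."""
    h, w = image.shape
    m00 = m10 = m01 = 0.0
    for by in range(0, h, block):
        for bx in range(0, w, block):
            tile = image[by:by+block, bx:bx+block].astype(np.int64)
            ys, xs = np.mgrid[0:tile.shape[0], 0:tile.shape[1]]
            b00 = int(tile.sum())                # exact integer block moments
            b10 = int((xs * tile).sum())
            b01 = int((ys * tile).sum())
            m00 += b00                           # combine with floating point
            m10 += b10 + bx * b00
            m01 += b01 + by * b00
    return m00, m10, m01

img = (np.arange(64 * 64) % 256).reshape(64, 64).astype(np.uint8)
print(moments_blockwise(img))
print(moments_blockwise(img, block=64))          # same values regardless of block size
```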
