Similar articles
 20 similar articles found (search time: 31 ms)
1.
Two main approaches are used nowadays to compute the roots of a zero-dimensional polynomial system. The first involves Gröbner basis computation and applies to any zero-dimensional system, but it is performed with exact arithmetic and large numbers usually appear during the computation. The other approach is based on resultant formulations and can be performed with floating point arithmetic; however, it applies only to generic situations, leading to singular problems in several systems coming from robotics and computational vision, for instance. In this paper, reinvestigating the resultant approach from the linear algebra point of view, we handle the problem of genericity and present a new algorithm for computing the isolated roots of an algebraic variety, not necessarily of dimension zero. We analyse two types of resultant formulations, transform them into eigenvector problems, and describe special linear algebra operations on the matrix pencils in order to reduce the root computation to a non-singular eigenvector problem. This new algorithm, based on pencil decompositions, has good complexity even in non-generic situations and can be executed with floating point arithmetic.
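As a rough illustration of the underlying "roots as eigenvalues" reduction, the sketch below shows only the simplest univariate analogue: the roots of a monic polynomial are the eigenvalues of its companion matrix, computed in floating point. The paper's algorithm works on multivariate resultant matrices and matrix pencils; the cubic used here and the use of NumPy are illustrative assumptions, not the authors' construction.

```python
# Minimal univariate analogue of turning root-finding into an eigenvalue problem:
# the roots of p(x) = x^3 - 6x^2 + 11x - 6 are the eigenvalues of its companion matrix.
import numpy as np

coeffs = [1.0, -6.0, 11.0, -6.0]                       # monic polynomial, highest degree first
n = len(coeffs) - 1

companion = np.zeros((n, n))
companion[1:, :-1] = np.eye(n - 1)                     # sub-diagonal of ones
companion[:, -1] = [-c for c in reversed(coeffs[1:])]  # last column from the coefficients

roots = np.linalg.eigvals(companion)                   # floating point eigenvalue computation
print(sorted(roots.real))                              # approximately [1.0, 2.0, 3.0]
```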

2.
We present an algorithmic solution to the robustness problem in computational geometry, called controlled linear perturbation, and demonstrate it on Minkowski sums of polyhedra. The robustness problem is how to implement real RAM algorithms accurately and efficiently using computer arithmetic. Approximate computation in floating point arithmetic is efficient but can assign incorrect signs to geometric predicates, which can cause combinatorial errors in the algorithm output. We make approximate computation accurate by performing small input perturbations, which we compute using differential calculus. This strategy supports fast, accurate Minkowski sum computation. The only prior robust implementation uses a less efficient algorithm, requires exact algebraic computation, and is far slower based on our extensive testing.

3.
Graphics Processing Units (GPUs), originally developed for computer games, now provide computational power for scientific applications. In this paper, we develop a general purpose Lattice Boltzmann code that runs entirely on a single GPU. The results show that: (1) single precision floating point arithmetic is sufficient for LBM computation in comparison to double precision; (2) the implementation of LBM on GPUs allows us to achieve up to about one billion lattice updates per second using single precision floating point; (3) GPUs provide an inexpensive alternative to large clusters for fluid dynamics prediction.
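For readers unfamiliar with the method, the following is a minimal NumPy sketch of one D2Q9 BGK lattice Boltzmann update run in both single and double precision, so the two results can be compared. It is not the paper's GPU code; the 32×32 grid, the relaxation time tau = 0.6, and the small random density perturbation are arbitrary assumptions for the sake of the example.

```python
import numpy as np

# D2Q9 lattice velocities and weights
c = np.array([[0, 0], [1, 0], [0, 1], [-1, 0], [0, -1],
              [1, 1], [-1, 1], [-1, -1], [1, -1]])
w = np.array([4/9] + [1/9]*4 + [1/36]*4)
tau = 0.6  # relaxation time (illustrative)

def step(f):
    dt = f.dtype
    wd, cd = w.astype(dt), c.astype(dt)
    rho = f.sum(axis=0)                                   # macroscopic density
    u = np.tensordot(cd.T, f, axes=1) / rho               # macroscopic velocity, shape (2, ny, nx)
    cu = np.tensordot(cd, u, axes=([1], [0]))             # c_i . u per direction, shape (9, ny, nx)
    usq = (u ** 2).sum(axis=0)
    feq = wd[:, None, None] * rho * (1 + 3*cu + 4.5*cu**2 - 1.5*usq)
    f = f + (feq - f) / tau                                # BGK collision
    for i, (cx, cy) in enumerate(c):                       # streaming with periodic boundaries
        f[i] = np.roll(np.roll(f[i], cy, axis=0), cx, axis=1)
    return f

rho0 = 1.0 + 0.01 * np.random.default_rng(0).random((32, 32))
for dtype in (np.float32, np.float64):
    f = (w[:, None, None] * rho0).astype(dtype)            # start from a rest-state equilibrium
    for _ in range(1000):
        f = step(f)
    print(dtype, f.sum())                                   # total mass, ideally conserved
```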

4.
Performing computations with a low-bit number representation results in a faster implementation that uses less silicon, and hence allows an algorithm to be implemented in smaller and cheaper processors without loss of performance. We propose a novel formulation to efficiently exploit the low (or non-standard) precision number representation of some computer architectures when computing the solution to constrained LQR problems, such as those that arise in predictive control. The main idea is to include suitably-defined decision variables in the quadratic program, in addition to the states and the inputs, to allow for smaller roundoff errors in the solver. This enables one to trade off the number of bits used for data representation against speed and/or hardware resources, so that smaller numerical errors can be achieved for the same number of bits (same silicon area). Because of data dependencies, the algorithm complexity, in terms of computation time and hardware resources, does not necessarily increase despite the larger number of decision variables. Examples show that a 10-fold reduction in hardware resources is possible compared to using double precision floating point, without loss of closed-loop performance.

5.
A representation for numbers using two computer words is discussed, where the value represented is the ratio of the corresponding integers. This allows for better dynamic range and relative accuracy than single-precision fixed point, yet is less costly than floating point arithmetic. The scheme is easy to implement and particularly well suited for minicomputer applications that call for a great deal of numerical computation. The techniques described have been used to implement a mathematical function subroutine package for a minicomputer as well as a number of application programs in the machine vision and machine manipulation areas.
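The sketch below illustrates the general idea of storing a value as a ratio of two bounded integers, not the paper's exact format: the 16-bit "word" size, the normalization rule, and the crude halving-based rounding are all assumptions made for illustration.

```python
# A ratio of two bounded integers gives better dynamic range and relative accuracy
# than a single fixed-point word, without full floating point (illustrative sketch).
from math import gcd

WORD_MAX = 2**15 - 1   # assume 16-bit signed "words" for illustration

def normalize(num, den):
    """Reduce and, if necessary, round the ratio so both parts fit in one word."""
    if den < 0:
        num, den = -num, -den
    g = gcd(abs(num), den) or 1
    num, den = num // g, den // g
    while abs(num) > WORD_MAX or den > WORD_MAX:     # crude rounding step (assumption)
        num, den = round(num / 2), max(1, round(den / 2))
    return num, den

def rat_add(a, b):
    return normalize(a[0]*b[1] + b[0]*a[1], a[1]*b[1])

def rat_mul(a, b):
    return normalize(a[0]*b[0], a[1]*b[1])

x = normalize(1, 3)            # 1/3 stored exactly as a pair of words
y = normalize(355, 113)        # a classic rational approximation of pi
s = rat_add(x, y)
print(s, s[0] / s[1])          # the stored ratio and its real value
```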

6.
Theoretical models of DNA arithmetic generally use a fixed-radix number system such as binary or ternary, but carry propagation makes parallel computation difficult to achieve. Based on the Adleman-Lipton model, this paper analyses the principles of the residue number system, improves the DNA-strand representation of integers, applies it to DNA arithmetic, and presents an algorithmic model for DNA arithmetic under the residue number system. Since the arithmetic operations (addition, subtraction, multiplication) in a residue number system require no carries between residue digits, the complexity of the computation is reduced and the individual residue digits can be processed in parallel.
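To make the carry-free property concrete, here is a conventional-arithmetic sketch of residue number system (RNS) addition and multiplication followed by Chinese Remainder Theorem reconstruction. The DNA-strand encoding itself is not modelled, and the modulus set {7, 11, 13} is only an example.

```python
# Each residue digit is processed independently -- there is no carry between digits,
# which is what makes digit-level parallelism possible.
from math import prod

MODULI = (7, 11, 13)                       # pairwise coprime; dynamic range = 7*11*13 = 1001

def to_rns(x):
    return tuple(x % m for m in MODULI)

def rns_add(a, b):
    return tuple((ai + bi) % m for ai, bi, m in zip(a, b, MODULI))

def rns_mul(a, b):
    return tuple((ai * bi) % m for ai, bi, m in zip(a, b, MODULI))

def from_rns(r):
    # Chinese Remainder Theorem reconstruction; results are correct modulo 1001
    M = prod(MODULI)
    x = 0
    for ri, m in zip(r, MODULI):
        Mi = M // m
        x += ri * Mi * pow(Mi, -1, m)
    return x % M

a, b = 123, 456
print(from_rns(rns_add(to_rns(a), to_rns(b))))   # 579
print(from_rns(rns_mul(to_rns(a), to_rns(b))))   # 123*456 mod 1001 = 32
```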

7.
Models of DNA arithmetic generally use binary representation, and carry propagation makes parallel computation difficult to achieve. In a residue number system, however, the arithmetic operations (addition, subtraction, multiplication) involve no carries between residue digits, which lowers the complexity of the computation, fully exploits the massive parallelism of DNA computing, and simplifies the practical encoding. Based on the Adleman-Lipton model, this paper analyses the principles of the residue number system, improves the DNA-strand representation of integers for a specific modulus set, applies it to DNA arithmetic, and gives concrete algorithms for parallel DNA arithmetic under that residue number system.

8.
Holmes, W.N. Computer, 1997, 30(3): 65-73
A general-purpose arithmetic standard could give general computation the kind of reliability and stability that the floating-point standard brought to scientific computing. The author describes composite arithmetic as a possible starting point. He describes a formatting scheme for storage and display of exact and inexact numbers and an extended arithmetic with a number format as its basis. He also introduces possibilities for implementing the arithmetic and discusses the interface between the representations and the arithmetic.

9.
In this paper we present a topologically correct and efficient version of the algorithm by Guibas and Stolfi (Algorithmica 7 (1992), pp. 381-413) for the exact computation of Delaunay and power triangulations in two dimensions. The algorithm avoids numerical errors and degeneracies caused by the accumulation of rounding errors in fixed-length floating point arithmetic when constructing these triangulations. Most methods for computing Delaunay and power triangulations involve the calculation of two basic primitives: the INCIRCLE test and the CCW orientation test. Both primitives require the computation of the sign of a determinant. The key to our method is the exact computation of this sign and is based on an algorithm for determining the sign of the sum of a finite set of normalized floating point numbers of fixed mantissa length (machine numbers) exactly. The exact computation of the primitives allows the construction of the correct Delaunay and power triangulations. The method has been implemented and tested for the incremental construction of Delaunay and power triangulations. Tests have been conducted for different distributions of points for which non-exact algorithms may encounter difficulties, for example, slightly perturbed points on a grid or on a circle. Experimental results show that the performance of our implementation is comparable with that of a simple implementation of the incremental algorithm in single precision floating point arithmetic. For random distributions of points the exact algorithm is only 4 times slower than the inexact implementation. The algorithm is easy to implement, robust and portable as long as the input data to the algorithm remains exact.
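The abstract's key primitive is the exact sign of a sum of machine numbers, since the INCIRCLE and CCW determinants are evaluated as such signed sums. The small example below (not the paper's algorithm; the chosen values are arbitrary) shows naive floating point accumulation getting the sign of a sum wrong, while an exact summation gets it right.

```python
import math
from fractions import Fraction

values = [1e16, 1.0, -1e16, -0.5]          # the exact sum is +0.5

naive = 0.0
for v in values:
    naive += v                             # rounding loses the +1.0 against 1e16

exact = sum(Fraction(v) for v in values)   # every double is an exact rational

print(naive)                                # -0.5  -> wrong sign
print(math.fsum(values), float(exact))      # 0.5 0.5 -> correct sign
```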

10.
Matrix computations are both fundamental and ubiquitous in computational science, and as a result, they are frequently used in numerous disciplines of scientific computing and engineering. Due to the high computational complexity of matrix operations, which makes them critical to the performance of a large number of applications, their efficient execution in distributed environments becomes a crucial issue. This work proposes a novel approach for distributing sparse matrix arithmetic operations on computer clusters, aiming to speed up the processing of high-dimensional matrices. The approach focuses on how to split such operations into independent parallel tasks by considering the intrinsic characteristics that distinguish each type of operation and the particular matrices involved. The approach was applied to the most commonly used arithmetic operations between matrices. Its performance was evaluated on a high-dimensional text feature selection task and two real-world datasets. Experimental evaluation showed that the proposed approach helped to significantly reduce the computing times of large-scale matrix operations when compared to serial and multi-thread implementations as well as several linear algebra software libraries.
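As a much-simplified sketch of the splitting idea (not the paper's system), the code below partitions a sparse matrix-vector product into independent row-block tasks that could in principle be shipped to different cluster nodes; here they just run in local worker processes. The matrix size, density, and four-way split are arbitrary assumptions.

```python
import numpy as np
import scipy.sparse as sp
from concurrent.futures import ProcessPoolExecutor

def block_task(block_and_x):
    block, x = block_and_x
    return block @ x                        # each task needs only its rows of A plus x

if __name__ == "__main__":
    A = sp.random(8000, 8000, density=0.001, format="csr", random_state=0)
    x = np.random.default_rng(0).random(8000)

    n_tasks = 4
    bounds = np.linspace(0, A.shape[0], n_tasks + 1, dtype=int)
    tasks = [(A[lo:hi], x) for lo, hi in zip(bounds[:-1], bounds[1:])]

    with ProcessPoolExecutor(max_workers=n_tasks) as pool:
        parts = list(pool.map(block_task, tasks))

    y = np.concatenate(parts)
    print(np.allclose(y, A @ x))            # the split result matches the serial product
```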

11.
This paper discusses the potential impact on computer science education of regarding computation as a property of the natural world, rather than just a property of artifacts specifically created for the purpose of computing. Such a perspective is becoming increasingly important: new computing paradigms based on the natural computational properties of the world are being created, scientific questions are being answered using computational ideas, and philosophical debates on the nature of computation are being formed. The paper considers how computing education might react to these developments, discusses how these ideas can help to define computer science as a discipline, and reflects on our experience at Kent in teaching these subjects.

12.
Hardware designs need to obey constraints of resource utilization, minimum clock frequency, power consumption, computation precision and data range, which are all affected by the data type representation. Floating and fixed-point representations are the most common data types for working with real numbers; fixed-point arithmetic hardware units can improve performance and reduce energy consumption compared to floating point solutions. However, estimating the right bit-lengths for fixed-point is a time-consuming task, since it is a combinatorial optimization problem of minimizing the accumulated arithmetic computation error. This work proposes two evolutionary approaches to accelerate the process of converting algorithms from floating to fixed-point format. The first is based on a classic evolutionary algorithm, and the second introduces a compact genetic algorithm, with theoretical evidence that near-optimal performance in finding a solution is reached. To validate the proposed approaches, they are applied to three compute-intensive algorithms from the mobile robotics domain, where the data error accumulated during execution is influenced by sensor noise and navigation environment characteristics. The proposed compact genetic algorithm accelerates the conversion process by up to 10.2× compared to state-of-the-art methods, while reaching similar bit precision and robustness.
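The sketch below shows the quantity such a bit-length search would optimize, not the paper's genetic algorithm: each variable and intermediate accumulation is quantized to a fixed-point grid with a chosen number of fractional bits, and the accumulated error is measured against a double precision reference. The dot-product workload and the candidate bit widths are arbitrary assumptions.

```python
import numpy as np

def to_fixed(x, frac_bits):
    """Round x to a signed fixed-point grid with 2**-frac_bits resolution."""
    scale = 2 ** frac_bits
    return np.round(x * scale) / scale

def dot_fixed(a, b, frac_bits):
    # quantize the inputs and every intermediate accumulation step
    acc = 0.0
    for ai, bi in zip(to_fixed(a, frac_bits), to_fixed(b, frac_bits)):
        acc = to_fixed(acc + ai * bi, frac_bits)
    return acc

rng = np.random.default_rng(1)
a, b = rng.random(256) - 0.5, rng.random(256) - 0.5
reference = float(a @ b)                    # double precision reference

for frac_bits in (8, 12, 16, 24):
    err = abs(dot_fixed(a, b, frac_bits) - reference)
    print(frac_bits, err)                   # the accumulated error shrinks with more bits
```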

13.
The Fourier transform has long been of great use in simulating mathematical or physical phenomena, especially in signal theory. However, the finite-length representation of numbers introduces round-off errors in computing. Here, developing a new point of view on the topic, we give an evaluation of the total relative mean square error in the computation of direct and fast Fourier transforms using floating point arithmetic. We show that in direct Fourier transforms the output noise-to-signal ratio is equivalent to N or N² according to whether the arithmetic uses rounding or chopping, whereas for fast Fourier transforms it is equivalent to log₂(N) or [log₂(N)]², with N being the number of points of the signal. Good agreement with numerical results is observed.
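A quick numerical illustration of the trend the paper derives (not its analysis): the same transform is computed by a direct O(N²) DFT and by an FFT, both in single precision, and each is compared against a double precision reference. scipy.fft is used here because, unlike numpy.fft, it preserves single-precision inputs; the signal sizes and random data are arbitrary.

```python
import numpy as np
from scipy.fft import fft   # scipy.fft keeps complex64 inputs in single precision

def direct_dft(x):
    n = len(x)
    k = np.arange(n)
    W = np.exp(-2j * np.pi * np.outer(k, k) / n).astype(np.complex64)
    return W @ x            # O(N^2) matrix-vector product accumulated in single precision

rng = np.random.default_rng(0)
for n in (64, 256, 1024, 4096):
    x64 = rng.standard_normal(n) + 1j * rng.standard_normal(n)
    x32 = x64.astype(np.complex64)
    ref = fft(x64)                                        # double precision reference
    err_direct = np.linalg.norm(direct_dft(x32) - ref) / np.linalg.norm(ref)
    err_fft = np.linalg.norm(fft(x32) - ref) / np.linalg.norm(ref)
    print(n, err_direct, err_fft)                         # the direct DFT error grows faster with N
```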

14.
In computational fluid dynamics, the complexity of solving flow fields makes the design of efficient parallel algorithms a key research topic for parallel flow computation. Taking the theory and applications of the lattice Boltzmann method as the background, this work combines parallel computing with the lattice Boltzmann method for simulating fluid flow, and discusses the computational procedure and characteristics of the LBGK D2Q9 model. A distributed parallel algorithm for the LBGK model is designed and implemented, and its parallel performance is analysed and tested on the 自强3000 cluster. The results show that the LBGK D2Q9 lattice Boltzmann model is well suited to large-scale parallel computation, improving both the accuracy and the speed of the computation and enabling the solution of complex flow-field problems.

15.
Facial expression recognition has become an important research topic in artificial intelligence, but conventional convolutional neural networks require large amounts of computing resources, which limits their application. A binarized convolutional neural network can replace the original floating-point multiplications with fast bitwise AND/OR operations, greatly reducing the computational requirements of the algorithm. This paper proposes a facial expression recognition algorithm based on data augmentation and a binarized convolutional neural network; using mean estimation, it achieves a recognition rate of 66.15% on the FER2013 dataset, surpassing some convolutional networks based on floating-point multiplication and making it feasible to port expression recognition algorithms to small devices.
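The core trick can be shown in a few lines (this is not the paper's network): once weights and activations are binarized to {-1, +1}, a dot product reduces to XNOR plus popcount on packed bits instead of floating-point multiply-accumulate. The 64-element vectors below are arbitrary.

```python
import numpy as np

def binary_dot(signs_a, signs_b):
    """Dot product of two {-1,+1} vectors using bitwise operations on packed bits."""
    n = len(signs_a)
    a_bits = int("".join('1' if s > 0 else '0' for s in signs_a), 2)
    b_bits = int("".join('1' if s > 0 else '0' for s in signs_b), 2)
    xnor = ~(a_bits ^ b_bits) & ((1 << n) - 1)    # 1 wherever the signs agree
    matches = bin(xnor).count("1")                # popcount
    return 2 * matches - n                        # +1 per agreement, -1 per disagreement

rng = np.random.default_rng(0)
w = np.sign(rng.standard_normal(64))              # binarized weights
x = np.sign(rng.standard_normal(64))              # binarized activations
print(binary_dot(w, x), int(w @ x))               # both give the same dot product
```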

16.
Automatic performance optimization techniques for SpMV and their application
Sparse matrix-vector multiplication (SpMV) is an important computational kernel in scientific computing that is typically invoked a very large number of times. Because the ratio of floating-point operations to memory accesses in a straightforward SpMV implementation is very low, and because its memory access pattern is highly irregular, its achieved performance is usually poor. By applying a register-blocking algorithm together with a heuristic for choosing the block size, the sparse matrix is partitioned into small dense blocks so that the elements of the vector x held in registers can be reused, improving the performance of this kernel. This paper analyses and summarizes several key optimization techniques adopted in the OSKI package and evaluates them in real applications. The tests show that, in practice, an application must call SpMV on the order of a hundred times before the extra overhead introduced by applying these optimizations is amortized and a net speedup is obtained. On Pentium 4 and AMD Athlon platforms, ten matrices were tested, with average speedups of 1.69 and 1.48, respectively.
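The pure-Python sketch below only illustrates what register blocking buys (the real OSKI kernels are written in C): the matrix is stored as 2×2 dense blocks, so the two x values loaded for a block column are reused across both rows of the block, which is the register-level reuse the abstract refers to. The matrix size, density, and 2×2 block size are arbitrary assumptions.

```python
import numpy as np
import scipy.sparse as sp

def spmv_bsr(A_bsr, x):
    r, c = A_bsr.blocksize                                 # assumed (2, 2) below
    y = np.zeros(A_bsr.shape[0])
    for bi in range(A_bsr.shape[0] // r):                  # loop over block rows
        for p in range(A_bsr.indptr[bi], A_bsr.indptr[bi + 1]):
            bj = A_bsr.indices[p]                          # block column index
            block = A_bsr.data[p]                          # 2x2 dense block
            x0, x1 = x[c * bj], x[c * bj + 1]              # loaded once, reused for both rows
            y[r * bi]     += block[0, 0] * x0 + block[0, 1] * x1
            y[r * bi + 1] += block[1, 0] * x0 + block[1, 1] * x1
    return y

A = sp.random(1000, 1000, density=0.01, format="csr", random_state=0)
x = np.random.default_rng(0).random(1000)
A_bsr = sp.bsr_matrix(A, blocksize=(2, 2))                 # explicit zeros fill partial blocks
print(np.allclose(spmv_bsr(A_bsr, x), A @ x))
```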

17.
Computer course teaching and the study of the intellectual history of computing science
This paper analyses the characteristics of the computing curriculum plan CCC2002 and, from the perspective of the methodology of computer science and technology, discusses the basic idea of teaching computing courses on the basis of their knowledge background. It also examines the relationship between the study of the intellectual history of computing science and knowledge-background-based teaching of computing courses, and describes the significant influence that such teaching has on the reform of computer course instruction in terms of course content design, the organization and delivery of teaching, and the cultivation of students' disciplinary literacy and abilities.

18.
Implementing the two-dimensional fast Fourier transform (FFT) on a conventional processor requires a chip with substantial logic resources. This paper presents a 2D FFT algorithm and its implementation on an FPGA-based custom computing machine (CCM). The 2D FFT implemented on the Splash-2 CCM platform reaches a computation speed of 180 Mflops, at best more than 23 times faster than a Sparc-10 workstation. For an N×N image, this implementation also sustains the O(N² log₂N) floating-point arithmetic operations required by the 2D FFT.

19.
In this paper we discuss the paradigm of real-time processing at the lower levels of computing systems. An arithmetic unit based on this principle, providing addition, multiplication, division and square root operations, is described. The computation operator model is based on the imprecise computation paradigm and defines the concept of adjustable calculation of a function, which manages the delay and the precision of the results as an inherent, parameterized characteristic. The arithmetic function design is based on well-known algorithms and offers progressive improvement of the results. Predictability of the calculations is gained by processing groups of k bits atomically and by using look-up tables. We report an evaluation of the operations in terms of path time, delay and computation error. Finally, we present an example of our real-time architecture working in a realistic context.
Higinio Mora-Mora received the BS degree in computer science engineering and the BS degree in business studies from the University of Alicante, Spain, in 1996 and 1997, respectively. He received the PhD degree in computer science from the University of Alicante in 2003. Since 2002 he has been a member of the faculty of the Computer Technology and Computation Department at the same university, where he is currently an associate professor and a researcher in the Specialized Processors Architecture Laboratory. His research interests include computer arithmetic, the design of floating-point units, and approximation algorithms related to VLSI design. Jerónimo Mora-Pascual received the BS degree in computer science engineering from the University of Valencia (Spain) in 1994. Since 1994 he has been a member of the faculty of the Computer Technology and Computation Department at the University of Alicante, where he is currently an associate professor. He completed his PhD in computer science at the University of Alicante in 2001. He has worked on neural networks and their VLSI implementation. His current research interests include the design of floating-point units and their application to real-time systems and processors for geometric calculus. Juan Manuel García-Chamizo received his BS in physics from the University of Granada (Spain) in 1980 and the PhD degree in computer science from the University of Alicante (Spain) in 1994. He is currently a full professor and director of the Computer Technology and Computation Department at the University of Alicante. His current research interests are computer vision, reconfigurable hardware, biomedical applications, computer networks and architectures, and artificial neural networks. He has directed several research projects related to these areas. He is a member of a Spanish Consulting Commission on Electronics, Computer Science and Communications, and is also a member and editor of several conference program committees. Antonio Jimeno-Morenilla is an associate professor in the Computer Technology and Computation Department at the University of Alicante (Spain). He received his PhD from the University of Alicante in 2003. He concluded his bachelor studies at the EPFL (École Polytechnique Fédérale de Lausanne, Switzerland) and received his BS degree in computer science from the Polytechnical University of Valencia (Spain) in 1994. His research interests include sculptured surface manufacturing, CAD/CAM, computational geometry for design and manufacturing, rapid and virtual prototyping, 3D surface flattening, and high performance computer architectures. He has considerable experience in the development of 3D CAD systems for shoes. In particular, he has been involved in many government- and industry-funded projects, most of them in collaboration with the Spanish Footwear Research Institute (INESCOP).

20.
The theory of parameterized computation and complexity is a recently developed subarea in theoretical computer science. The theory is aimed at practically solving a large number of computational problems that are theoretically intractable. It is based on the observation that many intractable computational problems in practice are associated with a parameter that varies within a small or moderate range. Therefore, by taking advantage of the small parameters, many theoretically intractable problems can be solved effectively and practically. On the other hand, the theory of parameterized computation and complexity has also offered powerful techniques that enable us to derive strong computational lower bounds for many computational problems, thus explaining why certain theoretically tractable problems cannot be solved effectively and practically. The theory of parameterized computation and complexity has found wide applications in areas such as database systems, programming languages, networks, VLSI design, parallel and distributed computing, computational biology, and robotics. This survey gives an overview of the fundamentals, algorithms, techniques, and applications developed in the research of parameterized computation and complexity. We also report the most recent advances and exciting developments, and discuss further research directions in the area.
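A standard textbook illustration of the "exploit the small parameter" idea, not an algorithm specific to this survey, is the bounded search tree for Vertex Cover: the running time is roughly O(2^k · m), exponential only in the parameter k and not in the input size. The small example graph below is arbitrary.

```python
def vertex_cover(edges, k):
    """Return a vertex cover of size <= k as a set, or None if none exists."""
    uncovered = list(edges)
    if not uncovered:
        return set()
    if k == 0:
        return None
    u, v = uncovered[0]
    # Branch: either u or v must be in any cover of the remaining edges.
    for chosen in (u, v):
        rest = [(a, b) for (a, b) in uncovered if chosen not in (a, b)]
        sub = vertex_cover(rest, k - 1)
        if sub is not None:
            return sub | {chosen}
    return None

edges = [(1, 2), (1, 3), (2, 4), (3, 4), (4, 5)]
print(vertex_cover(edges, 2))   # e.g. {1, 4} covers every edge
print(vertex_cover(edges, 1))   # None: no single vertex covers all edges
```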
