Similar literature
 20 similar documents found (search time: 46 ms)
1.
In this contribution we present a new CORDIC architecture, called ‘semi-flat’, which considerably reduces both latency and the amount of hardware. In our semi-flat architecture the first rotations are executed with an unfolded scheme, while the remaining iterations are flattened using a fast redundant addition tree. Detailed comparisons with other major contributions show that our semi-flat redundant CORDIC is 30% faster and occupies 39% less silicon area.

2.
This paper presents a novel modified Coordinate Rotation Digital Computer (CORDIC) architecture that computes sine and cosine values in a single cycle. The proposed method uses an angle-recoding technique to design a modified CORDIC algorithm. Multiple iterations are merged in the modified algorithm by using memory storage for the initial iterations and employing inverse recoding to generate constant multiplication factors for the remaining iterations. The scale factor of the algorithm remains constant, as these factors are independent of the intermediate directions of rotation. In addition, the architecture is mapped onto a single CORDIC computation element that requires only a single cycle to compute the result. The multiplications are implemented using dedicated hardware multipliers in Field Programmable Gate Arrays and customised fixed-point multiplication techniques for Application Specific Integrated Circuits. Implementation results show that the proposed IS-CORDIC architecture is 7.9 times more efficient than basic CORDIC and has a lower area-delay product than current state-of-the-art implementations.

3.
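In modern digital signal processing, the CORDIC algorithm is an important computational method. The algorithm is iterative and simple to compute, and it is widely used for multiplication, division, square roots, and some trigonometric functions. However, CORDIC needs a large number of iteration stages to guarantee precision, so an FPGA implementation still consumes considerable hardware logic resources. To further reduce the resource consumption of a CORDIC implementation, a CORDIC algorithm based on the folding transformation is designed and implemented. Compared with the conventional pipelined CORDIC structure, the folded CORDIC structure consumes far fewer hardware resources. The paper gives the implementation structure of this method along with simulation results.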
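
For orientation, the iterative shift-and-add scheme that the abstract above refers to can be sketched as follows. This is a minimal floating-point sketch of the conventional radix-2 rotation mode, not the folded architecture of the paper; the function name `cordic_sin_cos` and the default of 16 iterations are assumptions made for illustration.

```python
import math


def cordic_sin_cos(theta, iterations=16):
    """Conventional rotation-mode CORDIC (illustrative sketch).

    Returns (cos(theta), sin(theta)) for |theta| <= pi/2.
    """
    # Elementary angles atan(2^-i); hardware would read these from a small table.
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]

    # Aggregate gain of all micro-rotations; starting from 1/gain cancels it.
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))

    x, y, z = 1.0 / gain, 0.0, theta
    for i in range(iterations):
        # Rotate in the direction that drives the residual angle z toward zero.
        d = 1.0 if z >= 0 else -1.0
        # Multiplying by 2^-i models the right shift used in hardware.
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * angles[i]
    return x, y
```

Each extra iteration adds roughly one bit of angular resolution, which is why a high-precision design needs many stages and why the resource cost of a fully pipelined version grows with word length.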

4.
Compensating the scale factor imposes significant computation overhead on the CORDIC algorithm. In this paper we present two algorithms and the corresponding architectures (one for both rotation and vectoring modes and the other only for rotation mode) that perform the scale factor compensation in parallel with the classical CORDIC iterations. With these methods, the scale factor compensation overhead is reduced to a couple of iterations for any word length. The architectures presented have been optimized for conventional and redundant arithmetic.
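
As background to the abstract above, the scale factor in question can be written out for the conventional radix-2 circular CORDIC, assuming every one of the $n$ micro-rotations is actually performed (direction $\pm 1$). The symbol $K_n$ is a notation chosen here, not taken from the paper.

```latex
% Each micro-rotation by \arctan(2^{-i}) stretches the vector by \sqrt{1 + 2^{-2i}},
% so after n iterations the accumulated gain is
K_n = \prod_{i=0}^{n-1} \sqrt{1 + 2^{-2i}},
\qquad
\lim_{n \to \infty} K_n \approx 1.6468 .
```

Because $K_n$ does not depend on the rotation directions, it is a constant that must be divided out once per result; that division is the overhead the compensation methods aim to hide.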

5.
This article presents a low-hardware-complexity design for exponent calculations based on CORDIC. The proposed CORDIC algorithm is designed to overcome the major drawbacks of the conventional CORDIC in hyperbolic mode (scale-factor compensation, limited range of convergence, and optimal selection of micro-rotations). The micro-rotations are identified using leading-one bit detection with unidirectional rotations to eliminate redundant iterations and improve throughput. The efficiency and performance of the processor are independent of the probability of rotation angles being known prior to implementation. The eight-stage pipelined architecture requires an 8 × N ROM in the pre-processing unit for storing the initial coordinate values; it no longer requires a ROM for storing the elementary angles. It provides an area-time efficient design for VLSI implementation for calculating exponents in activation functions and Gaussian Potential Functions (GPF) in neural networks. The proposed CORDIC processor requires 32.68% fewer adders and 72.23% fewer registers than the conventional design. When implemented on a Virtex 2P (2vp50ff1148-6) device, the proposed design dissipates 55.58% less power and has a 45.09% lower total gate count and 16.91% less delay than the Xilinx CORDIC Core. The detailed algorithm design is presented along with the FPGA implementation and the area and time complexities.

6.
A new high-speed redundant CORDIC processor is designed and implemented based on the double rotation method, which turns out to be the two-dimensional (2D) Householder CORDIC, a special case of the generalized Householder CORDIC in the 2D Euclidean vector space. The new processor has the advantages of a regular structure and a high throughput rate. A pipelined structure with radix-2 signed-digit (SD) redundant arithmetic is adopted to reduce the carry-propagation delay of the adders, while the digit-serial structure reduces the hardware cost and I/O requirements. Compared to previously proposed designs, the new CORDIC processor preserves the constant scaling factor, an important merit of the original CORDIC, and thus does not require any complicated division or square-root operations for variable scaling factor calculation. Furthermore, the processor is well suited to VLSI implementation since it does not call for any irregularly inserted correcting iterations. Both the angle calculation mode for computing trigonometric functions and the vector rotation mode for plane rotations are supported. A practical VLSI chip implementation of the fixed-point redundant CORDIC processor using a 0.6 μm standard cell library is given, including a detailed numerical error analysis.

7.
This paper presents an efficient approach for computing the N-point ($N = 2^n$) scaled discrete cosine transform (DCT) with the coordinate rotation digital computer (CORDIC) algorithm. The proposed algorithm is based on an indirect approach for computing the DCT, so that the vector rotations are completely separated from the other operations and placed at the end of the DCT unit. As a result, unlike other CORDIC-based DCT architectures, the proposed scaled DCT architecture does not require scale factor compensation. The number of CORDIC iterations is minimized through the optimal angle recoding method based on the three-value CORDIC algorithm. Although this three-value CORDIC algorithm results in different scale factors for different angles, this does not incur any extra hardware in the proposed scaled DCT architecture.

8.
Evaluation of CORDIC Algorithms for FPGA Design   Total citations: 8, self-citations: 0, citations by others: 8
This paper presents a study of the suitability of full-custom-based CORDIC implementations for FPGA design. Since all these methods are based on redundant arithmetic, the FPGA implementation of the operators required to perform the different CORDIC methods has been evaluated. Efficient mappings onto FPGAs have been performed, leading to the fastest implementations. It is concluded that the redundant arithmetic operators require 4 to 5 times more area than the conventional architecture and that the speed advantages of the full-custom design are lost. This is due to the longer routing delays caused by the increase in fan-out and in the number of nets. Therefore, the redundant-arithmetic-based CORDIC methods are not suitable for FPGA implementation, and the conventional two's complement architecture leads to the best performance.

9.
This paper presents architectural and algorithmic approaches for achieving high-speed CORDIC processing in both operating modes: vectoring and rotation. For vectoring mode CORDIC processing, a modified architecture is proposed, which aims to reduce computation time by overlapping the stages for redundant addition and selection of the rotation direction. In addition, a novel rotation direction prediction scheme for rotation mode CORDIC is presented. The method is based on approximating the binary angle input by a number expressed with the arctangent weights ($\tan^{-1} 2^{-i}$). The implementation is designed to keep the fast timing characteristics of redundant arithmetic in the x/y path of the CORDIC processing. The characteristics are analyzed with respect to latency and area, and compared with those obtained by conventional CORDIC implementations. The results show that the proposed techniques reduce not only the block latency but also the overall computation time. Thus, they achieve higher throughput in pipelining.

10.
This paper focuses on developing an area-efficient hyperbolic Coordinate Rotation Digital Computer (CORDIC) algorithm with improved performance. The algorithm eliminates the need for scale factor calculation within the Range of Convergence (ROC). At the same time, the range of convergence offered is larger than the conventional CORDIC ROC in the hyperbolic rotation mode. As the only hyperbolic rotation algorithm whose sign sequence is always μ = 1, it completes one operation in just 5 iterations. The pipelined implementation therefore has 5 stages, which provides a 50% increase in throughput compared with conventional CORDIC. In terms of area, a 16-bit processor can be realized with 56% fewer full adders than Flat-CORDIC requires. The x and y datapaths are based on the series expansion of hyperbolic functions. The complete algorithm design is detailed along with the pipelined architecture implementation.

11.
In this work we extend the radix-4 CORDIC algorithm to the vectoring mode (the radix-4 CORDIC algorithm was proposed recently by the authors for the rotation mode). The extension to the vectoring mode is not straightforward, since the digit selection function is more complex in the vectoring case than in the rotation case; as in the rotation mode, the scale factor is not constant. Although the radix-4 CORDIC algorithm in vectoring mode has a recurrence similar to that of the radix-4 division algorithm, there are specific issues concerning the vectoring algorithm that demand dedicated study. We present the digit selection for nonredundant and redundant arithmetic (following two different approaches: arithmetic comparisons and table look-up), the computation and compensation of the scale factor, and the implementation of the algorithm (with both types of digit selection) in a word-serial architecture. Compared with conventional radix-2 (redundant and nonredundant) architectures, the radix-4 algorithms give a significant speed-up for angle calculation. For the computation of the magnitude the speed-up is very slight, due to the nonconstant scale factor in the radix-4 algorithm.

12.
A new CORDIC algorithm is presented that can be used for the vectoring mode without requiring constant scaling factors. The algorithm can also be used to carry out a complete transformation from rectangular co-ordinates (x, y) to polar co-ordinates (ρ, θ) in each iteration. The exponent difference of x and y is computed so as to speed up convergence. This new CORDIC algorithm has an average of 0.75n iterations for n-bit input data and can achieve >94.78% 23-bit accuracy. It is also suitable for VLSI chip implementation due to the regular architecture required.
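
For comparison with the abstract above, the conventional vectoring mode (fixed iteration count, constant scale factor) can be sketched as below. This is the textbook baseline, not the paper's algorithm with exponent-difference acceleration; the function name `cordic_to_polar` and the default of 16 iterations are assumptions made for illustration.

```python
import math


def cordic_to_polar(x, y, iterations=16):
    """Conventional vectoring-mode CORDIC (illustrative sketch).

    Converts rectangular (x, y) with x > 0 to polar (rho, theta).
    """
    # Elementary angles atan(2^-i), normally stored in a small table.
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]

    # Constant gain accumulated by the micro-rotations.
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))

    z = 0.0
    for i in range(iterations):
        # Rotate toward the x-axis: the direction is set by the sign of y.
        d = 1.0 if y < 0 else -1.0
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        # z accumulates the total angle that was rotated away.
        z -= d * angles[i]

    # x now holds gain * rho; dividing by the gain is the scale factor compensation.
    return x / gain, z
```

In this baseline every input costs the full n iterations, which is the figure the average of 0.75n iterations quoted above should be read against.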

13.
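To speed up computation of the Hough transform while maintaining precision and a modest memory footprint, the characteristics of the Hough transform and of the CORDIC algorithm are discussed, and the feasibility of implementing the Hough transform with CORDIC is demonstrated. A pipelined CORDIC architecture is studied, and a special-purpose processor based on a mixed-radix CORDIC algorithm is proposed for computing the Hough transform; it reduces the number of iterations by one quarter and markedly improves iteration speed. The method occupies relatively little resource area and has a simple, regular structure, which makes it suitable for FPGA implementation and gives it high practical value.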

14.
This paper presents a modified coordinate rotation digital computer (CORDIC) algorithm implemented in a parallel architecture to generate sine and cosine waveforms. Since CORDIC is a combination of only additions and shifts, it can be efficiently implemented in hardware. The proposed algorithm further approximates the computation of the rotation angle based on a Taylor series in order to reduce the use of the Read-Only-Memory (ROM) table. Area and power are thus reduced due to the partial use of ROM storage. The precision remains the same as in the original algorithm. The modified 32-bit pipelined CORDIC is implemented on a Spartan XC3S500E device using the Xilinx ISE 12.3 design suite. The result is compared with the original CORDIC and the Xilinx coregen in terms of device utilization. It is shown that the logic usage is 31 FFs and 285 FFs fewer than the original design and the Xilinx core, respectively. Compared with the original design, the signal power and total power reductions at a 40 MHz clock are 7.69% and 1.35%, respectively. The bit error remains at the $10^{-8}$ dB level. The SNR of the modified CORDIC is about 2 dB lower, which is acceptable in wave generation.

15.
《Microelectronics Journal》2002,33(1-2):77-89
Despite further refinements of the CORDIC algorithm with the introduction of redundant arithmetic and higher radix CORDIC techniques, in terms of circuit latency and performance, the iterative nature remains the major bottleneck for further optimization. A technique known as flat CORDIC, in which the conventional X and Y recurrences are successively substituted to express the final vectors in terms of the initial vectors, can be used to eliminate the iterative process. In this paper, the techniques devised for the VLSI-efficient implementation of a pipelined 16-bit flat CORDIC based sine–cosine generator are presented. Three possible schemes to pipeline the 16-bit flat CORDIC design are presented to demonstrate the suitability of the proposed method for realizing high-throughput implementations. The 16-bit architecture has been synthesized with a 0.35 μ CMOS process library using Synopsys. Finally, a detailed comparison with other major contributions shows that the flat CORDIC based sine–cosine generators are, on average, 30% faster and occupy some 30% less silicon area.

16.
Quaternions have offered a new paradigm to the signal processing community: to operate directly in a multidimensional domain. We have recently introduced the quaternionic approach to the design and implementation of paraunitary filter banks: four- and eight-channel linear-phase paraunitary filter banks, including those with pairwise-mirror-image symmetric frequency responses. Hypercomplex number theory is used to derive novel lattice structures in which quaternion multipliers replace Givens (planar) rotations. Unlike the conventional algorithms, the proposed computational schemes maintain losslessness regardless of their coefficient quantization. Moreover, the one-regularity conditions can be expressed directly in terms of the quaternion lattice coefficients and are thus easily satisfied even in finite-precision arithmetic. In this paper, a novel approach to realizing a CORDIC-lifting factorization of paraunitary filter banks is presented, which is based on embedding the CORDIC algorithm inside the lifting scheme. Lifting allows multiplications to be made invertible. A 2D CORDIC engine using sparse iterations is reported, along with an asynchronous pipelined processor architecture that uses the embedded CORDIC engine as a processor stage. It should also be noted that the quaternion multiplier lifting scheme based on the 2D CORDIC algorithm is a structural solution for lossless digital signal processing. This approach applies to very practical filter banks, which are essential for image processing, and addresses interesting theoretical questions.

17.
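孙悦  王传伟  康龙飞  叶超  张信 《电子学报》 (Acta Electronica Sinica) 2018,46(12):2978-2984
The conventional CORDIC algorithm has several limitations in high-precision amplitude and phase computation: too many iterations, long latency, and slow phase convergence. To address them, an amplitude and phase compensation algorithm based on best uniform approximation is proposed. Using the vector information obtained after a certain number of conventional CORDIC iterations, the method applies first-order polynomial compensation to amplitude and phase over sub-intervals by best uniform approximation, which effectively improves computational accuracy. Simulation and measurement results show that, when the result of 4 conventional CORDIC iterations is compensated, the relative amplitude error reaches the order of $10^{-5}$ and the absolute phase error reaches the order of $10^{-5}$ degrees, with a maximum output latency of no more than 100 ns. When some dedicated multipliers are used, register consumption falls by 42.5% and lookup-table consumption falls by 15.5%. With this compensation algorithm, each additional CORDIC iteration improves phase accuracy by about one order of magnitude. The proposed compensated CORDIC algorithm therefore outperforms the conventional CORDIC algorithm in iteration count and computational accuracy, and is suitable for high-precision computation.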

18.
The authors present a novel CORDIC-based adaptive algorithm and a pipelined architecture for unnormalised lattice prediction filters. Previously, they presented a CORDIC-based adaptive lattice filtering (CALF) algorithm for normalised lattice filters which features a sign-sign direct (rotation) angle updating scheme (Hu and Liao, 1992). The authors consider a delayed CALF (DeCALF) algorithm in which the rotation angle is updated based on 'delayed' prediction errors. In doing so, they are able to develop a fully pipelined implementation of DeCALF which achieves a B-fold increase in throughput rate, where B is the number of CORDIC iterations (stages). This is accomplished with insignificant hardware overhead and minor degradation of parameter tracking performance.

19.
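Research and Implementation of an Improved CORDIC Algorithm   Total citations: 1, self-citations: 1, citations by others: 0
陈婧 《现代电子技术》 (Modern Electronics Technique) 2011,(24):165-167
The computation speed of CORDIC is an active research topic. To address the slow computation of CORDIC, a zero-skipping idea is adopted: bits of the input phase value that are 0 are skipped, which effectively reduces the number of iterations. Repeated simulation and synthesis with ISE verify that the improved CORDIC algorithm markedly increases computation speed while preserving computational precision; for certain special rotation angles in particular, the result is reached with very few rotations. Finally, the improved CORDIC algorithm is implemented on an FPGA.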

20.
We present the design of parallel architectures for the computation of the Hough transform based on application-specific CORDIC processors. The design of the circular CORDIC in rotation mode is simplified by the a priori knowledge of the angles participating in the transform, and a high throughput is obtained through a pipelined design combined with the use of redundant arithmetic (carry save adders in this paper). Saving area is essential to the design of a pipelined CORDIC and can be achieved by reducing the number of microrotations and/or the size of the coefficient ROM. To reduce the number of microrotations we incorporate radix 4, where possible, or mixed radix (radix 2 and radix 4) in the design of the processor, reducing the number of microrotations by half and by 25%, respectively, with respect to a fully radix 2 implementation. Furthermore, if we place two circular CORDIC rotators in one processor, then the size of the shared coefficient ROM is only 50% of the ROM of a design based on two separate rotators. Finally, we have also incorporated additional microrotations in order to reduce the scale factor to one. The result is a pipelined architecture which can be easily integrated in VLSI technology due to its regularity and modularity. This work was supported by the Ministry of Education and Science (CICYT) of Spain under project TIC-92-0942.
