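To meet the design requirements of high-speed FFT processors in terrestrial digital video broadcasting (DVB-T) systems, a new radix-16/8 mixed-radix algorithm and its implementation architecture are proposed. A single butterfly unit shared between radix-16 and radix-8 operation processes the data sequentially, and the number of multipliers is reduced, which effectively lowers hardware cost. Inside the unit, a cascaded "radix-4 + radix-4/2" pipeline greatly increases computation speed. In addition, a symmetric ping-pong RAM structure improves the butterfly unit's ability to operate continuously, and an improved block-floating-point overflow-prevention mechanism preserves computational precision. Simulation and implementation results show that the design performs well and fully meets the requirements of practical applications.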
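The abstract does not describe how its improved block-floating-point mechanism works, so the following is only a generic sketch of the basic idea: all samples in a block share one exponent, and the whole block is shifted right when the largest sample would leave too little headroom for the next butterfly stage. The function name `block_scale` and the parameters `word_bits` and `guard_bits` are assumptions made for illustration.

```python
def block_scale(block, word_bits, guard_bits=2):
    """Generic block-floating-point scaling (illustrative, not the paper's scheme).

    block      -- list of (re, im) integer pairs sharing one block exponent
    word_bits  -- width of the signed fixed-point word (assumed parameter)
    guard_bits -- headroom kept free for growth in the next stage (assumed)

    Returns the scaled block and the number of right shifts applied, which
    the caller adds to the shared block exponent.
    """
    # Largest magnitude allowed before the next stage could overflow.
    limit = 1 << (word_bits - 1 - guard_bits)
    peak = max(max(abs(re), abs(im)) for re, im in block)

    shift = 0
    while (peak >> shift) >= limit:
        shift += 1

    # Python's >> on negative integers rounds toward minus infinity,
    # which mirrors an arithmetic right shift in hardware.
    scaled = [(re >> shift, im >> shift) for re, im in block]
    return scaled, shift
```

A caller would run this once per FFT stage and accumulate the returned shift counts into a single exponent for the output block.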

We present a novel standard convolutional symbols generator (SCSG) block for a multi-parameter reconfigurable Viterbi decoder to optimize resource consumption and adaptation to multiple parameters. The SCSG block generates all the states and calculates all the possible standard convolutional symbols corresponding to the states using an iterative approach. The architecture of the Viterbi decoder based on the SCSG reduces resource consumption for recalculating the branch metrics and rearranging the correspondence between branch metrics and transition paths. The proposed architecture supports constraint lengths from 3 to 9, code rates of 1/2, 1/3, and 1/4, and fully optional polynomials. The proposed Viterbi decoder has been implemented on the Xilinx XC7VX485T device with a high throughput of about 200 Mbps and a low resource consumption of 162k logic gates.
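As a rough illustration of what "all the possible standard convolutional symbols corresponding to the states" means, the sketch below tabulates the encoder output for every state and input bit, given a constraint length and a list of generator polynomials. It is a direct software enumeration, not the iterative hardware scheme of the SCSG block; the function names and the bit ordering of the shift register are assumptions.

```python
def parity(x: int) -> int:
    """Return the XOR of all bits of x."""
    return bin(x).count("1") & 1


def convolutional_symbols(constraint_length, polynomials):
    """Map (state, input_bit) to the tuple of encoder output bits.

    Assumed register layout: the new input bit sits in the most significant
    position, followed by the (constraint_length - 1)-bit state.
    """
    num_states = 1 << (constraint_length - 1)
    table = {}
    for state in range(num_states):
        for bit in (0, 1):
            register = (bit << (constraint_length - 1)) | state
            # One output bit per generator polynomial (code rate 1/len(polynomials)).
            table[(state, bit)] = tuple(parity(register & g) for g in polynomials)
    return table


# Constraint length 3, rate 1/2, generators 7 and 5 (octal).
symbols = convolutional_symbols(3, [0o7, 0o5])
```

Changing `constraint_length` (3 to 9 in the abstract) or the length of `polynomials` (2, 3, or 4 for rates 1/2, 1/3, 1/4) regenerates the table without touching the rest of a decoder.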

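This paper studies a hardware architecture for high-speed pulse compression implemented on a field-programmable gate array (FPGA). A general-purpose butterfly processing unit is designed so that it can be used in all three stages of pulse compression, which shares hardware and improves resource utilization. A parallel memory structure that supports in-place computation allows one butterfly operation to complete every clock cycle, greatly increasing processing speed. A block-floating-point processing unit combines the high speed of fixed-point arithmetic with the high precision of floating-point arithmetic. Practical verification shows that, at a 100 MHz clock, 4,096-point pulse compression completes in 140 μs.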

陶金  李林森 《微机发展》2006,16(6):116-118
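For the IEEE 802.16a standard, which operates in the 2 GHz to 11 GHz band of wireless metropolitan area networks, a fast and economical FFT processor design is proposed for implementing its OFDM system. The design uses the radix-4 decimation-in-frequency algorithm and a parallel butterfly computation unit, and the twiddle factors are stored in ROM in advance to increase processor speed. A single butterfly unit is used to keep the size of the FFT processor under control. Input and output data share one memory, which further saves hardware resources.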
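The radix-4 decimation-in-frequency step with precomputed twiddle factors can be sketched in software as follows. This is a recursive reference model for checking results, not the single-butterfly hardware schedule of the paper; the names `make_twiddle_rom` and `fft_radix4_dif` are assumptions, and the "ROM" is simply a list computed once and indexed with a stride.

```python
import cmath


def make_twiddle_rom(n):
    """Precompute W_n^k = exp(-2*pi*j*k/n) once, standing in for a ROM."""
    return [cmath.exp(-2j * cmath.pi * k / n) for k in range(n)]


def fft_radix4_dif(x, rom=None, stride=1):
    """Radix-4 decimation-in-frequency FFT for lengths that are powers of 4."""
    n = len(x)
    if n & (n - 1) or n.bit_length() % 2 == 0:
        raise ValueError("length must be a power of 4")
    if n == 1:
        return list(x)
    if rom is None:
        rom = make_twiddle_rom(n)

    q = n // 4
    y0, y1, y2, y3 = [], [], [], []
    for i in range(q):
        a, b, c, d = x[i], x[i + q], x[i + 2 * q], x[i + 3 * q]
        # Radix-4 butterfly followed by twiddle multiplication from the table.
        y0.append(a + b + c + d)
        y1.append((a - 1j * b - c + 1j * d) * rom[i * stride])
        y2.append((a - b + c - d) * rom[2 * i * stride])
        y3.append((a + 1j * b - c - 1j * d) * rom[3 * i * stride])

    # Output r of each butterfly feeds the sub-transform for bins 4k + r.
    out = [0j] * n
    for r, y in enumerate((y0, y1, y2, y3)):
        out[r::4] = fft_radix4_dif(y, rom, stride * 4)
    return out
```

Sharing one table across all recursion levels through `stride` mirrors the idea of a single twiddle ROM sized for the full transform length.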

RSA Encryption Processor Based on Montgomery Modular Multiplication
薛念  潘赟  张宇弘  严晓浪 《计算机工程》2010,36(13):125-127
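A radix-4 Montgomery modular multiplication algorithm and an optimized hardware architecture are proposed, cutting the number of iterations of the conventional radix-2 modular multiplication nearly in half. A high-speed RSA encryption processor is built on this modular multiplication module; it uses a fully parallel modular exponentiation flow in carry-save form, which avoids long carry chains and the conversion of intermediate results. The results show that the design suits both FPGA and ASIC implementation, that a standard 1,024-bit RSA encryption takes only 9,836 cycles, and that the encryption rate improves by more than 50%.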
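To show why a radix-4 formulation roughly halves the iteration count, here is a textbook-style radix-4 Montgomery multiplication in plain integer arithmetic. It consumes two bits of the multiplier per iteration, so a modulus of n bits needs about n/2 iterations instead of n. This is a sketch under assumptions: it does not reproduce the paper's carry-save datapath or any of its optimizations, and the function name and the `digits` parameter are hypothetical.

```python
def montgomery_radix4(a, b, m, digits):
    """Return a * b * 4**(-digits) mod m.

    Requires m odd and a, b < m < 4**digits. Each loop iteration processes
    one radix-4 digit (two bits) of a.
    """
    if m % 2 == 0:
        raise ValueError("modulus must be odd")
    # -m^{-1} mod 4, used to pick q so that s + q*m is divisible by 4.
    m_prime = (-pow(m, -1, 4)) % 4

    s = 0
    for i in range(digits):
        a_i = (a >> (2 * i)) & 3           # next radix-4 digit of a
        s += a_i * b
        q = ((s & 3) * m_prime) & 3        # quotient digit
        s = (s + q * m) >> 2               # exact division by 4

    # With b < m the running sum stays below 2*m, so one subtraction suffices.
    return s - m if s >= m else s
```

For a 1,024-bit modulus, `digits` would be 512, compared with 1,024 iterations for the bit-serial radix-2 form.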

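For the radix-2 FFT algorithm, a block-storage approach is used to optimize the data-exchange network model between the memories and the processors. The optimized network model has only 20 data paths, less than 4% of the original number, and this count does not grow with the number of FFT points. The whole design is implemented on a Virtex XCV800 chip.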

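Based on the IEEE 754 standard, a double-precision floating-point divider is designed. The design comprises six parts: preprocessing, exponent subtraction, mantissa division, normalization, rounding decision, and overflow detection with exception handling. The mantissa division is implemented with five different division algorithms: SRT radix-4 and improved fully parallel radix-4, radix-8, radix-16, and radix-256. The simulation and logic-synthesis results are analyzed; each algorithm has its own advantages and suits different situations. When clock cycle count, delay, and area are considered together, the fully parallel radix-8 and radix-16 algorithms are the better choices.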

We design a 3-bit adder, or radix-8 full adder (FA), in quantum-dot cellular automata (QCA), in which the 3-bit carry propagation path fits in one clock zone. To achieve this, we introduce group majority signals, similar to the group propagate and generate signals of parallel prefix computations, and use them to reformulate the carry expressions of a previous radix-4 FA so that it can be extended to higher-radix FAs. Applying this new interpretation of the carry expressions (via group majority signals) to 3-bit adders means that only a single clock cycle is required for 12-bit (vs. the previous 8-bit) carry propagation across four radix-8 FAs. Based on the proposed radix-8 QCA-FA, we realized 8-, 16-, 32-, 64-, and 128-bit QCA adders in QCADesigner. Comparison of these adders with the previous radix-4 design showed a 9–41% speedup and 57–76% area saving for 16–128-bit adders, respectively. Compared to the best previous radix-2 design at the same bit widths, we observed a 57–172% speedup, but at the cost of a 138–4% area increase, except for the 64- and 128-bit cases, where we also observed 19% and 41% area savings, respectively.

Floating-point fast Fourier transform (FFT) is in wide demand in scientific computing and high-resolution imaging applications because of its wide dynamic range and high processing precision. However, it suffers from high area and energy overhead compared with fixed-point implementations. To address these issues, this paper presents an area- and energy-efficient hybrid architecture for floating-point FFT computations. It minimizes the required arithmetic units and reduces memory usage significantly by combining three different parts. The serial radix-4 butterfly (SR4BF) is used in the single-path delay commutator (SDC) part to minimize the required arithmetic units, achieving a 100% adder utilization ratio. A modified single-path delay feedback (MSDF) architecture is proposed to achieve a tradeoff between arithmetic resources and memory usage by using the new half radix-4 butterfly (HR4BF), which achieves a 50% adder utilization ratio. The intermediate caching buffer is modified accordingly in the MSDF part. By combining the advantages of arithmetic-unit reduction and memory-usage optimization in the different parts, optimized area and power are obtained without throughput loss. Logic synthesis results in a 65 nm CMOS technology show that the energy per FFT is about 331.5 nJ for 1024-point FFT computations at 400 MHz. The total hardware overhead is equivalent to 460k NAND2 gates.

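Based on a radix-4 Montgomery modular multiplication algorithm and an improved pipeline organization, this paper proposes a structurally optimized, scalable modular multiplier architecture. The design uses a word-wise modular multiplication algorithm, which makes it highly scalable: it can perform modular multiplication on operands of any bit length. Because the datapath is a pipeline of multiple processing elements, the design can be configured easily to trade off hardware cost against performance. Analysis shows that the proposed architecture is highly efficient and scalable.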

李明奇  施国琛  黄德胜  邓有光 《软件学报》2001,12(10):1447-1463
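Without proper use of software measurement, software may end up with low quality at high cost. Cohesion is one of the important factors of software quality, along with maintainability, reliability, and reusability. The quality of software modules inevitably affects the quality of the whole system. To design and maintain high-quality software, software project managers and software engineers must apply software cohesion measures to assess and produce high-quality software. A function-oriented cohesion measure is proposed, based on the analysis of live variables and visualized variable spans. The proposed measure is then validated experimentally on a series of real cases and justified theoretically against a set of properties. The result is a well-defined, experimentally validated, and theoretically justified cohesion measure that serves as an indicator of software cohesion strength and thereby helps improve software quality. The measure can easily be embedded in CASE tools to help software engineers ensure software quality.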

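To address the bottleneck that channel decoding poses in high-throughput communication systems, and building on a study of CUDA parallel computing and of parallel implementations of Viterbi decoding, a CUDA-based truncated overlapping Viterbi decoder for convolutional codes is proposed. The algorithm overlaps truncated sub-trellises so that the forward metric computation and the traceback of each sub-trellis run independently and in parallel. Experimental results show that, while the bit-error-rate performance of the decoding algorithm is preserved, throughput improves by a factor of 1.3 to 3.5 over existing implementations and hardware overhead is reduced, so the decoder can be used effectively in practical high-throughput communication systems.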
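The truncated overlapping idea can be illustrated with a sequential Python model: the received stream is cut into blocks, each block is extended by an overlap on both sides, every window is decoded independently from an unknown start state, and only the decisions for the central block are kept. The loop over windows is what a GPU implementation would run in parallel. Everything specific here is an assumption for illustration: the constraint-length-3 code with generators 7 and 5 (octal), hard-decision Hamming metrics, and the `block` and `overlap` sizes.

```python
K = 3                      # constraint length (assumed example code)
POLYS = (0o7, 0o5)         # generator polynomials (assumed example code)
NUM_STATES = 1 << (K - 1)


def parity(x):
    return bin(x).count("1") & 1


def step(state, bit):
    """Return (next_state, output_symbol) for one encoder transition."""
    register = (bit << (K - 1)) | state
    return register >> 1, tuple(parity(register & g) for g in POLYS)


def encode(bits, state=0):
    symbols = []
    for bit in bits:
        state, out = step(state, bit)
        symbols.append(out)
    return symbols


def viterbi_window(symbols):
    """Decode one window; all start states are equally likely (metric 0)."""
    metrics = [0] * NUM_STATES
    history = []
    for received in symbols:
        new_metrics = [None] * NUM_STATES
        choices = [None] * NUM_STATES
        for state in range(NUM_STATES):
            for bit in (0, 1):
                nxt, out = step(state, bit)
                cost = metrics[state] + sum(a != b for a, b in zip(out, received))
                # Add-compare-select: keep the cheaper path into nxt.
                if new_metrics[nxt] is None or cost < new_metrics[nxt]:
                    new_metrics[nxt] = cost
                    choices[nxt] = (state, bit)
        metrics = new_metrics
        history.append(choices)

    # Traceback from the best final state.
    state = min(range(NUM_STATES), key=metrics.__getitem__)
    bits = []
    for choices in reversed(history):
        state, bit = choices[state]
        bits.append(bit)
    return bits[::-1]


def decode_overlapped(symbols, block=16, overlap=8):
    """Decode independent overlapping windows and keep each central block."""
    decoded = []
    for start in range(0, len(symbols), block):
        lo = max(0, start - overlap)
        hi = min(len(symbols), start + block + overlap)
        window_bits = viterbi_window(symbols[lo:hi])
        decoded.extend(window_bits[start - lo:start - lo + block])
    return decoded
```

The overlap gives the path metrics time to converge before the kept region and gives the traceback time to merge after it; how long it must be for a given code and noise level is not something this sketch determines.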

Signal-processing modules working directly on encrypted data provide an elegant solution to application scenarios where valuable signals must be protected from a malicious processing device. In this paper, we investigate the implementation of the discrete Fourier transform (DFT) in the encrypted domain by using the homomorphic properties of the underlying cryptosystem. Several important issues are considered for the direct DFT: the radix-2 and the radix-4 fast Fourier algorithms, including the error analysis and the maximum size of the sequence that can be transformed. We also provide computational complexity analyses and comparisons. The results show that the radix-4 fast Fourier transform is best suited for an encrypted domain implementation in the proposed scenarios.

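Many elliptic-curve cryptographic protocols, such as ECDSA signature verification, need to compute the multi-scalar multiplication kP + lQ. Common multi-scalar multiplication algorithms include Shamir's algorithm and the interleaving algorithm; their efficiency depends mainly on the (joint) Hamming weight of the scalars. However, all of them are based on radix-2 representations: whatever encoding is used, the number of point doublings stays the same, and only the number of point additions (or subtractions) is reduced. A new encoding method based on a radix-4 representation is proposed, together with a multi-scalar multiplication algorithm based on that representation. Point quadrupling replaces point doubling, and the encoding proceeds from left to right (from the most significant digit to the least significant), so encoding and the main computation can be merged, which improves implementation efficiency and saves memory.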
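The structure of a left-to-right radix-4 joint scalar multiplication can be sketched as below. The abstract's new encoding is not specified, so this uses plain unsigned radix-4 digits with a Shamir-style joint table, and it is written against an abstract group given by `add` and `zero` rather than a real elliptic curve; the test group in the usage lines (integers modulo 1009 under addition) is a stand-in chosen only to make the result checkable. All names are assumptions.

```python
def radix4_digits(k):
    """Unsigned radix-4 digits of k, most significant first."""
    digits = []
    while k:
        digits.append(k & 3)
        k >>= 2
    return digits[::-1] or [0]


def multi_scalar_mul(k, l, P, Q, add, zero):
    """Compute k*P + l*Q with one quadrupling and at most one addition per digit."""
    # Joint table of i*P + j*Q for all digit pairs (i, j).
    table = {}
    for i in range(4):
        for j in range(4):
            acc = zero
            for _ in range(i):
                acc = add(acc, P)
            for _ in range(j):
                acc = add(acc, Q)
            table[(i, j)] = acc

    dk, dl = radix4_digits(k), radix4_digits(l)
    width = max(len(dk), len(dl))
    dk = [0] * (width - len(dk)) + dk
    dl = [0] * (width - len(dl)) + dl

    result = zero
    for a, b in zip(dk, dl):           # left to right, digit by digit
        result = add(result, result)   # two doublings stand in for
        result = add(result, result)   # a dedicated quadrupling formula
        if (a, b) != (0, 0):
            result = add(result, table[(a, b)])
    return result


# Stand-in group: integers modulo 1009 under addition.
n = 1009
value = multi_scalar_mul(123456, 98765, 5, 7, lambda x, y: (x + y) % n, 0)
```

Because digits are consumed from the most significant end, they could be produced on the fly instead of stored, which is the merging of encoding and main computation that the abstract mentions.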

The new Mersenne number transform (NMNT) has proved to be an important number theoretic transform (NTT) used for error-free calculation of convolutions and correlations. Its main feature is that for a suitable Mersenne prime number (p), the allowed power-of-two transform lengths can be very large. In this paper, efficient radix-2^2 decimation-in-time and decimation-in-frequency algorithms for fast calculation of the NMNT are developed by deriving the appropriate mathematical relations in a finite field and applying the principles of the twiddle factor unscrambling technique. The proposed algorithms achieve both the regularity of the radix-2 algorithm and the efficiency of the radix-4 algorithm and can be applied to any power-of-two transform length, with simple bit reversal for ordering the output sequence. Consequently, the proposed algorithms possess desirable properties such as simplicity and in-place computation. The validity of the proposed algorithms has been verified through examples involving large integer multiplication and digital filtering applications, using both the NMNT and the developed algorithms.

易清明  谢胜利 《微计算机信息》2007,23(30):221-222,183
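A matrix description of the Viterbi decoding algorithm is proposed. Based on it, a Viterbi decoder with a bidirectional parallel structure is designed; following the regularity of the forward and backward state-transition matrices, these matrices are implemented as state machines, which effectively reduces storage consumption and increases decoding speed. Optimizing the accumulated metrics and the survivor-path information further cuts the amount of stored data by about half. The design was synthesized and verified in a UMC 0.18 μm process; the synthesis results show substantial improvement in both gate count and decoding speed, so the design better meets the low-power and real-time requirements of mobile communication systems.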

The goal of the PMS project is to produce an environment in which the intelligent online assessment of the design for large-scale ADA programming projects is provided. The focus is on the representation of knowledge about the design process for an individual module. Changes in pseudocode complexity are measured in terms of partial metrics. These metrics can take the designer's inferences about the pseudocode program structure into account when assessing module complexity. Next, a model of the stepwise refinement process is given which demonstrates how pseudocode elaboration decisions can be modelled in partial metric terms. Finally, the decisions associated with each refinement step for 17 example refinements taken from the computer science literature are described using partial metrics.

李锐  郑建汉 《微计算机信息》2007,23(32):92-93,115
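Based on an analysis of conventional Viterbi decoders and a revision of the theory of the improved Viterbi algorithm, a new implementation of the Viterbi decoder is proposed. Through in-depth analysis of the path metrics and re-encoding of the traceback information, hardware size is reduced and decoding speed is increased without adding implementation complexity. Simulation waveforms of the decoder are given at the end.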

徐卓  王雪静  叶凡  任俊彦 《计算机工程》2008,34(18):117-119
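A design is proposed for a Viterbi decoder for multiband orthogonal frequency-division multiplexing (MB-OFDM) ultra-wideband communication systems, and the performance of the convolutional/punctured code used by MB-OFDM and of the corresponding Viterbi decoding algorithm is analyzed. To reach the highest data rate the system requires while keeping hardware cost economical, the decoder hardware combines the sliding-window and folding techniques. In low-speed operating modes, some processing units are disabled to save power. The design was verified on a Xilinx Virtex-4 FPGA, with a maximum decoding rate of 432 Mb/s.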

Reliability engineering implemented early in the development process has a significant impact on improving software quality. It can assist in the design of architecture and guide later testing, which is beyond the scope of traditional reliability analysis methods. Structural reliability models work for this, but most of them have been tested only in simulation case studies because of a lack of actual data. Here we use software metrics for reliability modeling which are collected from source codes of post versions. Through the proposed strategy, redundant metric elements are filtered out and the rest are aggregated to represent the module reliability. We further propose a framework to automatically apply the module value and calculate overall reliability by introducing formal methods. The experimental results from an actual project show that reliability analysis at the design and development stage can be close to the validity of analysis at the test stage through reasonable application of metric data. The study also demonstrates that the proposed methods have good applicability.

