Similar articles
1.
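Based on fast convolution algorithms, this paper proposes a parallel structure for linear-phase FIR filters. The structure uses fast convolution to reduce the number of subfilters while giving as many subfilters as possible symmetric coefficients, and then exploits that symmetry to reduce the number of multipliers in the subfilter blocks. For FIR filters with symmetric coefficients, the proposed parallel structure saves a large amount of hardware compared with existing parallel FIR structures, and the savings become more pronounced as the number of taps grows. Specifically, for a 4-parallel 144-tap FIR filter, the proposed structure saves 36 multipliers (14.3%), 23 adders (6.6%), and 35 delay elements (11.0%) compared with the improved Fast FIR Algorithm (FFA) structure.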
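As a rough illustration of the fast FIR algorithm (FFA) that this abstract builds on, the following is a minimal sketch of the textbook 2-parallel FFA, not the structure proposed in the paper. It computes two outputs per block with three half-length subfilters instead of four. The function names `fir` and `ffa2` are hypothetical, and the sketch assumes even-length coefficient and input lists.

```python
def fir(h, x):
    """Direct-form FIR: causal convolution truncated to len(x) outputs."""
    return [sum(h[j] * x[n - j] for j in range(len(h)) if n - j >= 0)
            for n in range(len(x))]


def ffa2(h, x):
    """Textbook 2-parallel FFA (illustrative sketch, hypothetical name).

    Splits h and x into even/odd polyphase parts and uses three
    half-length subfilters: H0, H1 and (H0 + H1).
    """
    assert len(h) % 2 == 0 and len(x) % 2 == 0
    h0, h1 = h[0::2], h[1::2]
    x0, x1 = x[0::2], x[1::2]

    a = fir(h0, x0)                                   # H0 * X0
    b = fir(h1, x1)                                   # H1 * X1
    h_sum = [p + q for p, q in zip(h0, h1)]
    x_sum = [p + q for p, q in zip(x0, x1)]
    c = fir(h_sum, x_sum)                             # (H0 + H1) * (X0 + X1)

    y = []
    for k in range(len(x0)):
        b_delayed = b[k - 1] if k > 0 else 0          # one block delay on H1 * X1
        y.append(a[k] + b_delayed)                    # even output y[2k]
        y.append(c[k] - a[k] - b[k])                  # odd output y[2k + 1]
    return y
```

The subfilter `h_sum` inherits symmetry only for suitable polyphase splits, which is presumably why the abstract emphasizes arranging the decomposition so that as many subfilters as possible stay symmetric.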

2.
Low-Area/Power Parallel FIR Digital Filter Implementations   Cited: 4 in total (self-citations: 0, citations by others: 4)
This paper presents a novel approach for implementing area-efficient parallel (block) finite impulse response (FIR) filters that require less hardware than traditional block FIR filter implementations. Parallel processing is a powerful technique because it can be used to increase the throughput of an FIR filter or reduce its power consumption. However, a traditional block filter implementation causes a linear increase in the hardware cost (area) by a factor of L, the block size. In many design situations, this large hardware penalty cannot be tolerated. Therefore, it is important to design parallel FIR filter structures that require less area than traditional block FIR filtering structures. In this paper, we propose a method to design parallel FIR filter structures that require a less-than-linear increase in the hardware cost. A novel sub-structure sharing technique based on adjacent coefficient sharing is introduced and used to reduce the hardware cost of parallel FIR filters. A novel coefficient quantization technique, referred to as a scalable maximum absolute difference (MAD) quantization process, is introduced and used to produce quantized filters with good spectrum characteristics. By combining fast FIR filtering algorithms, the novel coefficient quantization process, and area reduction techniques, we show that parallel FIR filters can be implemented with up to a 45% reduction in hardware compared to traditional parallel FIR filters.

3.
钟文斌  周志刚  王丽云  李超 《电讯技术》2013,53(9):1223-1228
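To achieve high-speed shaping filtering in E-band communication systems, a high-speed parallel FIR shaping filter is designed on the basis of the existing fast FIR algorithm (FFA), using iterated fast short convolution and a tensor expansion algorithm, and its hardware complexity and latency are analyzed. Floating-point and fixed-point simulation results show that, in hardware, the designed high-speed parallel filter reduces multiplication operations by 21% and delay elements by 1314%, and that quantization with 6 or more fractional bits meets the system's shaping-filter requirements.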

4.
Based on recently published low-complexity parallel finite-impulse response (FIR) filter structures, this paper proposes a new parallel FIR filter structure with less hardware complexity. The subfilters in the previous parallel FIR structures are replaced by a second-stage parallel FIR filter. The proposed 2-stage parallel FIR filter structures can efficiently reduce the number of required multiplications and additions at the expense of delay elements. For a 32-parallel 1152-tap FIR filter, the proposed structure can save 5184 multiplications (67%) and 2612 additions (30%) compared to previous parallel FIR structures, at the expense of 10089 delay elements (-133%). The proposed structures will lead to significant hardware savings because the hardware cost of a delay element is only a small portion of that of a multiplier, not including the savings in the number of additions.

5.
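For high-order FIR filters, the computational load is large, so software-based approaches cannot meet real-time processing requirements. This paper proposes implementing a high-order FIR filter with a fast convolution structure on an FPGA, and derives the formula for decomposing a large-point FFT into a two-dimensional FFT. Based on this theory, a high-order FIR filter with a fast convolution structure built on the one-dimensional-to-two-dimensional FFT is designed in Verilog HDL. Experiments show that the FPGA-based high-order FIR filter offers high precision, high speed, low resource consumption, convenient debugging, and easy integration, and that it meets the requirements of engineering practice.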

6.
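To meet the demand of terahertz wireless communication systems for high-capacity baseband signal processing algorithms, a lower-complexity parallel implementation of the fast finite impulse response (FIR) filter is derived through matrix transformations, starting from the traditional parallel filter implementation obtained directly from polynomial decomposition. On this basis, conversion formulas and implementation architectures for 2-parallel, 4-parallel, and 8-parallel filters are given using a tensor-product representation. A general implementation formula for the 2N-parallel fast FIR filter is then derived, and the complexity before and after optimization is compared. Finally, the derivation and concrete implementation architecture of a 64-parallel fast FIR filter are given, together with a comparison of hardware complexity before and after optimization; the 64-parallel fast FIR filter algorithm consumes fewer resources.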

7.
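FIR notch filters offer linear phase, high precision, and good stability, but demanding notch performance usually requires a high filter order, which greatly increases hardware implementation complexity. Based on a sparse FIR filter design algorithm and the idea of common subexpression elimination, this paper proposes a low-complexity design method for FIR notch filters. The method first uses a sparse filter design algorithm to obtain prototype FIR notch filter coefficients that meet the frequency-domain design requirements. It then encodes them in canonical signed digit (CSD) form and analyzes the sensitivity of all 2-term subexpressions and isolated nonzero digits in the CSD-quantized coefficient set. Finally, it selects suitable 2-term subexpressions or isolated nonzero digits in order of sensitivity and synthesizes the filter coefficient set directly. Simulation results show that FIR notch filters designed with the new algorithm use up to 51% fewer adders than existing low-complexity design methods, which effectively lowers hardware implementation complexity and saves hardware resources.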
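The CSD encoding step mentioned in this abstract can be sketched as follows. This is a minimal illustration of canonical signed digit recoding for an integer coefficient, assuming coefficients have already been scaled to integers. It does not reproduce the paper's sensitivity analysis or subexpression selection, and the name `to_csd` is hypothetical.

```python
def to_csd(n):
    """Return the CSD digits of integer n, least significant digit first.

    Each digit is -1, 0 or +1, and no two adjacent digits are both nonzero.
    """
    digits = []
    while n != 0:
        if n % 2:
            d = 2 - (n % 4)      # +1 if n = 1 (mod 4), -1 if n = 3 (mod 4)
            n -= d               # n is now divisible by 4, so the next digit is 0
        else:
            d = 0
        digits.append(d)
        n //= 2
    return digits


def nonzero_digits(n):
    """Count nonzero CSD digits of n (each is one shifted add/subtract term)."""
    return sum(1 for d in to_csd(n) if d != 0)
```

For example, 7 is 111 in binary (three nonzero bits) but 8 - 1 in CSD (two nonzero digits), so a multiplierless shift-and-add realization of a coefficient generally needs fewer terms in CSD than in plain binary.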

8.
Implementation of High-Order FIR Filters Based on FPGA   Cited: 1 in total (self-citations: 1, citations by others: 0)
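Starting from the basic structural model of the FIR digital filter, this paper analyzes the design approach and concrete implementation methods of FIR filters and describes the distributed arithmetic (DA) structure in detail. Analysis and calculation show that implementing a high-order filter with the ordinary DA structure consumes a large amount of look-up table resources, to the point that the hardware cost becomes unacceptable. To address this shortcoming, improved DA structures are proposed. Two improved DA structures for a 64th-order FIR band-pass filter are simulated with FPGA simulation software, and the results show that the improved DA structures consume far fewer resources, confirming their advantages in reducing computational resources and improving performance.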
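The look-up table cost of the ordinary DA structure described above can be made concrete with a small sketch. It is a bit-serial model of basic distributed arithmetic for one inner product, assuming integer coefficients and two's-complement inputs of a fixed word length. It does not model the improved structures from the paper, and the names `build_lut` and `da_inner_product` are hypothetical.

```python
def build_lut(h):
    """Precompute every partial sum of the coefficients: 2**len(h) entries."""
    n = len(h)
    return [sum(h[k] for k in range(n) if (addr >> k) & 1)
            for addr in range(1 << n)]


def da_inner_product(h, x, bits):
    """Compute sum(h[k] * x[k]) with table look-ups, shifts and adds only.

    Each x[k] must fit in `bits`-bit two's complement.
    """
    lut = build_lut(h)
    acc = 0
    for b in range(bits):
        # Gather bit b of every input sample into one LUT address.
        addr = 0
        for k, xk in enumerate(x):
            addr |= ((xk >> b) & 1) << k
        term = lut[addr] << b
        # The sign bit of two's complement carries negative weight.
        acc += -term if b == bits - 1 else term
    return acc
```

Because the table holds one entry per combination of input bits, a single undivided LUT for an N-tap filter needs 2**N words, which is consistent with the abstract's point that ordinary DA becomes impractical at high orders unless the table is partitioned or otherwise restructured.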

9.
FPGA Implementation Methods for FIR Filters   Cited: 1 in total (self-citations: 1, citations by others: 0)
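To provide a reference for choosing a suitable FPGA implementation structure for FIR filters in practical applications, this paper starts from the basic principles of FIR digital filters and analyzes their structural characteristics. It then describes FPGA-based implementations of serial, parallel, transposed, FFT-based, and distributed-arithmetic FIR filter structures, and analyzes, compares, and optimizes each of them. In particular, the FFT-based FIR filter is compared with the traditional convolution structure through precise numerical calculation. Finally, the paper gives the range of applicability, advantages, and disadvantages of each structure for low-order and high-order FIR filters, and identifies problems that remain to be solved for practical engineering applications.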

10.
尚勇  罗丰  吴顺君 《电子与信息学报》2001,23(11):1041-1045
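In recent years the wavelet transform has been widely applied, and the fast pyramid decomposition algorithm is a powerful tool for its application, playing a role comparable to that of the FFT in Fourier analysis. Fast hardware implementation of the DWT has therefore become an important issue. By introducing a parallel systolic FIR filter structure into the design of wavelet decomposition filters, this paper obtains an implementation structure for wavelet decomposition filters. Because it applies systolic techniques and a parallel structure, the design not only increases computation speed but also raises data throughput and lowers system power consumption.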

11.
This paper proposes a fast convergence algorithm for sparse-tap adaptive finite impulse response (FIR) filters to identify an unknown number of multiple dispersive regions. Coefficient values and tap-positions of the adaptive filter are simultaneously controlled. A constrained region for new-tap positions is selected from equisize subgroups of all possible tap-positions, and it hops from one subgroup to another to cover multiple dispersive regions. The hopping order and the stay time for each subgroup are adaptively determined based on the absolute coefficient values. Simulation results with colored signals show that the proposed algorithm saves more than 80% in the convergence time over the full-tap NLMS and 50% over the STWQ. Tracking capability of the proposed algorithm exhibits its superior characteristics. These characteristics are confirmed by hardware evaluations with a telephone network simulator.

12.
The existing derivations of conventional fast RLS adaptive filters are intrinsically dependent on the shift structure in the input regression vectors. This structure arises when a tapped-delay line (FIR) filter is used as a modeling filter. We show, unlike what original derivations may suggest, that fast fixed-order RLS adaptive algorithms are not limited to FIR filter structures. We show that fast recursions in both explicit and array forms exist for more general data structures, such as orthonormally based models. One of the benefits of working with orthonormal bases is that fewer parameters can be used to model long impulse responses.

13.
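The digital pulse-shaping filter is a type of finite impulse response (FIR) filter. It is conventionally implemented with multiply-accumulate (MAC) operations, that is, by linear convolution of the input signal with the unit impulse response. As the number of shaping-filter coefficients grows, however, this convolution occupies a large number of MAC units and delay units, which strains field-programmable gate array (FPGA) hardware resources, increases system latency, and raises equipment cost. By jointly exploiting the group-delay characteristics of the FIR shaping filter and the properties of baseband digital modulation symbols, this paper proposes a new look-up table (LUT) based FIR filtering method and implements it on an FPGA. Software and hardware simulation results show that the method has advantages in both accuracy and resource utilization.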

14.
Low-Cost Fast VLSI Algorithm for Discrete Fourier Transform   Cited: 1 in total (self-citations: 0, citations by others: 1)
A prime N-length discrete Fourier transform (DFT) can be reformulated into an (N-1)-length complex cyclic convolution and then implemented by systolic array or distributed arithmetic. In this paper, a recently proposed hardware-efficient fast cyclic convolution algorithm is combined with the symmetry properties of the DFT to obtain a new hardware-efficient fast algorithm for small-length DFTs, and WFTA is then used to control the increase in hardware cost when the transform length N is large. Compared with previously proposed low-cost DFT and FFT algorithms with computation complexity of O(log N), the new algorithm can save 30% to 50% of the multipliers on average and improve the average processing speed by a factor of 2 when the DFT length N varies from 20 to 2040. Compared with previous prime-length DFT designs, the proposed design can save a large amount of hardware cost at the same processing speed when the transform length is long. Furthermore, the proposed design offers many more choices of applicable DFT lengths, and the processing speed can be flexibly balanced against the hardware cost.

15.
In this paper, we analyze the register complexity of direct-form and transpose-form structures of FIR filters and explore the possibility of register reuse. We find that the direct-form structure involves significantly fewer registers than the transpose-form structure, and that it allows register reuse in parallel implementation. We further analyze the LUT consumption and other resources of DA-based parallel FIR filter structures, and find that the input delay unit, coefficient storage unit, and partial product generation unit are also shared, in addition to LUT words, when multiple filter outputs are computed in parallel. Based on these findings, we propose a design approach and use it to derive a DA-based architecture for a reconfigurable block-based FIR filter, which is scalable to larger block sizes and higher filter lengths. Interestingly, the number of registers of the proposed structure does not increase proportionately with the block size. This is a major advantage for area-delay and energy efficient high-throughput implementation of reconfigurable FIR filters of higher block sizes. Theoretical comparison shows that the proposed structure for block size 8 and filter length 64 involves 60% more flip-flops, 6.2 times more adders, and 3.5 times more AND-OR gates, and offers 8 times higher throughput. ASIC synthesis results show that the proposed structure for block size 8 and filter length 64 involves 1.8 times less area-delay product (ADP) and energy per sample (EPS) than the existing design, and that it can support 8 times higher throughput. The proposed structure for block sizes 4 and 8 consumes, respectively, 38% and 50% less power than the existing structure for the same throughput rates, on average over different supply voltages.

16.
An FPGA-Based Parallel Pipelined FIR Filter Structure   Cited: 5 in total (self-citations: 0, citations by others: 5)
王黎明  刘贵忠  刘龙  刘洁瑜 《微电子学》2004,34(5):582-585,588
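A pipelined parallel FIR filter structure implemented on FPGA devices is proposed. The resources used by three hardware implementations of the FIR filter are first compared. The implementation method of the pipelined parallel filter structure and its feasibility are then derived theoretically, and the hardware implementation modules are given. Experimental results show that the algorithm realized by this improved filter structure can flexibly handle the trade-off between area and speed constraints in synthesis, allowing the design to be optimized.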

17.
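Building on an analysis of the existing Farrow structure for the Lagrange cubic interpolation filter, this paper uses pipelining and parallel processing to increase the filter's speed. On this basis, a structure based on the fast FIR algorithm is proposed, which reduces the complexity of the parallel Farrow structure. The structure is simulated and implemented on an FPGA. The analysis shows that the improved structure runs faster and consumes less power.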

18.
Digital Down-Conversion Based on a Parallel FIR Filter Structure   Cited: 1 in total (self-citations: 0, citations by others: 1)
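Parallel processing of wideband signals can meet low-power and real-time requirements at the same time, and it has become an active research topic in wideband signal processing. This paper proposes a design method for a parallel fast FIR filter that can be implemented in an FPGA. By applying a new distributed processing algorithm from parallel polyphase processing, the method realizes a multi-stage cascaded filter structure, which improves the flexibility and generality of intermediate-frequency processing and reduces hardware overhead. Simulation results show that the algorithm solves the problem of the original low-pass filter being unable to keep up with the A/D sampling rate, raising the sampling rate to above 320 MHz. The method also implements parallel signal processing in software and avoids dedicated DDC chips, so it is highly general and can readily be ported to other CPLDs.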

19.
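To reduce the computational load of convolutional neural networks (CNNs), this paper introduces a two-dimensional fast filtering algorithm into the CNN and proposes a hardware architecture that accelerates the CNN layer by layer on an FPGA. First, a line-buffer loop control unit is designed using loop transformation; it manages the input feature map data across different convolution windows and different layers, and it starts the convolution acceleration unit through flag signals to achieve layer-by-layer acceleration. Second, a convolution acceleration unit based on the 4-parallel fast filtering algorithm is designed; it is realized as a lower-complexity parallel filtering structure composed of several small filters. The CNN accelerator circuit is tested on the MNIST handwritten digit dataset. The results show that on the Xilinx Kintex-7 platform, with a 100 MHz input clock, the circuit reaches a computing performance of 20.49 GOPS with a recognition rate of 98.68%. Reducing the CNN's computational load thus improves the circuit's computing performance.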

20.
A New Fast Optimization Algorithm for Filters   Cited: 2 in total (self-citations: 0, citations by others: 2)
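A new fast algorithm for FIR filters is proposed that significantly reduces the computational load of long FIR filters in real-time signal processing. The computation is minimized through zero-padding, decomposition, and parameter optimization for sequences of arbitrary length. When filtering discontinuous signals, in half of the cases the computation is further reduced significantly compared with the original frequency-domain fast FIR algorithm. The algorithm has been verified in real-time signal processing.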
