1.

Relaxed Annihilation-Reordering Look-Ahead QRD-RLS Adaptive Filters

Lijun Gao Keshab K. Parhi Jun Ma 《The Journal of VLSI Signal Processing》2003,35(2):119-135

The optimum architecture design and mapping of QRD-RLS adaptive filters can be achieved through filter architecture selection, look-ahead transformations, and hierarchical pipelining/folding transformations. In this paper, a relaxed annihilation-reordering look-ahead (RARL) architecture is proposed and shown to be more power- and area-efficient than the pipelined processing architecture, which was considered the most area-efficient. Filters with this architecture are based on relaxed weight update through filtering approximation, where a filter tap weight is updated upon arrival of every block of input data, and are sped up with the annihilation-reordering look-ahead transformation. As a result of the reduction in computational complexity, this architecture does not change the iteration bound or the filter clock frequency, and it achieves speedup with a linear increase in power consumption, whereas pipelined processing architectures achieve speedup with a quadratic increase in power consumption. Upon hardware mapping, this architecture is also better suited to low-area designs. Two design examples are presented to illustrate mapping optimization using the above transformations. These results are important for mapping designs onto ASICs, FPGAs, or parallel computing machines. The results show significant improvements in throughput, power consumption, and hardware requirements. Mathematical analysis and simulations also show that the RARL QRD-RLS filters suffer no performance degradation in terms of convergence rate.

2.

LMS adaptive filters using distributed arithmetic for high throughput (Cited by: 1; self-citations: 0; citations by others: 1)

Allred D.J. Heejong Yoo Krishnan V. Huang W. Anderson D.V. 《IEEE transactions on circuits and systems. I, Regular papers》2005,52(7):1327-1337

We present a new hardware adaptive filter architecture for very high throughput LMS adaptive filters using distributed arithmetic (DA). DA uses bit-serial operations and look-up tables (LUTs) to implement high throughput filters that use only about one cycle per bit of resolution regardless of filter length. However, building adaptive DA filters requires recalculating the LUTs for each adaptation, which can negate any performance advantages of DA filtering. By using an auxiliary LUT with special addressing, the efficiency and throughput of DA adaptive filters can be of the same order as those of fixed DA filters. In this paper, we discuss a new hardware adaptive filter structure for very high throughput LMS adaptive filters. We describe the development of DA adaptive filters and show that practical implementations of DA adaptive filters have very high throughput relative to multiply-and-accumulate architectures. We also show that DA adaptive filters have a potential area and power consumption advantage over digital signal processing microprocessor architectures.
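The bit-serial LUT idea described in this abstract can be sketched for a fixed (non-adaptive) filter as follows. This is a minimal illustration of textbook distributed arithmetic, not the authors' adaptive architecture with an auxiliary LUT. The function names `build_da_lut` and `da_inner_product` are hypothetical, and integer coefficients and two's-complement samples are assumed.

```python
def build_da_lut(coeffs):
    """Precompute LUT[addr] = sum of coeffs[k] for every bit k set in addr."""
    n = len(coeffs)
    return [
        sum(c for k, c in enumerate(coeffs) if (addr >> k) & 1)
        for addr in range(1 << n)
    ]


def da_inner_product(lut, samples, bits):
    """Compute sum(coeffs[k] * samples[k]) one bit plane at a time.

    `samples` are signed integers that fit in `bits`-bit two's complement.
    Each of the `bits` iterations performs one LUT read and one add,
    regardless of how many taps the filter has.
    """
    mask = (1 << bits) - 1
    words = [s & mask for s in samples]  # two's-complement bit patterns
    acc = 0
    for b in range(bits):
        # Gather bit b of every sample into one LUT address.
        addr = 0
        for k, w in enumerate(words):
            addr |= ((w >> b) & 1) << k
        term = lut[addr] << b
        if b == bits - 1:
            acc -= term  # the sign bit carries negative weight
        else:
            acc += term
    return acc
```

The LUT grows as 2^N with the number of taps N, which is why practical designs split long filters into several smaller tables. That partitioning is omitted here.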

3.

A high-performance VLSI architecture for reconfigurable FIR using distributed arithmetic

《Integration, the VLSI Journal》2016

In this paper, we analyze the register complexity of direct-form and transpose-form structures of FIR filters and explore the possibility of register reuse. We find that the direct-form structure involves significantly fewer registers than the transpose-form structure, and that it allows register reuse in parallel implementation. We further analyze the LUT consumption and other resources of DA-based parallel FIR filter structures, and find that the input delay unit, coefficient storage unit, and partial product generation unit are also shared, besides the LUT words, when multiple filter outputs are computed in parallel. Based on these findings, we propose a design approach and use it to derive a DA-based architecture for a reconfigurable block-based FIR filter, which is scalable to larger block sizes and higher filter lengths. Interestingly, the number of registers of the proposed structure does not increase proportionately with the block size. This is a major advantage for area-delay- and energy-efficient high-throughput implementation of reconfigurable FIR filters with larger block sizes. Theoretical comparison shows that the proposed structure for block size 8 and filter length 64 involves 60% more flip-flops, 6.2 times more adders, and 3.5 times more AND-OR gates, and offers 8 times higher throughput. ASIC synthesis results show that the proposed structure for block size 8 and filter length 64 involves 1.8 times less area-delay product (ADP) and energy per sample (EPS) than the existing design, and can support 8 times higher throughput. The proposed structure for block sizes 4 and 8 consumes, respectively, 38% and 50% less power than the existing structure for the same throughput rates, on average over different supply voltages.

4.

Novel sorting network-based architectures for rank order filters

Chakrabarti C. Li-Yu Wang 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》1994,2(4):502-507

This paper presents two novel sorting network-based architectures for computing high sample rate nonrecursive rank order filters. The proposed architectures consist of significantly fewer comparators than existing sorting network-based architectures that are based on bubble-sort and Batcher's odd-even merge sort. The reduction in the number of comparators is obtained by sorting the columns of the window only once, and by merging the sorted columns in a way such that the number of candidate elements for the output is very small. The number of comparators per output is reduced even further by processing a block of outputs at a time. Block processing procedures that exploit the computational overlap between consecutive windows are developed for both the proposed networks.

5.

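Analysis and design of optimal stack filters under structuring-element constraints (Cited by: 3; self-citations: 1; citations by others: 2)

Sun Shenghe (孙圣和) Wang Wei (王伟) Zhao Chunhui (赵春晖) 《Acta Electronica Sinica (电子学报)》2000,28(2):7-10

Building on threshold decomposition of the signal, and using a structural approach combined with optimal estimation theory, the optimal stack filter under structuring-element constraints is modeled and analyzed. It is proved that this filter is in essence a multistage rank-order filter composed of multiple maximum/minimum filtering units, and an implementation structure based on stacking operations and multistage rank-order operations is given. Finally, using an image-processing application example, the filter is compared with other conventional multistage rank-order filters, demonstrating the effectiveness of the proposed filter.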

6.

Novel Area-Efficient FPGA Architectures for FIR Filtering With Symmetric Signal Extension

Benkrid Abd.S. Benkrid K. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2009,17(5):709-722

This paper presents four novel area-efficient field-programmable gate-array (FPGA) bit-parallel architectures of finite impulse response (FIR) filters that support the technique of symmetric signal extension while processing finite-length signals at their boundaries. The key to this is the use of variable-depth shift registers, which are efficiently implemented in Xilinx FPGAs in the form of shift register logic (SRL) components. Comparisons with the conventional architecture of an FIR filter with symmetric boundary processing show considerable area savings, especially with long-tap filters. For instance, our architecture implementation of the 8-tap low Daubechies-8 FIR filter achieves a ~30% reduction in the area requirement (in terms of slices) compared to the conventional architecture while maintaining the same throughput. Two of the above-cited novel architectures are dedicated to the special case of symmetric FIR filters. The first architecture is highly area-efficient but requires a clock frequency doubler. While this reduces the overall processing speed (to a maximum of 2), it does maintain a high throughput. Moreover, this speed penalty is cancelled in bi-phase filters, which are widely used in multirate architectures (e.g., wavelets). Our second symmetric FIR filter architecture saves less logic than the first architecture (e.g., 10% with the 9-tap low Biorthogonal 9&7 symmetric filter instead of 37% with the first architecture) but overcomes its speed penalty, as it matches the throughput of the conventional architecture.
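As a software-level illustration of symmetric signal extension at the boundaries of a finite-length signal, the sketch below mirrors the signal about its end samples before filtering. It assumes whole-sample symmetry (the edge sample is not repeated) and an odd-length filter. The abstract does not say which variant the hardware uses, and the names `symmetric_extend` and `fir_same` are hypothetical.

```python
def symmetric_extend(x, pad):
    """Mirror x about its first and last samples (whole-sample symmetry).

    Example: [1, 2, 3, 4] with pad=2 becomes [3, 2, 1, 2, 3, 4, 3, 2].
    """
    if pad > len(x) - 1:
        raise ValueError("pad must be at most len(x) - 1")
    left = x[pad:0:-1]            # x[pad], ..., x[1]
    right = x[-2:-pad - 2:-1]     # x[n-2], ..., x[n-1-pad]
    return left + x + right


def fir_same(x, h):
    """Filter x with an odd-length FIR h; the output has the same length as x."""
    if len(h) % 2 == 0:
        raise ValueError("this sketch assumes an odd-length filter")
    ext = symmetric_extend(x, len(h) // 2)
    taps = len(h)
    # y[n] = sum_k h[k] * ext[n + taps - 1 - k]  (convolution, centred on x[n])
    return [
        sum(h[k] * ext[n + taps - 1 - k] for k in range(taps))
        for n in range(len(x))
    ]
```

In software the extension costs extra memory for the mirrored samples. The architectures in the abstract instead obtain the mirrored samples from variable-depth shift registers.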

7.

Design techniques for silicon compiler implementations of high-speed FIR digital filters

Hawley R.A. Wong B.C. Thu-Ji Lin Laskowski J. Samueli H. 《Solid-State Circuits, IEEE Journal of》1996,31(5):656-667

Architecture design techniques for implementing both single-rate and multirate high throughput finite impulse response (FIR) digital filters are explored, with an emphasis on those which are applicable to automated integrated circuit layout techniques. Various parallel architectures are examined based on the criteria of achievable throughput versus hardware complexity. Well-known techniques for reduced complexity and computation time are briefly summarized, followed by the introduction of several new techniques which offer further gains in both throughput and circuitry reduction. An architecture for mirror-symmetric polyphase filter banks is derived which exploits the coefficient symmetry between multiple filters to reduce hardware. Finally, the evolution of a silicon compiler which utilizes all of these techniques is presented, and results are given for compiled filters along with comparisons to other compiled and custom FIR filter chips.

8.

Second-order OTA-C filters derived from Nawrocki-Klein biquad (Cited by: 3; self-citations: 0; citations by others: 3)

Yichuang Sun 《Electronics letters》1998,34(15):1449-1450

Based on the Nawrocki and Klein (NK) biquad and using different input and output techniques, many new filter structures and functions are developed from both voltage- and current-mode viewpoints. For example, it is found that the basic five operational transconductance amplifier (OTA) structure has the LP, BP, HP, BS and AP functions without the use of capacitor injection and component matching, a feature shared by no other existing two-integrator-loop OTA-capacitor (OTA-C) circuit, including the original NK biquad. Various BS, LPN, HPN, AP and universal architectures are obtained, which require fewer OTAs and may be more useful than the original biquad. This research shows that the NK biquad is one of the most important second-order OTA-C filters, at least from the viewpoint of filter architectures and functions, and thus should receive attention in continuous-time filter design.

9.

Efficient VLSI array processing structures for adaptive quadratic digital filters

Yuang Lou Chrysostomos L. Nikias Anastasios N. Venetsanopoulos 《Circuits, Systems, and Signal Processing》1988,7(2):253-273

In this paper we introduce a class of efficient architectures for adaptive quadratic digital filters. These architectures are based on the LMS algorithm and use the rank compressed lower-upper (LU) triangular decomposition method. These architectures exhibit high parallelism as well as great modularity and regularity. We also consider affiliated VLSI array processing structures and compare these in terms of hardware cost and data throughput delay. For comparison purposes, the distributed arithmetic structures of adaptive quadratic filters are also included in the paper. Finally, the convergence performance of the adaptive quadratic filters is tested via benchmark simulation examples. This work was supported by National Science Foundation Grant ECS-8601307.

10.

Stack filters and selection probabilities (Cited by: 3; self-citations: 0; citations by others: 3)

Prasad M.K. Lee Y.H. 《Signal Processing, IEEE Transactions on》1994,42(10):2628-2643

Based on the fact that the output of a given stack filter can be determined if the ranks of the samples in the input window are known and that this output always equals one of the samples in the input window, rank and sample selection probabilities are defined. The output distribution of the stack filter of size N with independent identically distributed (i.i.d.) inputs can be expressed as a weighted sum of the ith, i=1, 2, ..., N order statistics, where the rank selection probabilities are the weights. The sample selection probabilities equal the impulse response coefficients of a finite impulse response (FIR) filter whose output spectrum is closest, of all linear filters, to that of the stack filter for i.i.d. Gaussian inputs. Results are also derived for correlated inputs. Robustness and detail preserving properties of stack filters are related to the selection probabilities. Other statistical properties are also derived. Finally, methods to compute the selection probabilities of the stack filter from its positive Boolean function and the selection probabilities of the weighted median filter from its weights are given in detail.

11.

New high-order filter structures using only single-ended-input OTAs and grounded capacitors

Chun-Ming Chang Al-Hashimi B.M. Yichuang Sun Ross J.N. 《Circuits and Systems II: Express Briefs, IEEE Transactions on》2004,51(9):458-463

Despite the wealth of literature on operational transconductance amplifier (OTA)-C filters, the synthesis of high-order filter characteristics is still an active topic. In this paper, the realization of voltage transfer functions based on canonical current-mode follow-the-leader-feedback (FLF) OTA-C structures is investigated. Two new structures are presented, which use only single-ended-input OTAs and grounded capacitors. The first structure has a single voltage input and multiple voltage outputs taken from different nodes, which enables it to provide simultaneous outputs of different filter functions. The second structure has a single voltage output and a single voltage input distributed to different circuit nodes for a universal realization. The authors not only propose such filter structures, but also show how analytical synthesis can be used to produce filter circuits that have fewer active elements than recently reported voltage-mode structures based on differential-input OTAs. This is another attractive feature from the point of view of chip area and power consumption. Simulation results verifying the theoretical analysis of the proposed filter structure are included.

12.

Field programmable analog arrays for implementation of generalized nth‐order operational transconductance amplifier‐C elliptic filters

Maha S. Diab Soliman A. Mahmoud 《ETRI Journal》2020,42(4):534-548

This study presents a new architecture for a field programmable analog array (FPAA) for use in low‐frequency applications, and a generalized circuit realization method for the implementation of nth‐order elliptic filters. The proposed designs of both the FPAA and elliptic filters are based on the operational transconductance amplifier (OTA) used in implementing OTA‐C filters for biopotential signal processing. The proposed FPAA architecture has a flexible, expandable structure with direct connections between configurable analog blocks (CABs) that eliminates the use of switches. The generalized elliptic filter circuit realization provides a simplified, direct synthetic method for an OTA‐C symmetric balanced structure for even/odd‐nth‐order low‐pass filters (LPFs) and notch filters with a minimum number of components, using grounded capacitors. The filters are mapped on the FPAA, and both architectures are validated with simulations in LTspice using 90‐nm complementary metal‐oxide semiconductor (CMOS) technology. Both the proposed FPAA and the generalized synthetic method for the filters achieve simple, flexible, low‐power designs for implementation of biopotential signal processing systems.

13.

Distributed Memory Parallel Architecture Based on Modular Linear Arrays for 2-D Separable Transforms Computation

José Fridman Elias S. Manolakos 《The Journal of VLSI Signal Processing》2001,28(3):187-203

A framework for systematically mapping 2-dimensional (2-D) separable transforms onto a parallel architecture consisting of fully pipelined linear array stages is presented. The resulting model architecture is characterized by its generality, high degree of modularity, high throughput, and the exclusive use of distributed memory and control. There is no central shared memory block to facilitate the transposition of intermediate results, as is commonly the case in row-column image processing architectures. Avoiding shared central memory has positive implications for speed, area, power dissipation and scalability of the architecture. The architecture presented here may be used to realize any separable 2-D transform by only changing the coefficients stored in the processing elements. Pipelined linear arrays for computing the 2-D Discrete Fourier Transform and 2-D separable convolution are presented as examples and their performance is evaluated.
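For orientation, the conventional row-column evaluation of a separable 2-D transform, which this abstract contrasts with its transposition-free architecture, can be sketched as below for the 2-D DFT. This is a plain software illustration with an explicit transpose step, not the distributed-memory design of the paper. The names `dft_1d`, `transpose` and `dft_2d_row_column` are hypothetical.

```python
import cmath


def dft_1d(v):
    """Direct O(n^2) 1-D discrete Fourier transform of a sequence."""
    n = len(v)
    return [
        sum(v[k] * cmath.exp(-2j * cmath.pi * u * k / n) for k in range(n))
        for u in range(n)
    ]


def transpose(m):
    """Swap rows and columns of a rectangular list-of-lists matrix."""
    return [list(col) for col in zip(*m)]


def dft_2d_row_column(img):
    """2-D DFT computed as 1-D DFTs over rows, then over columns.

    The intermediate transpose is the step that conventional hardware
    implements with a shared central memory block.
    """
    after_rows = [dft_1d(row) for row in img]
    after_cols = [dft_1d(col) for col in transpose(after_rows)]
    return transpose(after_cols)
```

Swapping `dft_1d` for another 1-D kernel gives a different separable transform. This loosely mirrors the abstract's remark that only the stored coefficients need to change.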

14.

Low-Area/Power Parallel FIR Digital Filter Implementations (Cited by: 4; self-citations: 0; citations by others: 4)

David A. Parker Keshab K. Parhi 《The Journal of VLSI Signal Processing》1997,17(1):75-92

This paper presents a novel approach for implementing area-efficient parallel (block) finite impulse response (FIR) filters that require less hardware than traditional block FIR filter implementations. Parallel processing is a powerful technique because it can be used to increase the throughput of an FIR filter or reduce the power consumption of an FIR filter. However, a traditional block filter implementation causes a linear increase in the hardware cost (area) by a factor of L, the block size. In many design situations, this large hardware penalty cannot be tolerated. Therefore, it is important to design parallel FIR filter structures that require less area than traditional block FIR filtering structures. In this paper, we propose a method to design parallel FIR filter structures that require a less-than-linear increase in the hardware cost. A novel sub-structure sharing technique based on adjacent coefficient sharing is introduced and used to reduce the hardware cost of parallel FIR filters. A novel coefficient quantization technique, referred to as a scalable maximum absolute difference (MAD) quantization process, is introduced and used to produce quantized filters with good spectrum characteristics. By using a combination of fast FIR filtering algorithms, a novel coefficient quantization process and area reduction techniques, we show that parallel FIR filters can be implemented with up to a 45% reduction in hardware compared to traditional parallel FIR filters.
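The "fast FIR filtering algorithms" mentioned in this abstract can be illustrated with the standard 2-parallel decomposition. It produces two outputs per block using three half-length subfilters instead of the four a traditional block filter needs. The sketch below covers only that well-known identity, not the authors' coefficient sharing or MAD quantization. The names `conv` and `fast_fir_2parallel` are hypothetical, and even-length inputs are assumed.

```python
def conv(a, b):
    """Plain linear convolution of two sequences."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def fast_fir_2parallel(x, h):
    """Convolve x with h using three half-length subfilters.

    With polyphase parts X0, X1 (even/odd samples) and H0, H1:
      Y0 = H0*X0 + z^-1 * (H1*X1)          (even-indexed outputs)
      Y1 = (H0+H1)*(X0+X1) - H0*X0 - H1*X1 (odd-indexed outputs)
    """
    if len(x) % 2 or len(h) % 2:
        raise ValueError("this sketch assumes even-length x and h")
    x0, x1 = x[0::2], x[1::2]
    h0, h1 = h[0::2], h[1::2]
    a = conv(h0, x0)                                   # H0*X0
    b = conv(h1, x1)                                   # H1*X1
    c = conv([p + q for p, q in zip(h0, h1)],
             [p + q for p, q in zip(x0, x1)])          # (H0+H1)*(X0+X1)
    y1 = [ci - ai - bi for ci, ai, bi in zip(c, a, b)]
    y0 = a + [0]
    for i, bi in enumerate(b):
        y0[i + 1] += bi                                # one block delay
    y = []
    for even, odd in zip(y0, y1 + [0]):
        y += [even, odd]
    return y[:len(x) + len(h) - 1]
```

The saving comes from reusing `a` and `b` in both output phases, at the cost of a few extra pre- and post-additions.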

15.

A Practical Parallel Architecture for Stack Filters

María J. Avedillo José M. Quintana Hamid El Alami Antonio Jiménez-Calderón 《The Journal of VLSI Signal Processing》2004,38(2):91-100

Stack filters belong to the class of non-linear filters and include the well-known median filter, weighted median filters, order statistic filters and weighted order statistic filters. Any stack filter can be implemented by using the parallel threshold decomposition architecture, which allows implementing their non-linear processing by means of a collection of identical binary filters (Boolean logic circuits). Although it is conceptually simple and useful for studying the filter properties, this architecture is not practical for direct hardware implementation because as many as (M – 1) binary filters are required for an M-valued input signal, and M is large in many applications. In this paper we introduce a new parallel architecture for stack filter implementations. The complexity is now proportional to the window width L of the filter, instead of to M. In most applications L is much smaller than M, which translates into efficient hardware implementations. The attractive characteristic of ease of design exhibited by the threshold decomposition architecture is kept. In fact, for a given stack filter, the same binary filter is required both in the conventional implementation and in the proposed one. The key concept supporting the new architecture is a modified decomposition scheme which generates L binary signals for a multi-valued input. As an application example, a complex WOS filter is designed and prototyped in an FPGA.
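The conventional threshold decomposition architecture that this abstract takes as its starting point can be sketched in a few lines. The input is split into M − 1 binary signals, the same binary filter is applied to each, and the binary outputs are added. This shows only the classical scheme, not the proposed L-signal decomposition. The function names are hypothetical, and inputs are assumed to be integers in 0..M−1.

```python
def threshold_decompose(x, m):
    """Split an M-valued signal into M-1 binary signals, one per threshold."""
    return [[1 if v >= t else 0 for v in x] for t in range(1, m)]


def binary_median(window):
    """Majority vote: the positive Boolean function of the median filter."""
    return 1 if 2 * sum(window) > len(window) else 0


def stack_filter(x, m, width, binary_filter):
    """Apply one binary filter at every threshold level and add the results.

    Only windows that fit entirely inside x are evaluated, so the output
    has len(x) - width + 1 samples.
    """
    out = [0] * (len(x) - width + 1)
    for level in threshold_decompose(x, m):
        for n in range(len(out)):
            out[n] += binary_filter(level[n:n + width])
    return out
```

For example, `stack_filter([3, 0, 2, 7, 1, 5, 5], 8, 3, binary_median)` reproduces the running median of width 3. The loop over `m - 1` levels is the cost that grows with M.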

16.

Efficient pipelined flow classification for intelligent data processing in IoT

Seyed Navid Mousavi Fengping Chen Mahdi Abbasi Mohammad R. Khosravi Milad Rafiee 《Digital Communications & Networks》2022,8(4):561-575

Packet classification is a fundamental process in provisioning security and quality of service for many intelligent network-embedded systems running in the Internet of Things (IoT). In recent years, researchers have tried to develop hardware-based solutions for the classification of Internet packets. Due to their higher throughput and shorter delays, these solutions are considered a major key to improving the quality of services. Most of these efforts have attempted to implement a software algorithm on the FPGA to reduce the processing time and enhance the throughput. The proposed architectures, however, cannot reach a compromise among power consumption, memory usage, and throughput rate. In view of this, the architecture proposed in this paper contains a pipeline-based micro-core that is used in network processors to classify packets. To this end, three architectures have been implemented using the proposed micro-core. The first architecture performs parallel classification based on header fields. The second one classifies packets in a serial manner. The last architecture is the pipeline-based classifier, which can increase performance by nine times. The proposed architectures have been implemented on an FPGA chip. The results are indicative of a reduction in memory usage as well as an increase in speedup and throughput. The architecture has a power consumption of 1.294 W, and its throughput at a frequency of 233 MHz exceeds 147 Gbps.

17.

Parallel interleaver design and VLSI architecture for low-latency MAP turbo decoders

Dobkin R. Peleg M. Ginosar R. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2005,13(4):427-438

Standard VLSI implementations of turbo decoding require substantial memory and incur a long latency, which cannot be tolerated in some applications. A parallel VLSI architecture for low-latency turbo decoding, comprising multiple single-input single-output (SISO) elements, operating jointly on one turbo-coded block, is presented and compared to sequential architectures. A parallel interleaver is essential to process multiple concurrent SISO outputs. A novel parallel interleaver and an algorithm for its design are presented, achieving the same error correction performance as the standard architecture. Latency is reduced up to 20 times and throughput for large blocks is increased up to six-fold relative to sequential decoders, using the same silicon area, and achieving a very high coding gain. The parallel architecture scales favorably: latency and throughput are improved with increased block size and chip area.

18.

A Tutorial on Multiplierless Design of FIR Filters: Algorithms and Architectures

Levent Aksoy Paulo Flores José Monteiro 《Circuits, Systems, and Signal Processing》2014,33(6):1689-1719

Finite impulse response (FIR) filtering is a ubiquitous operation in digital signal processing systems and is generally implemented in full custom circuits due to high-speed and low-power design requirements. The complexity of an FIR filter is dominated by the multiplication of a large number of filter coefficients by the filter input or its time-shifted versions. Over the years, many high-level synthesis algorithms and filter architectures have been introduced in order to design FIR filters efficiently. This article reviews how constant multiplications can be designed using shifts and adders/subtractors that are maximally shared through a high-level synthesis algorithm based on some optimization criteria. It also presents different forms of FIR filters, namely, direct, transposed, and hybrid and shows how constant multiplications in each filter form can be realized under a shift-adds architecture. More importantly, it explores the impact of the multiplierless realization of each filter form on area, delay, and power dissipation of both custom (ASIC) and reconfigurable (FPGA) circuits by carrying out experiments with different bitwidths of filter input, design libraries, reconfigurable target devices, and optimization criteria in high-level synthesis algorithms.
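A single constant multiplication realized with shifts and adders/subtractors can be sketched using the canonical signed digit (CSD) representation, one common starting point for multiplierless design. The sketch handles one constant in isolation and does not attempt the sharing of partial terms across coefficients that the article reviews. The names `csd_digits` and `shift_add_multiply` are hypothetical.

```python
def csd_digits(c):
    """Return the signed digits of integer c, least significant first.

    Each digit is -1, 0 or +1, and no two adjacent digits are nonzero,
    which keeps the number of add/subtract operations small.
    """
    digits = []
    while c != 0:
        if c & 1:
            d = 2 - (c & 3)  # +1 when c % 4 == 1, -1 when c % 4 == 3
            c -= d
        else:
            d = 0
        digits.append(d)
        c >>= 1
    return digits


def shift_add_multiply(x, c):
    """Compute x * c using only shifts, additions and subtractions."""
    acc = 0
    for shift, d in enumerate(csd_digits(c)):
        if d == 1:
            acc += x << shift
        elif d == -1:
            acc -= x << shift
    return acc
```

For instance, multiplying by 7 becomes `(x << 3) - x`, one subtraction instead of the two additions that the plain binary form 4 + 2 + 1 would need.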

19.

Stack filter design: a structural approach (Cited by: 1; self-citations: 0; citations by others: 1)

Lin Yin 《Signal Processing, IEEE Transactions on》1995,43(4):831-840

A new approach is developed for finding the optimal stack filter that minimizes noise subject to constraints on its structural behavior. Based on the output moments of stack filters, it is proven that the optimal stack filter is a combination of the median filter, which has the same window width as the stack filter, and a set of maximum and minimum filters, which are attributed to the structural constraints. Design examples for 1-D signal processing and image processing are provided.

20.

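An nth-order multiple-input single-output current-mode filter based on MOCCII (Cited by: 16; self-citations: 0; citations by others: 16)

Wang Chunhua (王春华) Shen Guangdi (沈光地) 《Journal on Communications (通信学报)》2004,25(2):138-143

A structurally simple nth-order multiple-input single-output current-mode filter circuit based on the MOCCII is proposed. The circuit contains n 1 active devices, n capacitors, and n 2 resistors, and can realize nth-order low-pass, band-pass, high-pass, and band-stop current-mode filters. Because different filter functions are obtained solely by changing the number of external input current signals and the way they are connected, while the internal structure and component count of the circuit remain unchanged, the circuit is well suited to monolithic integration. PSPICE simulations of the filter are presented.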