
1.

RNS-FPL merged architectures for orthogonal DWT

Ramirez J. Garcia A. Fernandez P.G. Patrilla L. Lloris A. 《Electronics letters》2000,36(14):1198-1199

Novel, regular, compact and easily scalable residue number system (RNS) field-programmable logic (FPL) merged architectures for the orthogonal 1D discrete wavelet transform (DWT) and 1D inverse discrete wavelet transform (IDWT) are presented. These structures halve the number of look-up tables (LUTs) required per octave, providing a sustained throughput independent of the input data and filter coefficient precision. They are suitable for use as the core of 2D DWT processors for high data rate image processing applications.

2.

Analysis of the convergence behavior of adaptive distributed-arithmetic echo cancellers

Cherubini G. 《Communications, IEEE Transactions on》1993,41(11):1703-1714

Adaptive distributed-arithmetic echo cancellers are well suited for full-duplex high-speed data transmission. They allow a simpler implementation than adaptive linear transversal filters, since multiplications are replaced by table look-up and shift-and-add operations. Various tradeoffs between the number of operations and the number of memory locations of the look-up tables can be achieved by segmenting the echo canceller delay line into sections of shorter length. Adaptivity is achieved by a decision-directed stochastic gradient algorithm to adjust the contents of the look-up tables. The author adopts the mean-square error criterion to investigate the convergence behavior of adaptive distributed-arithmetic echo cancellers. Under the assumption that the look-up values are statistically independent of the symbols stored in the echo canceller delay line, he obtains an analytical expression for the mean-square error as a function of time. The maximum speed of convergence and the corresponding optimum adaptation gain are also determined. Simulation results for a full-duplex quaternary partial response class-IV system are presented and compared with theoretical results.
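
The replacement of multiplications by table look-up and shift-and-add operations can be illustrated with a small sketch. It is a minimal, non-adaptive example for unsigned integer samples only; the function names `build_da_table` and `da_inner_product` are hypothetical and do not come from the paper.

```python
def build_da_table(coeffs):
    """Precompute every partial sum of the coefficients.

    Entry `addr` holds the sum of coeffs[k] for each bit k set in `addr`.
    """
    n = len(coeffs)
    return [
        sum(c for k, c in enumerate(coeffs) if (addr >> k) & 1)
        for addr in range(1 << n)
    ]


def da_inner_product(table, samples, bits):
    """Compute sum(coeffs[k] * samples[k]) without any multiplication.

    `samples` are unsigned integers that fit in `bits` bits (an assumption
    of this sketch; signed data would need an extra sign-bit step).
    """
    acc = 0
    for b in range(bits):
        # Gather bit b of every sample into one table address.
        addr = 0
        for k, x in enumerate(samples):
            addr |= ((x >> b) & 1) << k
        # Weight the looked-up partial sum by 2**b using a shift.
        acc += table[addr] << b
    return acc
```

The table has 2^n entries for n taps, which is why segmenting the delay line into shorter sections, as the abstract describes, trades memory for extra additions.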

3.

Another contender in the arctangent race

Lyons R. 《Signal Processing Magazine, IEEE》2004,21(1):109-110

Fast and accurate methods for computing the arctangent of a complex number x = I + jQ have been the subject of extensive study because estimating the angle θ of a complex value has so many applications in the field of signal processing. Practitioners interested in high-speed arctangent computations typically use look-up tables where the values of I and Q specify an address in read-only memory (ROM) containing an approximation of angle θ. In this paper, another method to compute the arctangent approximation is proposed that uses neither a look-up table nor high-order polynomials.
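
The conventional look-up table approach that the abstract contrasts itself with, where I and Q together form a ROM address, might look like the sketch below. The 4-bit quantisation, the offset address encoding, and the names `build_atan_rom` and `lookup_angle` are assumptions made for illustration; the article's own table-free method is not reproduced here.

```python
import math

BITS = 4  # hypothetical word length for each of I and Q


def build_atan_rom(bits=BITS):
    """Fill a ROM with one angle per (I, Q) pair, I and Q in [-2**(bits-1), 2**(bits-1) - 1]."""
    size = 1 << bits
    half = size // 2
    rom = [0.0] * (size * size)
    for i_code in range(size):
        for q_code in range(size):
            # Offset encoding: code = value + half.
            angle = math.atan2(q_code - half, i_code - half)
            rom[(i_code << bits) | q_code] = angle
    return rom


def lookup_angle(rom, i_val, q_val, bits=BITS):
    """Return the stored angle for integer I and Q; no arithmetic beyond addressing."""
    half = (1 << bits) // 2
    addr = ((i_val + half) << bits) | (q_val + half)
    return rom[addr]
```

The ROM grows as 2^(2·bits) entries, which is one reason table-free approximations are attractive at higher precision.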

4.

A new non-iterative look-up table predistortion method  Cited by: 1 (self-citations: 0, citations by others: 1)

詹鹏 秦开宇 蔡顺燕 《电波科学学报》2011,(1)

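A new non-iterative look-up table predistortion method is proposed. It uses an improved look-up table predistorter that obtains the predistorted in-phase and quadrature (IQ) outputs directly from the input IQ signals and the table values through multiply-add operations. This avoids the extra computation of coordinate conversion and gives the predistorter a simpler structure and a lower computational load. Combined with a single-path feedback technique, the method also reduces the hardware cost of the predistortion system. Simulation results show that the method is correct, suppresses in-band and out-of-band distortion well, and achieves good predistortion linearization.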

5.

A new algorithm for N-dimensional Hilbert scanning  Cited by: 3 (self-citations: 0, citations by others: 3)

Kamata S.-I. Eason R.O. Bandou Y. 《IEEE transactions on image processing》1999,8(7):964-973

There have been many applications of the Hilbert curve, such as image processing, image compression, computer holograms, etc. The Hilbert curve is a one-to-one mapping between N-dimensional space and one-dimensional (1-D) space which preserves point neighborhoods as much as possible. There are several algorithms for N-dimensional Hilbert scanning, such as the Butz algorithm and the Quinqueton algorithm. The Butz algorithm is a mapping function using several bit operations such as shifting, exclusive OR, etc. On the other hand, the Quinqueton algorithm computes all addresses of this curve using recursive functions, but takes time to compute a one-to-one mapping correspondence. Both algorithms are complex to compute and both are difficult to implement in hardware. In this paper, we propose a new, simple, nonrecursive algorithm for N-dimensional Hilbert scanning using look-up tables. The merit of our algorithm is that the computation is fast and the implementation is much easier than previous ones.
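
To make the neighborhood-preserving mapping concrete, here is a sketch of the widely used non-recursive bit-manipulation conversion from a 1-D index to 2-D Hilbert coordinates. It is limited to two dimensions and is not the look-up table algorithm proposed in the paper; the function name `hilbert_d2xy` is hypothetical.

```python
def hilbert_d2xy(order, d):
    """Map index d (0 <= d < 4**order) to (x, y) on a 2**order by 2**order Hilbert curve."""
    n = 1 << order
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            if rx == 1:
                # Reflect the sub-square before swapping axes.
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x  # rotate by swapping the axes
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y
```

Consecutive indices always land on adjacent grid cells, which is the property that makes the scan useful for image processing.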

6.

Implementation of RNS-Based Distributed Arithmetic Discrete Wavelet Transform Architectures Using Field-Programmable Logic

Javier Ramírez Antonio García Uwe Meyer-Bäse Fred Taylor Antonio Lloris 《The Journal of VLSI Signal Processing》2003,33(1-2):171-190

Currently there are design barriers inhibiting the implementation of high-precision digital signal processing (DSP) objects with field programmable logic (FPL) devices. This paper explores overcoming these barriers by fusing together the popular distributed arithmetic (DA) method with the residue number system (RNS) for use in FPL-centric designs. The new design paradigm is studied in the context of a high-performance filter bank and a discrete wavelet transform (DWT). The proposed design paradigm is facilitated by a new RNS accumulator structure based on a carry save adder (CSA). The reported methodology also introduces a polyphase filter structure that results in a reduced look-up table (LUT) budget. The 2C-DA and RNS-DA are compared, in the context of an FPL implementation strategy, using a DWT filter bank as a common design theme. The results show that the RNS-DA, compared to a traditional 2C-DA design, enjoys a performance advantage that increases with precision (wordlength).

7.

Improved memoryless RNS forward converter based on the periodicity of residues

Premkumar A.B. Ang E.L. Lai E.M.-K. 《Circuits and Systems II: Express Briefs, IEEE Transactions on》2006,53(2):133-137

The residue number system (RNS) is suitable for DSP architectures because of its ability to perform fast carry-free arithmetic. However, this advantage is overshadowed by the complexity involved in the conversion of numbers between binary and RNS representations. Although the reverse conversion (RNS to binary) is more complex, the forward transformation is not simple either. Most forward converters make use of look-up tables (memory). Recently, a memoryless forward converter architecture for arbitrary moduli sets was proposed by Premkumar in 2002. In this paper, we present an extension to that architecture which results in 44% less hardware for parallel conversion and achieves 43% improvement in speed for serial conversions. It makes use of the periodicity properties of residues obtained using modular exponentiation.
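
The periodicity property mentioned at the end of the abstract can be sketched as follows. For an odd modulus m, the residues of 2^k modulo m repeat with some period P, so a binary number can be reduced by adding its P-bit chunks. This is a generic illustration under that assumption, not the converter architecture of the paper; `period_of_two` and `residue_by_periodicity` are hypothetical names.

```python
def period_of_two(m):
    """Smallest P > 0 with 2**P % m == 1. Requires an odd modulus m > 1."""
    if m < 3 or m % 2 == 0:
        raise ValueError("m must be odd and greater than 1")
    p, r = 1, 2 % m
    while r != 1:
        r = (r * 2) % m
        p += 1
    return p


def residue_by_periodicity(x, m):
    """Reduce a non-negative integer x modulo odd m without a residue table.

    Because 2**P is congruent to 1 modulo m, every P-bit chunk of x carries
    the same weight, so the chunks can simply be added together.
    """
    p = period_of_two(m)
    mask = (1 << p) - 1
    while x > mask:
        total = 0
        while x:
            total += x & mask  # take the lowest P bits
            x >>= p
        x = total
    return x % m  # final correction on a value of at most P bits
```

For example, with m = 7 the period is 3, so the input is folded three bits at a time.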

8.

Design of a high-order linear FIR filter

喻秀明 冯全源 《微电子学》2021,51(5):685-689

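To address the excessive look-up table resources consumed by high-order linear FIR filters, a distributed structure using a symmetric look-up table is proposed. Exploiting the coefficient symmetry of linear FIR filters, a symmetric look-up table of smaller depth is designed. Time-division multiplexing and pipelining effectively save look-up table resources and raise the operating frequency of the FIR filter. A 1023-order FIR filter based on the symmetric look-up table was implemented on a Xilinx XC5VLX110T FPGA. The results show that, compared with a segmented look-up table structure, the symmetric look-up table FIR filter saves 48% of Block ROM resources and raises the maximum clock frequency by 15%.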
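
The coefficient symmetry that this abstract exploits can be shown with a small sketch: samples that share a coefficient are added first, so only half of the coefficients take part in the weighted sum. The abstract does not describe the table organisation itself, so this only illustrates the folding step; `fir_direct` and `fir_folded` are hypothetical names.

```python
def fir_direct(h, window):
    """Reference output for one sample; window[k] holds x[n - k]."""
    return sum(c * s for c, s in zip(h, window))


def fir_folded(h, window):
    """Same output using the symmetry h[k] == h[N - 1 - k]."""
    n = len(h)
    if len(window) != n:
        raise ValueError("window must have one sample per coefficient")
    if any(h[k] != h[n - 1 - k] for k in range(n // 2)):
        raise ValueError("coefficients are not symmetric")
    # Pre-add mirrored samples, then weight with the first half of h.
    acc = sum(h[k] * (window[k] + window[n - 1 - k]) for k in range(n // 2))
    if n % 2:
        acc += h[n // 2] * window[n // 2]  # unpaired middle tap
    return acc
```

Halving the number of distinct coefficients per output is what allows a table indexed by those coefficients to be shallower.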

9.

Low-power design of gamma correction in a CIS system-on-chip  Cited by: 1 (self-citations: 1, citations by others: 0)

钟健 《光电子．激光》2010,(8):1151-1155

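To achieve a low-power design of gamma (γ) correction in a CMOS image sensor (CIS) system-on-chip (SoC) while preserving correction accuracy, a γ correction technique that combines a look-up table with straight-line fitting is proposed. The algorithm corrects pixels with low gray values by direct table look-up and uses piecewise straight-line fitting for pixels on the slowly rising part of the γ curve. The line segments are chosen by combining outer-level and inner-level segmentation, which optimizes the segmentation. The algorithm preserves image correction accuracy: compared with the full look-up table method, the error is within 0.5 pixel. An 8-bit input/8-bit output VLSI module was designed on this basis and verified on an FPGA. The module occupies 723 LEs and 195 LC registers, uses fewer hardware resources than the full look-up table method, and achieves a low-power design. The maximum system operating frequency reaches 148 MHz, which fully meets real-time processing requirements.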

10.

Fast base extension and precise scaling in RNS for look-up table implementations

《Signal Processing, IEEE Transactions on》1995,43(10):2427-2430

Both base extension and scaling are fundamental operations in residue computing and several techniques have been proposed previously for their efficient implementation. Using look-up tables, the best result (log₂ n table look-up cycles, where n is the number of residue moduli in the system) has been obtained by using the Chinese remainder theorem (CRT) at the expense of a redundant representation of the numbers and of an approximated scaling. The CRT approach is reconsidered and it is shown that the same average time performance (log₂ n look-up cycles) can be achieved without any redundancy and with a precise result for scaling.
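
For readers unfamiliar with base extension, the sketch below shows what the operation computes: the residue of a number with respect to a new modulus, given only its residues in the existing moduli. It uses plain CRT reconstruction on Python integers and does not attempt the log₂ n look-up scheme of the paper; `crt_reconstruct` and `base_extend` are hypothetical names, and the moduli are assumed pairwise coprime.

```python
from math import prod


def crt_reconstruct(residues, moduli):
    """Recover x in [0, M) from its residues, where M is the product of the moduli."""
    big_m = prod(moduli)
    x = 0
    for r, m in zip(residues, moduli):
        m_i = big_m // m
        # pow(m_i, -1, m) is the modular inverse of m_i modulo m (Python 3.8+).
        x += r * m_i * pow(m_i, -1, m)
    return x % big_m


def base_extend(residues, moduli, new_modulus):
    """Residue of the represented number with respect to an extra modulus."""
    return crt_reconstruct(residues, moduli) % new_modulus
```

With moduli (3, 5, 7) the residues (1, 2, 3) represent 52, so extending to modulus 11 gives 52 mod 11 = 8.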

11.

Rijndael FPGA Implementations Utilising Look-Up Tables  Cited by: 1 (self-citations: 0, citations by others: 1)

Máire McLoone John V. McCanny 《The Journal of VLSI Signal Processing》2003,34(3):261-275

This paper presents single-chip FPGA implementations of the Advanced Encryption Standard (AES) algorithm, Rijndael. In particular, the designs utilise look-up tables to implement the entire Rijndael Round function. A comparison is provided between these designs and similar existing implementations. Hardware implementations of encryption algorithms prove much faster than equivalent software implementations and, since there is a need to perform encryption on data in real time, speed is very important. In particular, Field Programmable Gate Arrays (FPGAs) are well suited to encryption implementations due to their flexibility and an architecture which can be exploited to accommodate typical encryption transformations. In this paper, a Look-Up Table (LUT) methodology is introduced where complex and slow operations are replaced by simple LUTs. A LUT-based fully pipelined Rijndael implementation is described which has a pre-placement performance of 12 Gbits/sec, which is 1.2 times faster than an alternative design in which look-up tables are utilised to implement only one of the Round function transformations, and 6 times faster than other previous single-chip implementations. Iterative Rijndael implementations based on the Look-Up Table design approach are also discussed and prove faster than typical iterative implementations.

12.

Scalable, Memory Efficient, High-Speed IP Lookup Algorithms

《Networking, IEEE/ACM Transactions on》2005,13(4):802-812

One of the central issues in router performance is IP address lookup based on longest prefix matching. IP address lookup algorithms can be evaluated on a number of metrics: lookup time, update time, memory usage, and, to a lesser extent, the time to construct the data structure used to support lookups and updates. Many of the existing methods are geared toward optimizing a specific metric, and do not scale well with the ever expanding routing tables and the forthcoming IPv6, where the IP addresses are 128 bits long. In contrast, our effort is directed at simultaneously optimizing multiple metrics and providing solutions that scale to IPv6, with its longer addresses and much larger routing tables. In this paper, we present two IP address lookup schemes: the Elevator-Stairs algorithm and the logW-Elevators algorithm. For a routing table with $N$ prefixes, the Elevator-Stairs algorithm uses optimal $\mathcal{O}(N)$ memory, and achieves better lookup and update times than other methods with similar memory requirements. The logW-Elevators algorithm gives $\mathcal{O}(\log W)$ lookup time, where $W$ is the length of an IP address, while improving upon update time and memory usage. Experimental results using the MAE-West router with 29 487 prefixes show that the Elevator-Stairs algorithm gives an average throughput of 15.7 million lookups per second (Mlps) using 459 KB of memory, and the logW-Elevators algorithm gives an average throughput of 21.41 Mlps with a memory usage of 1259 KB.

13.

Design and Implementation of High-Performance RNS Wavelet Processors Using Custom IC Technologies

Javier Ramírez Uwe Meyer-Bäse Fred Taylor Antonio García Antonio Lloris 《The Journal of VLSI Signal Processing》2003,34(3):227-237

The design of high performance, high precision, real-time digital signal processing (DSP) systems, such as those associated with wavelet signal processing, is a challenging problem. This paper reports on the innovative use of the residue number system (RNS) for implementing high-end wavelet filter banks. The disclosed system uses an enhanced index-transformation defined over Galois fields to efficiently support different wavelet filter instantiations without adding any extra cost or additional look-up tables (LUTs). The selection of a small-wordwidth modulus set is the key to attaining low complexity and high throughput. An exhaustive comparison against existing two's complement (2C) designs for different custom IC technologies was carried out. Results reveal a performance improvement of up to 100% for high-precision RNS-based systems. These structures proved to be well suited for field programmable logic (FPL) assimilation as well as for CBIC (cell-based integrated circuit) technologies.

14.

An optimized table look-up algorithm for CAVLC decoding based on hash-table queries

伍学民 《电视技术》2014,38(11)

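To address the long table look-up time of TLSS, the standard CAVLC decoding method in H.264/AVC, a new CAVLC decoding look-up optimization method based on fast hash-table queries is proposed. Hash-table look-up is introduced into CAVLC decoding, which speeds up the table look-up and shortens the time needed to obtain codewords from the irregular variable-length code tables (UVLCT), thereby reducing the CAVLC decoding look-up time. Simulation results show that, with no loss in video decoding quality, the new algorithm improves table look-up time by about 18% to 22% compared with the standard TLSS method.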
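
The general idea of replacing a sequential table scan with a hash-table query can be sketched with a toy prefix code. The code table below is invented for illustration and is not an H.264/AVC CAVLC table; `TOY_CODE` and `decode_bits` are hypothetical names, and the paper's actual hash design is not described in the abstract.

```python
# Hypothetical prefix-free code used only for this illustration.
TOY_CODE = {"1": "a", "01": "b", "001": "c", "000": "d"}


def decode_bits(bitstring, table=TOY_CODE):
    """Decode a string of '0'/'1' characters with one hash query per candidate length."""
    max_len = max(len(code) for code in table)
    symbols = []
    pos = 0
    while pos < len(bitstring):
        for length in range(1, max_len + 1):
            # A dict lookup replaces a linear scan over all table rows.
            symbol = table.get(bitstring[pos:pos + length])
            if symbol is not None:
                symbols.append(symbol)
                pos += length
                break
        else:
            raise ValueError(f"no codeword matches at bit {pos}")
    return symbols
```

Each query costs roughly constant time regardless of how many codewords the table holds, whereas a sequential search grows with table size.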

15.

Longest prefix matching using bloom filters

Dharmapurikar S. Krishnamurthy P. Taylor D.E. 《Networking, IEEE/ACM Transactions on》2006,14(2):397-409

We introduce the first algorithm that we are aware of to employ Bloom filters for longest prefix matching (LPM). The algorithm performs parallel queries on Bloom filters, an efficient data structure for membership queries, in order to determine address prefix membership in sets of prefixes sorted by prefix length. We show that use of this algorithm for Internet Protocol (IP) routing lookups results in a search engine providing better performance and scalability than TCAM-based approaches. The key feature of our technique is that the performance, as determined by the number of dependent memory accesses per lookup, can be held constant for longer address lengths or additional unique address prefix lengths in the forwarding table given that memory resources scale linearly with the number of prefixes in the forwarding table. Our approach is equally attractive for Internet Protocol Version 6 (IPv6) which uses 128-bit destination addresses, four times longer than IPv4. We present a basic version of our approach along with optimizations leveraging previous advances in LPM algorithms. We also report results of performance simulations of our system using snapshots of IPv4 BGP tables and extend the results to IPv6. Using less than 2 Mb of embedded RAM and a commodity SRAM device, our technique achieves average performance of one hash probe per lookup and a worst case of two hash probes and one array access per lookup.
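
A software sketch of the idea, one Bloom filter per prefix length backed by an exact table, is given below. It queries the filters one after another from longest to shortest, whereas the paper queries them in parallel in hardware. The filter size, the SHA-256 based hashing, and the class names `BloomFilter` and `BloomLPM` are assumptions of this sketch; prefixes and addresses are written as bit strings for readability.

```python
import hashlib


class BloomFilter:
    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, key):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def __contains__(self, key):
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


class BloomLPM:
    def __init__(self):
        self.filters = {}  # prefix length -> BloomFilter
        self.tables = {}   # prefix length -> {prefix bits: next hop}

    def insert(self, prefix_bits, next_hop):
        length = len(prefix_bits)
        self.filters.setdefault(length, BloomFilter()).add(prefix_bits)
        self.tables.setdefault(length, {})[prefix_bits] = next_hop

    def lookup(self, addr_bits):
        for length in sorted(self.filters, reverse=True):
            if length > len(addr_bits):
                continue
            candidate = addr_bits[:length]
            # The cheap filter test screens out most lengths; the exact
            # table probe then rules out Bloom filter false positives.
            if candidate in self.filters[length]:
                hop = self.tables[length].get(candidate)
                if hop is not None:
                    return hop
        return None
```

Because a Bloom filter never reports a false negative, the longest matching prefix is always found; false positives only cost an extra table probe.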

16.

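High-speed hardware implementation of the SubBytes transformation in the AES algorithm  Cited by: 2 (self-citations: 1, citations by others: 1)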

高磊 戴冠中 《微电子学与计算机》2006,23(7):47-49

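The SubBytes transformation is the only nonlinear transformation in the AES algorithm and a key part of any hardware implementation. Based on a study of the mapping between the finite field GF(2^8) and its composite field GF((2^4)^2), the paper implements SubBytes with combinational logic instead of RAM table look-up, with a three-stage internal pipeline. Synthesis and simulation were carried out on an Altera EP20KE-series FPGA. An AES-128 module built on this high-speed SubBytes implementation reaches a theoretical maximum encryption throughput of 12 Gbps in ECB mode.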

17.

A fast gamma correction algorithm and its C implementation  Cited by: 1 (self-citations: 0, citations by others: 1)

曾嘉亮 《信息技术》2006,30(4):82-85

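Gamma correction is an indispensable operation before a digital image is displayed. Applying the formula directly is very inefficient. Many publications mention that a look-up table can speed up the operation, but they are vague and do not describe how to implement it. Based on an in-depth study of the principle of gamma correction, a gamma correction look-up table is constructed and a method for fast gamma correction of digital images using the table is presented. The algorithm is particularly suitable for processing video streams in embedded systems.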
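
A look-up table of the kind this abstract describes can be sketched in a few lines. The paper's implementation is in C and is not reproduced here; this Python sketch assumes 8-bit pixels and the common convention output = input^(1/γ) on values normalised to [0, 1]. The names `build_gamma_lut` and `apply_gamma` are hypothetical.

```python
def build_gamma_lut(gamma, levels=256):
    """Evaluate the power law once per gray level instead of once per pixel."""
    top = levels - 1
    return [round(top * (v / top) ** (1.0 / gamma)) for v in range(levels)]


def apply_gamma(pixels, lut):
    """Correct a sequence of integer pixel values with one table read each."""
    return [lut[p] for p in pixels]
```

The table is built once with 256 power evaluations and then reused for every pixel of every frame, which is what makes the approach attractive for video streams.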

18.

A current modulation scheme for direct torque control of switched reluctance motor using fuzzy logic

《Mechatronics》2000,10(3):353-370

The paper addresses a fundamental control issue in switched reluctance motor (SRM) drives: torque ripple. Normally, torque ripple minimization is achieved by using look-up tables, i.e., the look-up tables use stored magnetic characteristics to provide the reference current, on-angle, and off-angle for a given torque. A number of techniques for the generation of reference current profiles that minimize the torque ripple have also been suggested in the past. But due to the highly nonlinear characteristics of the SRM, none of these schemes is fully successful. Moreover, their performance depends greatly on the accuracy of the magnetic characteristics measurements of the motor on which most of these algorithms work. Our work is primarily motivated by the aim of modulating the reference phase current pattern with the aid of fuzzy logic, which is well suited to compensate for the nonlinearities of the system, so that the torque ripple is further suppressed. Performance of the proposed strategy is verified by computer simulation.

19.

A study of dynamic look-up table design schemes

邹云伟 李冰 《电子与封装》2007,7(12):15-18,45

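Look-up tables (LUTs) are used ever more widely in many fields, such as image color processing, CDMA coding and electronic dictionaries. Their most widespread use, however, is in reconfiguration, switching and multiplexing techniques in electronic networks, for example pulse-code modulation (PCM) switches, sensor signal processing, and asynchronous transfer mode (ATM) networking. In essence, a LUT performs the following process: a number of input data items are mapped through the table into a number of output data items. This paper discusses several LUT design approaches in terms of resource consumption and query speed, and compares them with one another.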

20.

Guest Editorial: Special Issue on Signal Processing Systems: Part I

Catthoor Francky Moonen Marc 《Journal of Signal Processing Systems》2003,35(3):227-228

