1.

VLSI architecture for fast 2D discrete orthonormal wavelet transform

Henry Y. H. Chuang Ling Chen 《The Journal of VLSI Signal Processing》1995,10(3):225-236

The discrete wavelet transform (DWT) provides a new method for signal/image analysis where high frequency components are studied with finer time resolution and low frequency components with coarser time resolution. It decomposes a signal or an image into localized contributions for multiscale analysis. In this paper, we present a parallel pipelined VLSI array architecture for 2D dyadic separable DWT. The 2D data array is partitioned into non-overlapping groups of rows. All rows in a partition are processed in parallel, and consecutive partitions are pipelined. Moreover, multiple wavelet levels are computed in the same pipeline, and multiple DWT problems can be pipelined also. The whole computation requires a single scan of the image data array. Thus, it is suitable for on-line real-time applications. For an N×N image, an m-level DWT can be computed in time units on a processor costing no more than , where q is the partition size, p is the length of the corresponding 1D DWT filters, C_m and C_a are the costs of a parallel multiplier and a parallel adder respectively, and a time unit is the time for a multiplication and an addition. For q=N m, the computing time reduces to . When a large number of DWT problems are pipelined, the computing time is about per problem.
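
As a rough software illustration of the separable 2D decomposition this abstract describes (a row pass followed by a column pass), the sketch below computes one level using the orthonormal Haar filters. The Haar choice, the function names, and the sequential list-based layout are assumptions made for illustration only; the sketch does not model the paper's partitioned, pipelined array or its length-p filters.

```python
import math


def haar_1d(v):
    """One level of the orthonormal Haar DWT on an even-length sequence.

    Returns the low-pass half followed by the high-pass half.
    """
    r = 1.0 / math.sqrt(2.0)
    low = [(v[i] + v[i + 1]) * r for i in range(0, len(v), 2)]
    high = [(v[i] - v[i + 1]) * r for i in range(0, len(v), 2)]
    return low + high


def dwt2_level(img):
    """One level of a separable 2D DWT: filter every row, then every column.

    img is a list of equal-length rows with even width and height.
    The top-left quadrant of the result holds the low-low band.
    """
    rows = [haar_1d(row) for row in img]                 # row pass
    cols = [haar_1d(list(col)) for col in zip(*rows)]    # column pass
    return [list(r) for r in zip(*cols)]                 # transpose back


def energy(img):
    """Sum of squared samples; an orthonormal transform preserves it."""
    return sum(x * x for row in img for x in row)
```

A further level would apply `dwt2_level` again to the top-left quadrant only, which is the dyadic decomposition the abstract refers to.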

2.

Computationally efficient systolic architecture for computing the discrete Fourier transform

Nash J.G. 《Signal Processing, IEEE Transactions on》2005,53(12):4640-4651

A new high-performance systolic architecture for calculating the discrete Fourier transform (DFT) is described which is based on two levels of transform factorization. One level uses an index remapping that converts the direct transform into structured sets of arithmetically simple four-point transforms. Another level adds a row/column decomposition of the DFT. The architecture supports transform lengths that are not powers of two or based on products of coprime numbers. Compared to previous systolic implementations, the architecture is computationally more efficient and uses less hardware. It provides low latency as well as high throughput, and can do both one- and two-dimensional DFTs. An automated computer-aided design tool was used to find latency and throughput optimal designs that matched the target field programmable gate array structure and functionality.

3.

An efficient 3-dimensional discrete wavelet transform architecture for video processing application

Ganapathi Hegde Pukhraj Vaya 《电子科学学刊(英文版)》2012,29(6):534-540

This paper presents an optimized 3-D Discrete Wavelet Transform (3-DDWT) architecture. The 1-DDWT employed in the design of the 3-DDWT architecture uses a reduced lifting scheme approach. Further, the architecture is optimized by applying a block enabling technique, scaling, and rounding of the filter coefficients. The proposed architecture uses the biorthogonal (9/7) wavelet filter. The architecture is modeled using Verilog HDL, simulated using ModelSim, synthesized using Xilinx ISE and finally implemented on a Virtex-5 FPGA. The proposed 3-DDWT architecture has a slice register utilization of 5%, an operating frequency of 396 MHz and a power consumption of 0.45 W.

4.

Flipping structure: an efficient VLSI architecture for lifting-based discrete wavelet transform (Total citations: 6; self-citations: 0; citations by others: 6)

Chao-Tsung Huang Po-Chih Tseng Liang-Gee Chen 《Signal Processing, IEEE Transactions on》2004,52(4):1080-1089

In this paper, an efficient very large scale integration (VLSI) architecture, called flipping structure, is proposed for the lifting-based discrete wavelet transform. It can provide a variety of hardware implementations to improve and possibly minimize the critical path as well as the memory requirement of the lifting-based discrete wavelet transform by flipping conventional lifting structures. The precision issues are also analyzed. By case studies of the JPEG2000 default lossy (9,7) filter, an integer (9,7) filter, and the (6,10) filter, the efficiency of the proposed flipping structure is demonstrated.

5.

FPGA implementation for 2D discrete wavelet transform

King-Chu Hung Yu-Jung Huang Trieu-Kien Truong Chia-Ming Wang 《Electronics letters》1998,34(7):639-640

An operator correlation-based algorithm and its VLSI architecture for computing the 2D discrete wavelet transform are presented. The proposed discrete wavelet transform architecture was simulated in Verilog and synthesised with the FPGA compiler. The implementation of the 2D discrete wavelet transform in an FPGA-based design style is described.

6.

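A line-based, real-time VLSI architecture for the two-dimensional lifting integer wavelet transform

王柯俨 刘凯 郭杰 李云松 吴成柯 《电路与系统学报》2010,15(2)

A line-based, real-time VLSI architecture for the two-dimensional lifting integer wavelet transform is proposed. The architecture consists of a row transformer, a column transformer, an intermediate buffer, and an output control unit. The intermediate buffer temporarily stores the intermediate results of the row transform, and the output control unit outputs the wavelet coefficients of each level in descending order of priority. Because the hardware implementation adopts a line-based lifting transform structure, the horizontal and vertical transforms can be processed in parallel. Compared with existing architectures, the proposed one offers a high degree of parallelism and low storage requirements, and it can complete the multilevel wavelet transform of an entire image within the time taken to scan that image line by line.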

7.

Simplified biorthogonal discrete wavelet transform for VLSI architecture design

Hannu Olkkonen Juuso T. Olkkonen 《Signal, Image and Video Processing》2008,2(2):101-105

Biorthogonal discrete wavelet transform (BDWT) has gained general acceptance as an image processing tool. For example, the JPEG2000 standard is completely based on the BDWT. In BDWT, the scaling (low-pass) and wavelet (high-pass) filters are symmetric and linear phase. In this work we show that by using a specific sign modulator the BDWT filter bank can be realized by only two biorthogonal filters. The analysis and synthesis parts use the same scaling and wavelet filters, which simplifies especially VLSI designs of the biorthogonal DWT/IDWT transceiver units. Utilizing the symmetry of the scaling and the wavelet filters we introduce a fast convolution algorithm for implementation of the filter modules. In multiplexer–demultiplexer VLSI applications both functions can be constructed via two running BDWT filters and the sign modulator. This work was supported by the National Technology Agency of Finland (TEKES).

8.

A systolic array architecture for the discrete sine transform

Chiper D.F. Swamy M.N.S. Ahmad M.O. Stouraitis T. 《Signal Processing, IEEE Transactions on》2002,50(9):2347-2354

An efficient approach to design very large scale integration (VLSI) architectures and a scheme for the implementation of the discrete sine transform (DST), based on an appropriate decomposition method that uses circular correlations, is presented. The proposed design uses an efficient restructuring of the computation of the DST into two circular correlations, having similar structures and only one half of the length of the original transform; these can be concurrently computed and mapped onto the same systolic array. Significant improvement in the computational speed can be obtained at a reduced input-output (I/O) cost and low hardware complexity, retaining all the other benefits of the VLSI implementations of the discrete transforms, which use circular correlation or cyclic convolution structures. These features are demonstrated by comparing the proposed design with some of the previously reported schemes.

9.

A parallel 3-D discrete wavelet transform architecture using pipelined lifting scheme approach for video coding

Ganapathi Hegde Pukhraj Vaya 《International Journal of Electronics》2013,100(10):1429-1440

This article presents a parallel architecture for 3-D discrete wavelet transform (3-DDWT). The proposed design is based on the 1-D pipelined lifting scheme. The architecture is fully scalable beyond the present coherent Daubechies filter bank (9, 7). This 3-DDWT architecture has advantages such as no group of pictures restriction and reduced memory referencing. It offers low power consumption, low latency and high throughput. The computing technique is based on the concept that lifting scheme minimises the storage requirement. The application specific integrated circuit implementation of the proposed architecture is done by synthesising it using 65 nm Taiwan Semiconductor Manufacturing Company standard cell library. It offers a speed of 486 MHz with a power consumption of 2.56 mW. This architecture is suitable for real-time video compression even with large frame dimensions.

10.

A nonseparable VLSI architecture for two-dimensional discrete periodized wavelet transform

King-Chu Hung Yao-Shan Hung Yu-Jung Huang 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2001,9(5):565-576

A modified two-dimensional (2-D) discrete periodized wavelet transform (DPWT) based on the homeomorphic high-pass filter and the 2-D operator correlation algorithm is developed in this paper. The advantages of this modified 2-D DPWT are that it can reduce the multiplication counts and the complexity of boundary data processing in comparison to other conventional 2-D DPWT for perfect reconstruction. In addition, a parallel-pipeline architecture of the nonseparable computation algorithm is also proposed to implement this modified 2-D DPWT. This architecture has properties of noninterleaving input data, short bus width request, and short latency. The analysis of the finite precision performance shows that nearly half of the bit length can be saved by using this nonseparable computation algorithm. The operation of the boundary data processing is also described in detail. In the three-stage decomposition of an N×N image, the latency is found to be N²+2N+18.

11.

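VLSI design of the two-dimensional 9/7 wavelet transform

朱斌杰 杜慧敏 杨晓强 韩俊刚 《国外电子元器件》2009,17(2):11-13,16

To increase the speed of JPEG2000 image compression, a VLSI design scheme with a Mesh structure is proposed for the lifting-based two-dimensional discrete 9/7 wavelet transform (DWT). A VLSI with this Mesh structure can process all pixels of an image in parallel. This parallel Mesh structure increases the speed of the wavelet transform circuit and, in turn, the speed of image compression.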

12.

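VLSI design of the two-dimensional 9/7 wavelet transform (Total citations: 1; self-citations: 0; citations by others: 1)

朱斌杰 杜慧敏 杨晓强 韩俊刚 《电子设计工程》2009,17(2)

To increase the speed of JPEG2000 image compression, a VLSI design scheme with a Mesh structure is proposed for the lifting-based two-dimensional discrete 9/7 wavelet transform (DWT). A VLSI with this Mesh structure can process all pixels of an image in parallel. This parallel Mesh structure increases the speed of the wavelet transform circuit and, in turn, the speed of image compression.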

13.

The recursive pyramid algorithm for the discrete wavelet transform (Total citations: 3; self-citations: 0; citations by others: 3)

Vishwanath M. 《Signal Processing, IEEE Transactions on》1994,42(3):673-676

The recursive pyramid algorithm (RPA) is a reformulation of the classical pyramid algorithm (PA) for computing the discrete wavelet transform (DWT). The RPA computes the N-point DWT in real time (running DWT) using just L(log N-1) words of storage, as compared with O(N) words required by the PA. L is the length of the wavelet filter. The RPA is combined with the short-length FIR filter algorithms to reduce the number of multiplications and additions.

14.

2D DWT VLSI architecture for wavelet image processing

Seung-Kwon Pack Lee-Sup Kim 《Electronics letters》1998,34(6):537-538

A cost-effective VLSI architecture with separate data-paths and their corresponding filter structure is proposed for performing a two-dimensional discrete wavelet transform (2D DWT). Compared with the conventional 2D DWT VLSI architectures, the proposed semi-recursive 2D DWT VLSI architecture has minimum hardware cost, and optimised data-bus utilisation, scheduling control overhead and storage size.

15.

Memory-efficient architecture for JPEG 2000 coprocessor with large tile image

Bing-Fei Wu Chung-Fu Lin 《Circuits and Systems II: Express Briefs, IEEE Transactions on》2006,53(4):304-308

The experimental results show that using a larger tile size to perform JPEG 2000 coding results in better image quality (i.e., greater than or equal to 256 × 256 tile image). However, processing large tile images also requires relatively high memory for the hardware implementation. For example, it would require tile memory of 256 K words to support the process of a 512 × 512 tile image in the straightforward architecture. To reduce hardware resources, we have proposed the quad code-block (QCB)-based discrete wavelet transform method to reduce the size of tile memory by a factor of 4. In this paper, the remaining 1/4 tile memory can be further reduced through two approaches: the zero-holding extension with slight image degradation and the QCB-block size extension without any image degradation. That is, it only requires 12 K words tile memory to support the process of 512 × 512 tile image by using zero-holding extension, and 13.58 K words memory through QCB-block size extension. The low memory requirement makes the on-chip memory practicable.
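
The memory figures quoted in this abstract can be checked with a few lines of arithmetic. The sketch below assumes one word per image sample and 1 K = 1024 words, which is consistent with the stated 256 K words for a 512 × 512 tile; the function and variable names are illustrative and do not come from the paper.

```python
K = 1024  # assumption: 1 K words = 1024 words


def tile_words(width, height):
    """Words needed to buffer a whole tile at one word per sample."""
    return width * height


full = tile_words(512, 512)   # straightforward architecture
quarter = full // 4           # after the factor-of-4 QCB reduction

print(full // K, "K words for the full tile")
print(quarter // K, "K words after the QCB reduction")
```

Under these assumptions the QCB method leaves 64 K words, which is the starting point that the abstract's two extensions bring down to 12 K and 13.58 K words.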

16.

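Parallel VLSI architecture design for the two-dimensional discrete 5/3 wavelet transform

杜会斌 周旭 张学庆 吴晓娟 《无线电通信技术》2006,32(6):39-41

An efficient parallel VLSI architecture design method for the lifting-based two-dimensional discrete 5/3 wavelet transform (DWT) is proposed. The method lets the row and column filters operate simultaneously and uses a pipelined design. At the same precision, it greatly reduces the amount of computation, increases the transform speed, and saves hardware resources. The method has passed Verilog HDL behavioral-level simulation and can be used as a standalone IP core in JPEG2000 image encoder and decoder chips. The architecture can be extended to the 9/7 wavelet lifting structure.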
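
For readers unfamiliar with the lifting-based 5/3 transform that this entry builds on, the following is a minimal one-dimensional software sketch. It assumes the reversible integer 5/3 lifting steps used in JPEG2000 (one predict step and one update step) with symmetric boundary extension and an even-length input; the function names are hypothetical, and the code does not represent the parallel row/column hardware the abstract proposes.

```python
def _mirror(i, n):
    """Whole-sample symmetric extension of an index into range(n)."""
    if i < 0:
        return -i
    if i >= n:
        return 2 * (n - 1) - i
    return i


def fwd53(x):
    """One level of integer 5/3 lifting; returns (low-pass, high-pass)."""
    n = len(x)
    assert n >= 2 and n % 2 == 0
    half = n // 2
    # Predict: each odd sample minus the floor-average of its even neighbours.
    d = [x[2 * k + 1] - ((x[2 * k] + x[_mirror(2 * k + 2, n)]) >> 1)
         for k in range(half)]
    # Update: each even sample plus a rounded quarter of adjacent details.
    # Symmetric extension makes the detail left of d[0] equal to d[0].
    s = [x[2 * k] + ((d[max(k - 1, 0)] + d[k] + 2) >> 2)
         for k in range(half)]
    return s, d


def inv53(s, d):
    """Inverse of fwd53: undo the update step, then the predict step."""
    half = len(s)
    n = 2 * half
    x = [0] * n
    for k in range(half):
        x[2 * k] = s[k] - ((d[max(k - 1, 0)] + d[k] + 2) >> 2)
    for k in range(half):
        x[2 * k + 1] = d[k] + ((x[2 * k] + x[_mirror(2 * k + 2, n)]) >> 1)
    return x
```

Because each lifting step is undone by subtracting exactly what was added, the integer transform reconstructs its input without loss. A 2D version would apply `fwd53` along rows and then along columns.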

17.

A novel discrete wavelet transform framework for full reference image quality assessment

Soroosh Rezazadeh Stéphane Coulombe 《Signal, Image and Video Processing》2013,7(3):559-573

In this paper, we present a general framework for computing full reference image quality scores in the discrete wavelet domain using the Haar wavelet. In our framework, quality metrics are categorized as either map-based, which generate a quality (distortion) map to be pooled for the final score, e.g., structural similarity (SSIM), or nonmap-based, which only give a final score, e.g., Peak signal-to-noise ratio (PSNR). For map-based metrics, the proposed framework defines a contrast map in the wavelet domain for pooling the quality maps. We also derive a formula to enable the framework to automatically calculate the appropriate level of wavelet decomposition for error-based metrics at a desired viewing distance. To consider the effect of very fine image details in quality assessment, the proposed method defines a multi-level edge map for each image, which comprises only the most informative image subbands. To clarify the application of the framework in computing quality scores, we give some examples to show how the framework can be applied to improve well-known metrics such as SSIM, visual information fidelity (VIF), PSNR, and absolute difference. The proposed framework presents an excellent tradeoff between accuracy and complexity. We compare the complexity of various algorithms obtained by the framework to the IPP-based H.264 baseline profile encoding using C/C++ implementations. For example, by using the framework, we can compute the VIF at about 5% of the complexity of its original version, but with higher accuracy.

18.

High-speed and memory-efficient VLSI design of 2D DWT for JPEG2000

《Electronics letters》2006,42(16):907-908

19.

An efficient computational scheme for the two-dimensional overcomplete wavelet transform

Ngai-Fong Law Wan-Chi Siu 《Signal Processing, IEEE Transactions on》2002,50(11):2806-2819

We have studied the computational complexity associated with the overcomplete wavelet transform for the commonly used spline wavelet family. By deriving general expressions for the computational complexity using the conventional filtering implementation, we show that the inverse transform is significantly more costly in computation than the forward transform. To reduce this computational complexity, we propose a new spatial implementation based on the exploitation of the correlation between the lowpass and the bandpass outputs that is inherent in the overcomplete representation. Both theoretical studies and experimental findings show that the proposed spatial implementation can greatly simplify the computations associated with the inverse transform. In particular, the complexity of the inverse transform using the proposed implementation can be reduced to slightly less than that of the forward transform using the conventional filtering implementation. We also demonstrate that the proposed scheme allows the use of an arbitrary boundary extension method while maintaining the ease of the inverse transform.

20.

An efficient VLSI architecture for 2-D wavelet image coding with novel image scan

Lafruit G. Catthoor F. Cornelis J.P.H. De Man H.J. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》1999,7(1):56-68

A folded very large scale integration (VLSI) architecture is presented for the implementation of the two-dimensional discrete wavelet transform, without constraints on the choice of the wavelet-filter bank. The proposed architecture is dedicated to flexible block-oriented image processing, such as adaptive vector quantization used in wavelet image coding. We show that reading the image along a two-dimensional (2-D) pseudo-fractal scan creates a very modular and regular data flow and, therefore, considerably reduces the folding complexity and memory requirements for VLSI implementation. This leads to significant area savings for on-chip storage (up to a factor of two) and reduces the power consumption. Furthermore, data scheduling and memory management remain very simple. The end result is an efficient VLSI implementation with a reduced area cost compared to the conventional approaches, reading the input data line by line.