Found 20 similar articles (search time: 0 ms)
1.
The discrete wavelet transform (DWT) provides a new method for signal/image analysis where high frequency components are studied with finer time resolution and low frequency components with coarser time resolution. It decomposes a signal or an image into localized contributions for multiscale analysis. In this paper, we present a parallel pipelined VLSI array architecture for 2D dyadic separable DWT. The 2D data array is partitioned into non-overlapping groups of rows. All rows in a partition are processed in parallel, and consecutive partitions are pipelined. Moreover, multiple wavelet levels are computed in the same pipeline, and multiple DWT problems can be pipelined also. The whole computation requires a single scan of the image data array. Thus, it is suitable for on-line real-time applications. For an N×N image, an m-level DWT can be computed in [expression missing in source] time units on a processor costing no more than [expression missing in source], where q is the partition size, p is the length of the corresponding 1D DWT filters, C_m and C_a are the costs of a parallel multiplier and a parallel adder respectively, and a time unit is the time for a multiplication and an addition. For q = N m, the computing time reduces to [expression missing in source]. When a large number of DWT problems are pipelined, the computing time is about [expression missing in source] per problem.
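As a rough illustration of the separable 2D DWT that this abstract builds on, the sketch below runs one decomposition level by filtering every row and then every column. It uses the two-tap Haar filter purely for brevity; the function names `haar_1d` and `dwt2_separable` are hypothetical, and the row partitioning and pipelining described in the abstract are not modelled here.

```python
import math

def haar_1d(x):
    """One level of the orthonormal Haar DWT of an even-length list.

    Returns (approximation, detail) coefficient lists.
    """
    s = 1.0 / math.sqrt(2.0)
    lo = [(x[i] + x[i + 1]) * s for i in range(0, len(x), 2)]
    hi = [(x[i] - x[i + 1]) * s for i in range(0, len(x), 2)]
    return lo, hi

def dwt2_separable(image):
    """One level of a separable 2D DWT: rows first, then columns.

    The output keeps the input size; the top-left quadrant holds the
    low-low (LL) band and the other quadrants hold the detail bands.
    """
    # Row pass: each row becomes [low half | high half].
    rows = []
    for r in image:
        lo, hi = haar_1d(list(r))
        rows.append(lo + hi)
    # Column pass: apply the same 1D transform to every column.
    cols = []
    for c in zip(*rows):
        lo, hi = haar_1d(list(c))
        cols.append(lo + hi)
    # Transpose back to row-major order.
    return [list(r) for r in zip(*cols)]
```

A multi-level (dyadic) transform would apply `dwt2_separable` again to the LL quadrant only; because the filter is orthonormal, the total energy of the coefficients equals that of the input.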
2.
A new high-performance systolic architecture for calculating the discrete Fourier transform (DFT) is described which is based on two levels of transform factorization. One level uses an index remapping that converts the direct transform into structured sets of arithmetically simple four-point transforms. Another level adds a row/column decomposition of the DFT. The architecture supports transform lengths that are not powers of two or based on products of coprime numbers. Compared to previous systolic implementations, the architecture is computationally more efficient and uses less hardware. It provides low latency as well as high throughput, and can do both one- and two-dimensional DFTs. An automated computer-aided design tool was used to find latency and throughput optimal designs that matched the target field programmable gate array structure and functionality.
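To make the idea of a row/column decomposition concrete, here is a minimal sketch that computes a length-N DFT, with N = n1 × n2, from short DFTs over the rows and columns of a 2D arrangement of the input. It uses the textbook common-factor (Cooley-Tukey) index mapping with twiddle factors, which is an assumption for illustration and not necessarily the index remapping used in the paper; `dft` and `dft_row_column` are hypothetical names.

```python
import cmath

def dft(x):
    """Direct O(n^2) DFT, used both as a building block and as a reference."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]

def dft_row_column(x, n1, n2):
    """Length n1*n2 DFT via short DFTs (input index n = n2*i1 + i2,
    output index k = k1 + n1*k2)."""
    n = n1 * n2
    if len(x) != n:
        raise ValueError("len(x) must equal n1 * n2")
    # Step 1: length-n1 DFTs over i1 for each fixed i2 -> inner[i2][k1].
    inner = [dft([x[n2 * i1 + i2] for i1 in range(n1)]) for i2 in range(n2)]
    # Step 2: multiply by the twiddle factors W_N^(i2 * k1).
    for i2 in range(n2):
        for k1 in range(n1):
            inner[i2][k1] *= cmath.exp(-2j * cmath.pi * i2 * k1 / n)
    # Step 3: length-n2 DFTs over i2 for each fixed k1.
    out = [0j] * n
    for k1 in range(n1):
        col = dft([inner[i2][k1] for i2 in range(n2)])
        for k2 in range(n2):
            out[k1 + n1 * k2] = col[k2]
    return out
```

Calling `dft_row_column(x, 4, 3)` on a 12-point input, for example, performs the inner stage with four-point transforms, a length that is not a power of two overall.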
3.
This paper presents an optimized 3-D Discrete Wavelet Transform (3-D DWT) architecture. The 1-D DWT employed in the design of the 3-D DWT architecture uses a reduced lifting scheme approach. Further, the architecture is optimized by applying a block enabling technique, scaling, and rounding of the filter coefficients. The proposed architecture uses the biorthogonal (9/7) wavelet filter. The architecture is modeled using Verilog HDL, simulated using ModelSim, synthesized using Xilinx ISE and finally implemented on a Virtex-5 FPGA. The proposed 3-D DWT architecture has a slice register utilization of 5%, an operating frequency of 396 MHz and a power consumption of 0.45 W.
4.
Flipping structure: an efficient VLSI architecture for lifting-based discrete wavelet transform (total citations: 6, self-citations: 0, citations by others: 6)
Chao-Tsung Huang Po-Chih Tseng Liang-Gee Chen 《Signal Processing, IEEE Transactions on》2004,52(4):1080-1089
In this paper, an efficient very large scale integration (VLSI) architecture, called flipping structure, is proposed for the lifting-based discrete wavelet transform. It can provide a variety of hardware implementations to improve and possibly minimize the critical path as well as the memory requirement of the lifting-based discrete wavelet transform by flipping conventional lifting structures. The precision issues are also analyzed. By case studies of the JPEG2000 default lossy (9,7) filter, an integer (9,7) filter, and the (6,10) filter, the efficiency of the proposed flipping structure is demonstrated.
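For readers unfamiliar with lifting, the following sketch shows a conventional (unflipped) lifting implementation of one (9,7) decomposition level: two predict/update pairs followed by scaling. The coefficient values are the commonly quoted ones for this filter and are given here as assumptions, the scaling convention varies between references, and periodic boundary extension is used only to keep the code short (JPEG2000 uses symmetric extension). The flipping transformation proposed in the paper is not shown.

```python
# Commonly quoted lifting coefficients for the (9,7) filter (assumed values).
ALPHA = -1.586134342
BETA = -0.05298011854
GAMMA = 0.8829110762
DELTA = 0.4435068522
K = 1.149604398  # scaling constant; conventions differ between references

def lift_forward(x):
    """One level of lifting on an even-length list, periodic extension.

    Returns (low-pass, high-pass) coefficient lists.
    """
    s, d = list(x[0::2]), list(x[1::2])
    n = len(d)
    d = [d[i] + ALPHA * (s[i] + s[(i + 1) % n]) for i in range(n)]  # predict 1
    s = [s[i] + BETA * (d[i - 1] + d[i]) for i in range(n)]         # update 1
    d = [d[i] + GAMMA * (s[i] + s[(i + 1) % n]) for i in range(n)]  # predict 2
    s = [s[i] + DELTA * (d[i - 1] + d[i]) for i in range(n)]        # update 2
    return [v * K for v in s], [v / K for v in d]

def lift_inverse(s, d):
    """Undo lift_forward by running the steps backwards with opposite signs."""
    n = len(d)
    s, d = [v / K for v in s], [v * K for v in d]
    s = [s[i] - DELTA * (d[i - 1] + d[i]) for i in range(n)]
    d = [d[i] - GAMMA * (s[i] + s[(i + 1) % n]) for i in range(n)]
    s = [s[i] - BETA * (d[i - 1] + d[i]) for i in range(n)]
    d = [d[i] - ALPHA * (s[i] + s[(i + 1) % n]) for i in range(n)]
    x = [0.0] * (2 * n)
    x[0::2], x[1::2] = s, d
    return x
```

Each lifting step is inverted exactly by subtracting what was added, so reconstruction holds regardless of coefficient precision. In this conventional form every step multiplies a sum by a coefficient and adds the result to another sample, and that multiply-then-add chain is the critical path the abstract refers to.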
5.
King-Chu Hung Yu-Jung Huang Trieu-Kien Truong Chia-Ming Wang 《Electronics letters》1998,34(7):639-640
An operator correlation-based algorithm and its VLSI architecture for computing the 2D discrete wavelet transform are presented. The proposed discrete wavelet transform architecture was simulated in Verilog and synthesised with the FPGA compiler. The implementation of the 2D discrete wavelet transform in an FPGA-based design style is described.
6.
7.
Biorthogonal discrete wavelet transform (BDWT) has gained general acceptance as an image processing tool. For example, the
JPEG2000 standard is completely based on the BDWT. In BDWT, the scaling (low-pass) and wavelet (high-pass) filters are symmetric
and linear phase. In this work we show that by using a specific sign modulator the BDWT filter bank can be realized by only
two biorthogonal filters. The analysis and synthesis parts use the same scaling and wavelet filters, which simplifies especially
VLSI designs of the biorthogonal DWT/IDWT transceiver units. Utilizing the symmetry of the scaling and the wavelet filters
we introduce a fast convolution algorithm for implementation of the filter modules. In multiplexer–demultiplexer VLSI applications
both functions can be constructed via two running BDWT filters and the sign modulator.
This work was supported by the National Technology Agency of Finland (TEKES).
8.
Chiper D.F. Swamy M.N.S. Ahmad M.O. Stouraitis T. 《Signal Processing, IEEE Transactions on》2002,50(9):2347-2354
An efficient approach to design very large scale integration (VLSI) architectures and a scheme for the implementation of the discrete sine transform (DST), based on an appropriate decomposition method that uses circular correlations, is presented. The proposed design uses an efficient restructuring of the computation of the DST into two circular correlations, having similar structures and only one half of the length of the original transform; these can be concurrently computed and mapped onto the same systolic array. Significant improvement in the computational speed can be obtained at a reduced input-output (I/O) cost and low hardware complexity, retaining all the other benefits of the VLSI implementations of the discrete transforms, which use circular correlation or cyclic convolution structures. These features are demonstrated by comparing the proposed design with some of the previously reported schemes.
9.
This article presents a parallel architecture for 3-D discrete wavelet transform (3-D DWT). The proposed design is based on the 1-D pipelined lifting scheme. The architecture is fully scalable beyond the present coherent Daubechies filter bank (9, 7). This 3-D DWT architecture has advantages such as no group of pictures restriction and reduced memory referencing. It offers low power consumption, low latency and high throughput. The computing technique is based on the concept that lifting scheme minimises the storage requirement. The application specific integrated circuit implementation of the proposed architecture is done by synthesising it using 65 nm Taiwan Semiconductor Manufacturing Company standard cell library. It offers a speed of 486 MHz with a power consumption of 2.56 mW. This architecture is suitable for real-time video compression even with large frame dimensions.
10.
King-Chu Hung Yao-Shan Hung Yu-Jung Huang 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2001,9(5):565-576
A modified two-dimensional (2-D) discrete periodized wavelet transform (DPWT) based on the homeomorphic high-pass filter and the 2-D operator correlation algorithm is developed in this paper. The advantages of this modified 2-D DPWT are that it can reduce the multiplication counts and the complexity of boundary data processing in comparison to other conventional 2-D DPWT for perfect reconstruction. In addition, a parallel-pipeline architecture of the nonseparable computation algorithm is also proposed to implement this modified 2-D DPWT. This architecture has properties of noninterleaving input data, short bus width request, and short latency. The analysis of the finite precision performance shows that nearly half of the bit length can be saved by using this nonseparable computation algorithm. The operation of the boundary data processing is also described in detail. In the three-stage decomposition of an N×N image, the latency is found to be N^2 + 2N + 18.
11.
12.
13.
The recursive pyramid algorithm (RPA) is a reformulation of the classical pyramid algorithm (PA) for computing the discrete wavelet transform (DWT). The RPA computes the N-point DWT in real time (running DWT) using just L(log N-1) words of storage, as compared with O(N) words required by the PA. L is the length of the wavelet filter. The RPA is combined with the short-length FIR filter algorithms to reduce the number of multiplications and additions.
14.
Seung-Kwon Pack Lee-Sup Kim 《Electronics letters》1998,34(6):537-538
A cost-effective VLSI architecture with separate data-paths and their corresponding filter structure is proposed for performing a two-dimensional discrete wavelet transform (2D DWT). Compared with the conventional 2D DWT VLSI architectures, the proposed semi-recursive 2D DWT VLSI architecture has minimum hardware cost, and optimised data-bus utilisation, scheduling control overhead and storage size.
15.
Bing-Fei Wu Chung-Fu Lin 《Circuits and Systems II: Express Briefs, IEEE Transactions on》2006,53(4):304-308
The experimental results show that using a larger tile size to perform JPEG 2000 coding results in better image quality (i.e., greater than or equal to 256 × 256 tile image). However, processing large tile images also requires relatively high memory for the hardware implementation. For example, it would require tile memory of 256 K words to support the process of a 512 × 512 tile image in the straightforward architecture. To reduce hardware resources, we have proposed the quad code-block (QCB)-based discrete wavelet transform method to reduce the size of tile memory by a factor of 4. In this paper, the remaining 1/4 tile memory can be further reduced through two approaches: the zero-holding extension with slight image degradation and the QCB-block size extension without any image degradation. That is, it only requires 12 K words tile memory to support the process of 512 × 512 tile image by using zero-holding extension, and 13.58 K words memory through QCB-block size extension. The low memory requirement makes the on-chip memory practicable.
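The memory figures quoted in this abstract can be checked with a few lines of arithmetic. The sketch below assumes one word per tile sample and 1 K = 1024 words; the variable and function names are hypothetical, and the 12 K and 13.58 K figures are taken directly from the abstract rather than derived.

```python
K_WORDS = 1024  # assumption: "K words" means 1024 words

def tile_memory_k(side):
    """Tile memory in K words for a side x side tile, one word per sample."""
    return side * side / K_WORDS

full = tile_memory_k(512)   # straightforward architecture
qcb = full / 4              # QCB-based method: reduced by a factor of 4

# Figures quoted in the abstract for the two further reductions.
zero_holding = 12.0
block_extension = 13.58

print(f"straightforward: {full:.0f} K words")
print(f"QCB-based:       {qcb:.0f} K words")
print(f"zero-holding:    {zero_holding} K words ({full / zero_holding:.1f}x smaller than straightforward)")
print(f"QCB-block ext.:  {block_extension} K words ({full / block_extension:.1f}x smaller than straightforward)")
```

Under these assumptions the straightforward architecture needs 256 K words and the QCB-based method 64 K words, which matches the "remaining 1/4 tile memory" that the two extensions then reduce further.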
16.
17.
In this paper, we present a general framework for computing full reference image quality scores in the discrete wavelet domain using the Haar wavelet. In our framework, quality metrics are categorized as either map-based, which generate a quality (distortion) map to be pooled for the final score, e.g., structural similarity (SSIM), or nonmap-based, which only give a final score, e.g., Peak signal-to-noise ratio (PSNR). For map-based metrics, the proposed framework defines a contrast map in the wavelet domain for pooling the quality maps. We also derive a formula to enable the framework to automatically calculate the appropriate level of wavelet decomposition for error-based metrics at a desired viewing distance. To consider the effect of very fine image details in quality assessment, the proposed method defines a multi-level edge map for each image, which comprises only the most informative image subbands. To clarify the application of the framework in computing quality scores, we give some examples to show how the framework can be applied to improve well-known metrics such as SSIM, visual information fidelity (VIF), PSNR, and absolute difference. The proposed framework presents an excellent tradeoff between accuracy and complexity. We compare the complexity of various algorithms obtained by the framework to the IPP-based H.264 baseline profile encoding using C/C++ implementations. For example, by using the framework, we can compute the VIF at about 5% of the complexity of its original version, but with higher accuracy.
18.
19.
Ngai-Fong Law Wan-Chi Siu 《Signal Processing, IEEE Transactions on》2002,50(11):2806-2819
We have studied the computational complexity associated with the overcomplete wavelet transform for the commonly used spline wavelet family. By deriving general expressions for the computational complexity using the conventional filtering implementation, we show that the inverse transform is significantly more costly in computation than the forward transform. To reduce this computational complexity, we propose a new spatial implementation based on the exploitation of the correlation between the lowpass and the bandpass outputs that is inherent in the overcomplete representation. Both theoretical studies and experimental findings show that the proposed spatial implementation can greatly simplify the computations associated with the inverse transform. In particular, the complexity of the inverse transform using the proposed implementation can be reduced to slightly less than that of the forward transform using the conventional filtering implementation. We also demonstrate that the proposed scheme allows the use of an arbitrary boundary extension method while maintaining the ease of the inverse transform.
20.
Lafruit G. Catthoor F. Cornelis J.P.H. De Man H.J. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》1999,7(1):56-68
A folded very large scale integration (VLSI) architecture is presented for the implementation of the two-dimensional discrete wavelet transform, without constraints on the choice of the wavelet-filter bank. The proposed architecture is dedicated to flexible block-oriented image processing, such as adaptive vector quantization used in wavelet image coding. We show that reading the image along a two-dimensional (2-D) pseudo-fractal scan creates a very modular and regular data flow and, therefore, considerably reduces the folding complexity and memory requirements for VLSI implementation. This leads to significant area savings for on-chip storage (up to a factor of two) and reduces the power consumption. Furthermore, data scheduling and memory management remain very simple. The end result is an efficient VLSI implementation with a reduced area cost compared to the conventional approaches, reading the input data line by line.