Similar articles
 20 similar articles found
1.
In this paper, a novel VLSI algorithm for computing the two-dimensional discrete cosine transform is proposed. The 2D-DCT equation can be expressed as a sum of high-order cosine functions, and the algorithm can be realized by combining a highly efficient first-order recursive structure with some simplified matrix multiplications, which results in a highly regular hardware architecture and simple routing. The algorithm has temporal and spatial locality of connection and can be segmented for pipelined operation, so the computation time is greatly reduced. Owing to the simplicity of its hardware structure, it is especially well suited to VLSI implementation.

2.
A new implementation of an 8 × 8 two-dimensional discrete cosine transform (2D-DCT) processor based on the residue number system (RNS) is presented. This architecture makes use of a fast cosine transform algorithm. It is shown that the RNS implementation of the 2D-DCT over field-programmable logic devices leads to a 129% throughput improvement over the equivalent binary system.

3.
This paper presents a cost-effective 2D-DCT processor based on a fast row/column decomposition approach. With a particular schedule, the processor does not require a transpose memory for 2D-DCT computation. We rearrange the cosine coefficients of the first and second 1D-DCT transformations to keep the DC coefficient error-free. The new architecture uses state machines rather than a ROM table to generate the cosine coefficients, which saves the memory cells and the address generator. For an 8 × 8 DCT realization, the circuit needs only 36 adders and no multipliers, and the whole chip uses about 19 k transistors. The chip area is about 4 mm² in the TSMC 0.35 µm CMOS process. The circuit complexity is only 1/3 to 1/5 of that of conventional DCT chips.
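The row/column decomposition named in this abstract can be illustrated in software. The following is a minimal sketch, not the paper's circuit: it assumes an orthonormal DCT-II, uses a direct O(N²) 1D transform, and performs an explicit transpose (which the described processor avoids by scheduling). The names `dct_1d`, `transpose`, and `dct_2d` are hypothetical.

```python
import math


def dct_1d(x):
    """Orthonormal DCT-II of a sequence, computed directly in O(N^2)."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        acc = sum(
            x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i in range(n)
        )
        out.append(scale * acc)
    return out


def transpose(m):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(col) for col in zip(*m)]


def dct_2d(block):
    """2D DCT by row/column decomposition.

    Pass 1 applies the 1D DCT to every row; pass 2 applies it to every
    column of the intermediate result (via an explicit transpose here).
    """
    row_pass = [dct_1d(row) for row in block]
    col_pass = [dct_1d(col) for col in transpose(row_pass)]
    return transpose(col_pass)


# A constant 8 x 8 block has all of its energy in the DC coefficient.
flat = [[10.0] * 8 for _ in range(8)]
coeffs = dct_2d(flat)
```

Because the 2D transform is separable, the two passes can share one 1D unit in hardware; the transpose step is the part that a transpose memory (or a particular schedule) has to provide.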

4.
The 2D discrete cosine transform (2D-DCT) is one of the most popular transforms for video coding. Yet the 2D-DCT may not represent video data efficiently with few coefficients for blocks with oblique features. To further improve the compression gain for such video data, this paper presents a directional transform framework based on the direction-adaptive fixed-length discrete cosine transform (DAFL-DCT) for intra- and inter-frame coding. The proposed framework selects the most suitable of eight proposed directional transform modes for each block, and a modified zigzag scanning pattern rearranges the transformed coefficients into a 1D array suitable for entropy encoding. The proposed scheme is analysed on JM 18.6 of the H.264/AVC platform. Performance comparisons have been made with respect to rate-distortion (RD) performance, Bjontegaard metrics, encoding time, etc. The proposed transform scheme outperforms the conventional 2D-DCT and other state-of-the-art techniques in terms of compression gain and subjective quality.
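The abstract does not specify its modified scanning pattern, so the sketch below shows only the conventional zigzag scan that such schemes start from: coefficients are read along anti-diagonals, alternating direction, so that low-frequency terms come first in the 1D array. The function names are hypothetical.

```python
def zigzag_order(n):
    """Return the (row, col) pairs of an n x n block in conventional zigzag order."""
    order = []
    for s in range(2 * n - 1):  # s = row + col selects one anti-diagonal
        lo = max(0, s - n + 1)
        hi = min(s, n - 1)
        rows = range(lo, hi + 1)
        if s % 2 == 0:
            # Even diagonals are traversed from bottom-left to top-right.
            rows = reversed(rows)
        order.extend((r, s - r) for r in rows)
    return order


def zigzag_scan(block):
    """Flatten a square block into a 1D list following the zigzag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


# Example: a 4 x 4 block whose entries are their own row-major indices.
block4 = [[r * 4 + c for c in range(4)] for r in range(4)]
scanned = zigzag_scan(block4)
```

A direction-adaptive scheme would presumably substitute a different ordering per transform mode; that part is not reproduced here.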

5.

This work introduces the three-dimensional steerable discrete cosine transform (3D-SDCT), which is obtained from the relationship between the discrete cosine transform (DCT) and the graph Fourier transform of a signal on a path graph. One employs the fact that the basis vectors of the 3D-DCT constitute a possible eigenbasis for the Laplacian of the product of such graphs. The proposed transform employs a rotated version of the 3D-DCT basis. We then evaluate the applicability of the 3D-SDCT in the field of 3D medical image compression. We consider the case where we have only one pair of rotation angles per block, rotating all the 3D-DCT basis vectors by the same pair. The obtained results show that the 3D-SDCT can be efficiently used in the referred application scenario and it outperforms the classical 3D-DCT.


6.
This study presents the design of a two-dimensional (2D) discrete cosine transform (DCT) hardware architecture dedicated to High Efficiency Video Coding (HEVC) on field programmable gate array (FPGA) platforms. The proposed methodology organizes the 2D-DCT computation to fit the internal components and characteristics of FPGA resources. A four-stage circuit architecture is developed to implement the proposed methodology. This architecture supports variable DCT sizes, including 4 × 4, 8 × 8, 16 × 16, and 32 × 32. The proposed architecture has been implemented in SystemVerilog and synthesized on various FPGA platforms. Compared with existing related work in the literature, the proposed architecture demonstrates significant advantages in hardware cost and performance. It is able to sustain 4K@30 fps ultra high definition (UHD) TV real-time encoding applications with a reduction of 31–64% in hardware cost.

7.
This paper proposes two new 2D spectral estimation methods. The 2D modified magnitude group delay (2D-MMGD) is applied to the 2D discrete Fourier transform (2D-DFT) in the first method and to the analytic 2D discrete cosine transform in the second. The analytic 2D-DCT preserves the desirable properties of the DCT (such as improved frequency resolution, leakage, and detectability) and is realized by a 2D discrete cosine transform (2D-DCT) and its Hilbert transform. The 2D-MMGD is an extension from 1D to 2D, and it reduces the variance while preserving the original frequency resolution of the 2D-DFT or the analytic 2D-DCT, depending on which it is applied to. The first and second methods are referred to as DFT-MMGD and DCT-MMGD, respectively. The proposed methods are applied to 2D sinusoids and a 2D AR process in Gaussian white noise. The performance of DCT-MMGD is found to be superior to that of DFT-MMGD in terms of variance, frequency resolution, and detectability. Both DFT-MMGD and DCT-MMGD perform better than the 2D-LP method, even when the signal-to-noise ratio is low.

8.
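Against the background of the Audio Video coding Standard (AVS), and starting from the basic principles of the discrete cosine transform (DCT), this paper studies a method for implementing a fast 2D-DCT on a field programmable gate array (FPGA). The design uses row-column decomposition to split the 8×8 2D-DCT into two 1D-DCTs and implements the multiplications by shift-and-add. In addition, a single 1D-DCT module is used to perform the whole 2D-DCT, which saves hardware resources and also improves computation speed. Finally, the FPGA implementation and simulation results were obtained with the FPGA simulation tool MODELSIM SE 6.2b.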
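The shift-and-add multiplication mentioned in this abstract can be sketched as follows. This is an illustration of the general technique, not the paper's design: the 12-bit fractional precision (`FRAC_BITS`) and the function names are assumptions made for the example. A cosine coefficient is quantized to a fixed-point integer, and the product is formed by adding one shifted copy of the input for each set bit of that constant.

```python
import math

FRAC_BITS = 12  # assumed fixed-point precision; not taken from the source


def to_fixed(coeff, frac_bits=FRAC_BITS):
    """Quantize a non-negative real coefficient to a fixed-point integer."""
    return round(coeff * (1 << frac_bits))


def shift_add_multiply(x, coeff_fixed, frac_bits=FRAC_BITS):
    """Multiply integer x by a non-negative fixed-point constant.

    Only shifts and additions are used: each set bit of the constant
    contributes one shifted copy of x to the accumulator.
    """
    acc = 0
    bit = 0
    c = coeff_fixed
    while c:
        if c & 1:
            acc += x << bit
        c >>= 1
        bit += 1
    return acc >> frac_bits  # drop the fractional bits (rounds toward -infinity)


# cos(pi/4) is a typical DCT coefficient; with 12 fractional bits it becomes 2896.
c4 = to_fixed(math.cos(math.pi / 4))
product = shift_add_multiply(100, c4)
```

In hardware the loop disappears: because the constant is known at design time, each set bit becomes a fixed wire shift feeding an adder, which is why such designs are counted in adders rather than multipliers.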

9.
Ma W. 《Electronics Letters》1991,27(3):201-202
The algorithm and architecture of a 2-D systolic array processor for the DCT (discrete cosine transform) are proposed. It is based on the relationship between the DCT and the cosine DFT and sine DFT. Two systolic architectures for 1-D DCT data and control flow computation are discussed. Using the main features of the two systolic 1-D DCT arrays, a full 2-D systolic DCT array is presented.

10.
何业军  刘鹏  雷海军  提干  李先义 《电视技术》2011,35(15):68-70,83
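A 2D-DCT VLSI architecture based on a five-stage pipelined high-precision vector multiplier is proposed. To increase speed, a pipeline consisting of 1D-DCT row processing, a transpose RAM, and 1D-DCT column processing is used instead of reusing a single 1D-DCT unit. Within the 1D-DCT module, the coefficient multiplications use a parallel multiplier structure, which further increases computation speed. For high precision, a shift-based scheme is adopted that is accurate to five decimal places, meeting high-precision...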

11.
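In TD-LTE (Time Division-Long Term Evolution) systems, the Doppler shift caused by high-speed movement severely degrades the performance of conventional channel estimation algorithms based on DFT (discrete Fourier transform) interpolation. To address this problem, this paper proposes a two-dimensional discrete cosine transform (2D-DCT) channel estimation algorithm that exploits the pilot pattern of the TD-LTE system, and gives a detailed derivation. Simulation results show that the method outperforms the 2D-DFT algorithm and comes close to the 2D Wiener filter estimation algorithm.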

12.
Robust digital watermark embedding algorithm for volume data based on the 3D DCT   Total citations: 4 (self-citations: 0, citations by others: 4)
刘旺  姜守达  孙圣和 《电子学报》2005,33(12):2174-2177
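Based on the three-dimensional discrete cosine transform (3D-DCT) and spread-spectrum communication techniques, this paper proposes a robust watermark embedding algorithm for 3D volume data. The algorithm uses spread-spectrum techniques to convert the original binary image watermark into volume watermark information that can be embedded directly. Block-wise discrete cosine transforms are applied to both the volume watermark information and the original volume data, and the watermark is embedded in the transform domain. The embedded watermark is invisible and resists common attacks such as cropping, noise addition, filtering, and rotation. Simulation experiments verify the effectiveness of the algorithm.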

13.
This investigation proposes a novel radix-4² algorithm that has the low computational complexity of a radix-16 algorithm but the lower hardware requirement of a radix-4 algorithm. The proposed pipelined radix-4² single delay feedback path (R4²SDF) architecture adopts a multiplierless radix-4 butterfly structure, based on the specific linear mapping of the common factor algorithm (CFA), to support both 256-point fast Fourier transform/inverse fast Fourier transform (FFT/IFFT) and 8 × 8 2D discrete cosine transform (DCT) modes with a highly efficient feedback shift register architecture. The segment shift register (SSR) and overturn shift register (OSR) structures are adopted to minimize the register cost of the input reordering and post-computation operations, respectively, in the 8 × 8 2D DCT mode. Moreover, the retrenched constant multiplier and eight-folded complex multiplier structures are adopted to decrease the multiplier cost and the coefficient ROM size by means of the complex conjugate symmetry rule and subexpression elimination. To further decrease the chip cost, a finite-wordlength analysis is provided, indicating that the proposed architecture requires only a 13-bit internal wordlength to achieve 40-dB signal-to-noise ratio (SNR) performance in the 256-point FFT/IFFT modes and high digital video (DV) compression quality in the 8 × 8 2D DCT mode. The comprehensive comparison results indicate that the proposed cost-effective reconfigurable design has the smallest hardware requirement and the largest hardware utilization among the tested architectures for FFT/IFFT computation, and thus has the highest cost efficiency. The derivation and chip implementation results show that the proposed pipelined 256-point FFT/IFFT/2D DCT triple-mode chip consumes 22.37 mW at 100 MHz with a 1.2-V supply voltage in the TSMC 0.13-µm CMOS process, which makes it very appropriate as RSoC IP for next-generation handheld devices.

14.
On the on-line computation of DCT-IV and DST-IV transforms   Total citations: 1 (self-citations: 0, citations by others: 1)
Various options available for the on-line computation of the discrete cosine transform-IV (DCT-IV) and the discrete sine transform-IV (DST-IV) in hardware are considered and compared. A novel architecture for the simultaneous, real-time computation of both transforms, based on the decomposition of the odd-time, odd-frequency discrete Fourier transform (O² DFT), is also proposed.

15.
The three-dimensional discrete cosine transform (3D-DCT) has been researched as an alternative to the dominant existing video standards, which are based on motion estimation and compensation. Since it does not need to search macroblocks for inter/intra prediction, the 3D-DCT has a great advantage in complexity. However, it has not been developed well because of poor video quality, while video standards such as H.263(+) and HEVC have flourished. In this paper, we propose a new 3D-DCT video coding scheme as a video solution for low-power mobile technologies such as the Internet of Things (IoT) and drones. We focus on overcoming the drawbacks reported in previous research. We build a complete 3D-DCT video coding system by adopting existing advanced techniques and devising new coding algorithms to improve the overall performance of the 3D-DCT. Experimental results show that the proposed 3D-DCT outperforms the H.264 low-power profiles while offering lower complexity. In terms of GBD-PSNR, the proposed 3D-DCT performs better by 4.6 dB on average.

16.
In this article, a novel block-based visible image watermarking VLSI architecture and its hardware implementation on a field programmable gate array (FPGA) are proposed. In this watermarking process, the 1D-DCT is introduced to facilitate hardware implementation. A mathematical model is developed to reduce the computational complexity of calculating the embedding and scaling factors, which are used to give the resulting image the best quality with uniform watermark visibility. The proposed architecture has a 12-stage pipeline. Parallelism is employed at the block level in order to achieve high performance. A single 8-point fast 1D-DCT is used to calculate the DCT coefficients of both the host image and the watermark image, which minimizes resource utilization and power consumption. The hardware implementation of this algorithm has numerous advantages, including reduced power and area and higher pipeline throughput. The performance of the architecture is studied by implementing it on a Xilinx Virtex V technology-based FPGA with DSP 48E. The throughput achieved with this VLSI architecture is 5.21 Gbits/s, with a total resource utilization of 4058 BELs.

17.
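VLSI architecture design of a JPEG2000 wavelet transformer   Total citations: 3 (self-citations: 1, citations by others: 2)
The new-generation still image compression standard JPEG2000 adopts the discrete wavelet transform (DWT) as its core transform and recommends the lifting scheme as a fast algorithm for implementing it. The spatial combinative lifting algorithm (SCLA) greatly reduces the computational load of lifting: with the 9/7 wavelet filter, SCLA needs only 7/12 of the multiplications required by lifting. This paper proposes a VLSI architecture that implements SCLA, which reduces the computational load relative to a lifting-based implementation, speeds up the transform, and reduces the circuit size. The 2D forward and inverse wavelet transformer described here is already used as a standalone IP core in the JPEG2000 image codec chip that we are currently developing.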

18.
The authors present an efficient algorithm for the computation of the 4×4 discrete cosine transform (DCT). The algorithm is based on the decomposition of the 4×4 DCT into four 4-point 1-D DCTs. Thus, only 1-D transformations and some additions are required. It is shown that the proposed algorithm requires only 16 multiplications, which is half the number needed for the conventional row-column method. Since the 2^m × 2^m DCT can be computed using the 4×4 DCT recursively for any m, the proposed algorithm leads to a fast algorithm for the computation of the 2-D DCT.

19.
Theoretical and simulation results of using Hamming codes with the two-dimensional discrete cosine transform (2D-DCT) at a transmitted data rate of 1 bit/pixel over a binary symmetric channel (BSC) are presented. The design bit error rate (BER) of interest is 10⁻². The (7, 4), (15, 11), and (31, 26) Hamming codes are used to protect the most important bits in each 16 by 16 transformed block, where the most important bits are determined by calculating the mean squared reconstruction error (MSE) contributed by a channel error in each individual bit. A theoretical expression is given which allows the number of protected bits that achieves minimum MSE to be computed for each code rate. By comparing these minima, the best code and bit allocation can be found. Objective and subjective performance results indicate that using the (7, 4) Hamming code to protect the most important 2D-DCT coefficients can substantially improve reconstructed image quality at a BER of 10⁻². Furthermore, the allocation of 33 out of the 256 bits per block to channel coding does not noticeably degrade reconstructed image quality in the absence of channel errors.
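To make the (7, 4) code concrete, here is a minimal sketch of a Hamming (7, 4) encoder and single-error-correcting decoder. It uses the common textbook bit layout (parity bits at positions 1, 2, and 4), which is an assumption; the abstract does not say which layout the authors used, and the function names are hypothetical.

```python
def hamming74_encode(data):
    """Encode 4 data bits into a 7-bit codeword laid out as p1 p2 d1 p3 d2 d3 d4."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # covers codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # covers codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct at most one flipped bit and return the 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3  # 1-based position of the error, 0 if none
    if syndrome:
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


# Simulate a single channel error on one codeword.
word = [1, 0, 1, 1]
sent = hamming74_encode(word)
received = list(sent)
received[4] ^= 1  # flip one bit, as a binary symmetric channel might
recovered = hamming74_decode(received)
```

Each codeword spends 3 check bits to protect 4 data bits, which is why such a scheme is applied only to the bits whose corruption would contribute most to the reconstruction error.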

20.
Two algorithms are given for the computation of the updated discrete cosine transform-II (DCT-II), discrete sine transform-II (DST-II), discrete cosine transform-IV (DCT-IV), and discrete sine transform-IV (DST-IV). It is pointed out that the algorithm used for the running DCT-IV can also be used to compute the running DST-IV without additional computational overhead. An architecture that is common to the derived algorithms and suitable for their VLSI implementation is also presented. Preliminary studies have shown that the architecture can easily be implemented in VLSI form and that, in conjunction with a high-speed digital signal processor (for example, the ADSP 2100A), it can be used for real-time transform-domain LMS adaptive filtering (128 taps) of speech signals sampled at 8 kHz.
