Similar literature
 18 similar documents found (search time: 515 ms)
1.
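An FPGA implementation architecture for a fast two-dimensional DCT algorithm is proposed. The fast algorithm decomposes the 2-D DCT into two passes of a 1-D DCT. The 1-D DCT uses a parallel pipelined structure to raise the circuit's data throughput and computation speed, and multiplier usage is reduced by simplifying the coefficient matrix and substituting an equivalent butterfly structure. An efficient matrix transposition method is also proposed that reads and writes 8 data words in one clock cycle. Experimental results verify the functional and timing correctness of the 2-D DCT core, which runs at up to 110 MHz and is suitable for real-time image processing based on DCT compression.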
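The row-column decomposition described in this abstract can be sketched in software as follows. This is a minimal illustration, not the authors' hardware design: the function names, the orthonormal DCT-II scaling, and the use of a list transpose in place of the transposition memory are all assumptions made for the example.

```python
import math

def dct_1d(x):
    """Orthonormal DCT-II of a sequence, computed directly from the definition."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        acc = sum(x[i] * math.cos((2 * i + 1) * k * math.pi / (2 * n))
                  for i in range(n))
        out.append(scale * acc)
    return out

def transpose(m):
    """Software stand-in for the transposition memory between the two passes."""
    return [list(col) for col in zip(*m)]

def dct_2d_row_column(block):
    """2-D DCT as two 1-D passes: rows, transpose, rows again, transpose back."""
    first_pass = [dct_1d(row) for row in block]
    second_pass = [dct_1d(row) for row in transpose(first_pass)]
    return transpose(second_pass)

def dct_2d_direct(block):
    """Reference 2-D DCT from the double-sum definition (slow, for checking)."""
    n = len(block)
    def c(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            acc = 0.0
            for i in range(n):
                for j in range(n):
                    acc += (block[i][j]
                            * math.cos((2 * i + 1) * u * math.pi / (2 * n))
                            * math.cos((2 * j + 1) * v * math.pi / (2 * n)))
            out[u][v] = c(u) * c(v) * acc
    return out
```

Both functions should agree to within floating-point rounding on any 8×8 block, which is the property that lets a hardware design reuse or duplicate a single 1-D DCT unit.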

2.
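Zhao Bin (赵滨), Huang Daqing (黄大庆). 《电子设计工程》, 2011, 19(24): 126-129
A new FPGA implementation architecture for the two-dimensional DCT and IDCT is proposed. A fast row-column algorithm decomposes the 2-D transform into two 1-D transforms, each of which uses a parallel pipelined structure that processes 8 data words per clock cycle, greatly increasing the circuit's data throughput and computation speed. The design was simulated with the Modelsim tool, which confirmed the functional correctness of the algorithm. One 8×8 block 2-D DCT takes only 16 clock cycles, meeting the real-time requirements of image and video processing.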

3.
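He Yejun (何业军), Liu Peng (刘鹏), Lei Haijun (雷海军), Ti Gan (提干), Li Xianyi (李先义). 《电视技术》, 2011, 35(15): 68-70, 83
A 2-D DCT VLSI architecture based on a 5-stage pipelined, high-precision vector multiplier is proposed. To increase speed, a pipeline of 1-D DCT row processing, transpose RAM, and 1-D DCT column processing replaces the reuse of a single 1-D DCT unit. Within the 1-D DCT module, the coefficient multiplications are performed by parallel multipliers, which further increases speed. For precision, a shift-based scheme is used that is accurate to 5 decimal places, meeting the high-precision...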
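A shift-based fixed-point scheme of the kind mentioned here can be sketched as below. The abstract is truncated and does not state the word length, so `FRAC_BITS = 17` is an assumption chosen only because rounding to 17 fractional bits bounds the coefficient error by 2^-18, which is below half a unit in the fifth decimal place. The helper names are hypothetical.

```python
import math

FRAC_BITS = 17  # assumed number of fractional bits, not taken from the paper

def to_fixed(value, frac_bits=FRAC_BITS):
    """Quantize a real coefficient to a signed integer with frac_bits fractional bits."""
    return int(round(value * (1 << frac_bits)))

def fixed_mul(sample, coeff_fixed, frac_bits=FRAC_BITS):
    """Integer multiply followed by a rounding right shift back to sample scale."""
    product = sample * coeff_fixed
    return (product + (1 << (frac_bits - 1))) >> frac_bits

# The seven distinct cosine values used by an 8-point DCT.
coeffs = [math.cos(k * math.pi / 16) for k in range(1, 8)]

# Worst-case representation error of the quantized coefficients.
max_coeff_error = max(abs(c - to_fixed(c) / (1 << FRAC_BITS)) for c in coeffs)
```

With this word length `max_coeff_error` stays below 5e-6. The error of a full 2-D transform also depends on how intermediate results are rounded, which this sketch does not model.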

4.
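A 2-D DCT algorithm and its compact VLSI design   Cited by: 1 (self-citations: 1, citations by others: 0)
Using a fast algorithm together with matrix manipulation, a fast implementation of the one-dimensional discrete cosine transform (DCT) is obtained, and a compact very-large-scale integration (VLSI) design architecture is proposed on that basis. Techniques such as reuse of the 1-D DCT unit and a signed-multiplier design are used to realize a compact VLSI design of the 2-D DCT algorithm. Experimental results show that the 2-D DCT design works correctly and yields a very compact circuit.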
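One common matrix manipulation that yields a fast 1-D DCT is the even/odd (butterfly) split of the 8-point coefficient matrix. The abstract does not say which factorization the authors use, so the sketch below is only an illustration of the general idea, with hypothetical function names. It relies on the symmetry cos((2(7-i)+1)kπ/16) = (-1)^k cos((2i+1)kπ/16).

```python
import math

def dct8_direct(x):
    """Orthonormal 8-point DCT-II from the definition: 64 multiplications."""
    out = []
    for k in range(8):
        scale = math.sqrt(1.0 / 8) if k == 0 else math.sqrt(2.0 / 8)
        out.append(scale * sum(x[i] * math.cos((2 * i + 1) * k * math.pi / 16)
                               for i in range(8)))
    return out

def dct8_butterfly(x):
    """Same transform with one butterfly stage: 32 multiplications.

    Even-indexed outputs depend only on the sums x[i] + x[7-i],
    odd-indexed outputs only on the differences x[i] - x[7-i].
    """
    sums = [x[i] + x[7 - i] for i in range(4)]
    diffs = [x[i] - x[7 - i] for i in range(4)]
    out = []
    for k in range(8):
        half = sums if k % 2 == 0 else diffs
        scale = math.sqrt(1.0 / 8) if k == 0 else math.sqrt(2.0 / 8)
        out.append(scale * sum(half[i] * math.cos((2 * i + 1) * k * math.pi / 16)
                               for i in range(4)))
    return out
```

The 8×8 coefficient matrix is thereby replaced by two 4×4 matrices, which halves the multiplier count before any further factorization is applied.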

5.
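A high-speed, low-power 2-D DCT coprocessor design is presented that supports the MPEG2 compression standard and targets the ARM9 core. The coprocessor uses row-column decomposition and a parallel, optimized 2-D DCT data structure, which markedly speeds up the processing of 8×8 data blocks. The 1-D DCT is implemented with an improved CORDIC algorithm in which shifts replace multiplications and the shift operations themselves are optimized. Simulation results show that, while meeting MPEG2 precision and the ARM9 data-transfer frequency, the hardware implementation of this 1-D DCT algorithm is 30% faster than that of reference [2] and 50% smaller in area. The coprocessor can be widely used in the codec modules of mobile multimedia devices.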
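The idea of replacing multiplications by shifts can be illustrated with the textbook rotation-mode CORDIC iteration. This is not the improved CORDIC variant of the paper, whose details are not given in the abstract. The iteration count, the 16-bit fractional scaling, and the function names are assumptions for the sketch.

```python
import math

ITERATIONS = 16
FRAC_BITS = 16

# Elementary rotation angles atan(2^-i) and the accumulated CORDIC gain.
ANGLES = [math.atan(2.0 ** -i) for i in range(ITERATIONS)]
GAIN = 1.0
for i in range(ITERATIONS):
    GAIN *= math.sqrt(1.0 + 2.0 ** (-2 * i))

def cordic_rotate(x, y, angle):
    """Rotate the integer vector (x, y) by angle (radians) using shifts and adds.

    The result is scaled by GAIN. The angle residual z is tracked in floating
    point here for brevity. For fixed DCT angles, hardware can precompute
    the sequence of rotation directions instead.
    """
    z = angle
    for i in range(ITERATIONS):
        if z >= 0:
            x, y, z = x - (y >> i), y + (x >> i), z - ANGLES[i]
        else:
            x, y, z = x + (y >> i), y - (x >> i), z + ANGLES[i]
    return x, y

def cos_sin(angle):
    """Approximate (cos(angle), sin(angle)) by rotating the unit vector."""
    one = 1 << FRAC_BITS
    x, y = cordic_rotate(one, 0, angle)
    return x / (one * GAIN), y / (one * GAIN)
```

Each iteration costs two shifts and three additions and no multiplier. The constant gain is usually folded into a later scaling or quantization step.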

6.
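Design of a 2-D DCT/IDCT processor based on a highly parallel structure   Cited by: 8 (self-citations: 2, citations by others: 6)
This paper presents a two-dimensional 8×8 DCT/IDCT processor design suited to MPEG-4 Simple Profile video compression coding (Simple Profile Layer 1-3). The design makes full use of the similarity between the DCT and the IDCT and of the symmetry of the algorithm. It uses a highly parallel structure to speed up processing, reuses a single 1-D DCT/IDCT unit to perform the 2-D DCT/IDCT, and adopts a simplified multiplier design. While meeting the speed and precision requirements, it realizes a high-performance 2-D DCT/IDCT processor with a relatively small number of transistors.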

7.
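To implement the two-dimensional discrete cosine transform (DCT) and inverse discrete cosine transform (IDCT), this paper presents a 2-D DCT/IDCT circuit in which one adder and two shifters replace each multiplier. By choosing specific coefficients, the hardware avoids multipliers, which consume more resources and are slower, resulting in an efficient multiplierless DCT circuit. The circuit needs only a small number of adders and shifters and can reach high precision.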
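Replacing a multiplier with one adder and two shifters amounts to restricting each coefficient to the form 2^-a ± 2^-b. The abstract does not list the coefficients actually chosen, so the sketch below simply searches for the closest such value to a given coefficient. The function names and the shift range are assumptions.

```python
def best_two_shift(coeff, max_shift=8):
    """Find (a, b, sign) minimizing |coeff - (2**-a + sign * 2**-b)|."""
    best = None
    for a in range(max_shift + 1):
        for b in range(max_shift + 1):
            for sign in (1, -1):
                approx = 2.0 ** -a + sign * 2.0 ** -b
                err = abs(coeff - approx)
                if best is None or err < best[0]:
                    best = (err, a, b, sign)
    return best[1:]

def shift_add_multiply(x, a, b, sign):
    """Multiply integer x by (2**-a + sign * 2**-b) with two shifts and one add."""
    return (x >> a) + sign * (x >> b)
```

For example, 0.75 is represented exactly (as 1 - 2^-2 or 2^-1 + 2^-2), so `shift_add_multiply(64, *best_two_shift(0.75))` gives 48. A coefficient such as cos(π/4) ≈ 0.7071 is only approximated to within about 0.043 by two terms, which is presumably why the choice of specific coefficients matters for precision.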

8.
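This paper proposes a low-power 2-D DCT architecture based on a matrix-vector multiplier. By sharing the product terms of the matrix-vector multiplication as far as possible, the architecture reduces the number of multiplications in the 2-D DCT and thus achieves low-power computation. The design also supports control of the computational precision of the matrix-vector multiplier, which allows the power consumption of the 2-D DCT processor to be adjusted. Verification on an FPGA hardware platform shows that the design saves more than 35% of the power consumed by a conventional 2-D DCT design based on shift-and-accumulate multipliers.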

9.
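The image DCT is an important technique for image compression, and how to compress images accurately and quickly has long been an active research topic in China and abroad. Two methods for the two-dimensional discrete cosine transform (DCT) are studied. The DCT algorithm structure exploits the separability of the transform and its row-column decomposability, and row-column decomposition converts the 2-D DCT into two 1-D DCTs executed in series.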

10.
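In H.264 video coding, the two-dimensional discrete cosine transform (DCT) converts the signal from the time domain to the frequency domain. An efficient pipelined H.264 DCT hardware circuit is designed on a field-programmable gate array (FPGA). First, the 2-D 4×4 DCT is converted into two 1-D DCTs. Second, a dual-port RAM is placed between the two DCT stages to transpose the data. Finally, a finite-state machine at the top level controls the whole flow. The design achieves good functionality with few resources, and reliable experimental results were obtained.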
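The three steps in this abstract can be sketched in software, assuming the circuit implements the standard H.264 4×4 forward core transform Y = C·X·Cᵀ with C = [[1,1,1,1],[2,1,-1,-2],[1,-1,-1,1],[1,-2,2,-1]]. The abstract does not spell out the matrix, so that is an assumption, as are the function names. The list transpose stands in for the dual-port RAM, and the post-scaling that H.264 folds into quantization is omitted.

```python
def core_1d(a, b, c, d):
    """One 4-point pass of the H.264 core transform using only adds and shifts."""
    s0, s1 = a + d, b + c
    d0, d1 = a - d, b - c
    return [s0 + s1, (d0 << 1) + d1, s0 - s1, d0 - (d1 << 1)]

def transpose(m):
    """Software stand-in for the dual-port transpose RAM."""
    return [list(col) for col in zip(*m)]

def forward_4x4(block):
    """Row pass, transpose, second row pass, transpose back: C * X * C^T."""
    first = [core_1d(*row) for row in block]
    second = [core_1d(*row) for row in transpose(first)]
    return transpose(second)
```

All arithmetic is on integers, so the result is exact and the top-left output equals the sum of the 16 input samples.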

11.
A direct method for the computation of 2-D DCT/IDCT on a linear-array architecture is presented. The 2-D DCT/IDCT is first converted into its corresponding 1-D DCT/IDCT problem through proper input/output index reordering. Then, a new coefficient matrix factorisation is derived, leading to a cascade of several basic computation blocks. Unlike other previously proposed high-speed 2-D N × N DCT/IDCT processors that usually require intermediate transpose memory and have computation complexity O(N³), the proposed hardware-efficient architecture with distributed memory structure has computation complexity O(N² log₂ N) and requires only log₂ N multipliers. The new pipelinable and scalable 2-D DCT/IDCT processor uses storage elements local to the processing elements and thus does not require any address generation hardware or global memory-to-array routing.

12.
This paper presents a cost-effective 2D-DCT processor based on a fast row/column decomposition approach. With a particular schedule, the processor does not require transpose memory for 2D-DCT computation. We re-arrange the cosine coefficients of the first and second 1D-DCT transformations to keep the DC coefficient error-free. The new architecture uses state machines rather than a ROM table to generate the cosine coefficients, saving the memory cells and the address generator. For an 8 × 8 DCT, the circuit needs only 36 adders and no multipliers, and the whole chip uses about 19 k transistors. The chip area is about 4 mm² in a TSMC 0.35 μm CMOS process. The circuit complexity is only 1/3 to 1/5 that of conventional DCT chips.

13.
Ma, W. Electronics Letters, 1991, 27(3): 201-202
The algorithm and architecture of a 2-D systolic array processor for the DCT (discrete cosine transform) are proposed. It is based on the relationship between DCT and cosine DFT and sine DFT. Two systolic architectures of 1-D DCT data and control flow computation are discussed. By use of the main feature of the two systolic 1-D arrays for DCT, a full 2-D systolic DCT array is presented.
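One standard way to relate the DCT to cosine and sine DFT-like sums follows from cos(A + B) = cos A cos B - sin A sin B with A = πnk/N and B = πk/(2N). Whether this is exactly the relationship used in the paper is an assumption. The sketch below uses the unnormalized DCT-II and hypothetical function names.

```python
import math

def dct_direct(x):
    """Unnormalized DCT-II: X[k] = sum_n x[n] * cos((2n + 1) * k * pi / (2N))."""
    n = len(x)
    return [sum(x[i] * math.cos((2 * i + 1) * k * math.pi / (2 * n))
                for i in range(n)) for k in range(n)]

def dct_via_cos_sin_dft(x):
    """Same DCT assembled from a cosine sum C[k] and a sine sum S[k].

    C[k] = sum_n x[n] * cos(pi * n * k / N)
    S[k] = sum_n x[n] * sin(pi * n * k / N)
    X[k] = cos(pi * k / (2N)) * C[k] - sin(pi * k / (2N)) * S[k]
    """
    n = len(x)
    out = []
    for k in range(n):
        c_k = sum(x[i] * math.cos(math.pi * i * k / n) for i in range(n))
        s_k = sum(x[i] * math.sin(math.pi * i * k / n) for i in range(n))
        phase = math.pi * k / (2 * n)
        out.append(math.cos(phase) * c_k - math.sin(phase) * s_k)
    return out
```

The two sums have a regular index pattern (the kernel depends only on the product n·k), which is what makes them convenient to map onto systolic arrays.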

14.
This paper presents a 2-D DCT/IDCT processor chip for high data rate image processing and video coding. It uses a fully pipelined row-column decomposition method based on two 1-D DCT processors and a transpose buffer based on D-type flip-flops with a double serial input/output data-flow. The proposed architecture allows the main processing elements and arithmetic units to operate in parallel at half the frequency of the data input rate. The main characteristics are: high throughput, parallel processing, reduced internal storage, and maximum efficiency in computational elements. The processor has been implemented using standard cell design methodology in 0.35 μm CMOS technology. It measures 6.25 mm² (the core is 3 mm²) and contains a total of 11.7 k gates. The maximum frequency is 300 MHz with a latency of 172 cycles for 2-D DCT and 178 cycles for 2-D IDCT. The computing time of a block is close to 580 ns. It has been designed to meet the demands of IEEE Std. 1180-1990 used in different video codecs. The good performance in computing speed and hardware cost indicates that this processor is suitable for HDTV applications. This work was supported by the Spanish Ministry of Science and Technology (TIC2000-1289).
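As a quick plausibility check of the timing figures, latency in cycles can be converted to time at the stated 300 MHz clock. That the "computing time of a block" corresponds to these latencies is an assumption, since the abstract does not define it. The helper name is hypothetical.

```python
def cycles_to_ns(cycles, clock_hz):
    """Convert a cycle count to nanoseconds at the given clock frequency."""
    return cycles / clock_hz * 1e9

CLOCK_HZ = 300e6  # maximum frequency stated in the abstract

dct_latency_ns = cycles_to_ns(172, CLOCK_HZ)   # roughly 573 ns
idct_latency_ns = cycles_to_ns(178, CLOCK_HZ)  # roughly 593 ns
```

Both values bracket the quoted figure of about 580 ns, so the numbers in the abstract appear mutually consistent under this reading.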

15.
This paper presents a CMOS image sensor with on-chip compression using an analog two-dimensional discrete cosine transform (2-D DCT) processor and a variable quantization level analog-to-digital converter (ADC). The analog 2-D DCT processor is essentially suitable for the on-sensor image compression, since the analog image sensor signal can be directly processed. The small and low-power nature of the analog design allows us to achieve low-power, low-cost, one-chip digital video cameras. The 8×8-point analog 2-D DCT processor is designed with fully differential switched-capacitor circuits to obtain sufficient precision for video compression purposes. An imager array has a dedicated eight-channel parallel readout scheme for direct encoding with the analog 2-D DCT processor. The variable level quantization after the 2-D DCT can be performed by the ADC at the same time. A prototype CMOS image sensor integrating these core circuits for compression is implemented based on triple-metal double-polysilicon 0.35-μm CMOS technology. Image encoding using the implemented analog 2-D DCT processor to the image captured by the sensor is successfully performed. The maximum peak signal-to-noise ratio (PSNR) is 36.7 dB.
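For reference, the PSNR figure quoted above is conventionally computed as 10·log10(peak² / MSE). The sketch below assumes 8-bit images with a peak value of 255, which the abstract does not state, and the function name is hypothetical.

```python
import math

def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2-D images."""
    squared_errors = [(o - r) ** 2
                      for row_o, row_r in zip(original, reconstructed)
                      for o, r in zip(row_o, row_r)]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak * peak / mse)
```

Under the 8-bit assumption, a PSNR of 36.7 dB corresponds to a mean squared error of roughly 14, that is, an RMS error of about 3.7 grey levels.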

16.
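The moment-based discrete cosine transform algorithm and a systolic-array algorithm structure suited to VLSI implementation are described. Design approaches for a DCT processor system are then discussed in terms of hardware/software partitioning and circuit implementation techniques. Finally, the computation block diagram of the moment-based implementation, the circuit block diagram, and the structural design of the peripheral driver software are given.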

17.
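For multiband orthogonal frequency-division multiplexing ultra-wideband (MB-OFDM UWB) systems, a 128-point IFFT/FFT processor architecture with high throughput, mixed word length, mixed radix, and 4 parallel data paths is proposed. The processor uses a modified fixed-width Booth multiplier with error compensation and CSD constant multipliers, which improves precision and reduces hardware complexity. Analysis shows that the scheme uses 49% fewer multiplier resources than the mixed-radix multipath delay feedback (MRMDF) architecture and, at comparable hardware cost, uses 30% less memory and delivers 33% higher throughput than a two-parallel-data-path architecture, so that the processor strikes the best trade-off among precision, hardware cost, and speed. In a 0.18 μm CMOS process, the processor reaches a maximum operating frequency of 300 MHz and a throughput of 1.2 Gsamples/s, meeting the requirements of gigabit wireless personal area networks (WPAN).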
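A CSD (canonical signed digit) constant multiplier recodes a fixed coefficient with digits in {-1, 0, +1} so that no two adjacent digits are nonzero. Each nonzero digit then costs one shifted add or subtract. The sketch below shows the standard recoding for a non-negative integer constant. The function names are hypothetical and the example constant is not taken from the paper.

```python
def to_csd(n):
    """Return the CSD digits of a non-negative integer, least significant first."""
    digits = []
    while n != 0:
        if n & 1:
            d = 2 - (n & 3)  # +1 when n mod 4 == 1, -1 when n mod 4 == 3
            n -= d
        else:
            d = 0
        digits.append(d)
        n >>= 1
    return digits

def csd_multiply(x, digits):
    """Multiply x by the constant encoded in digits using shifts and add/subtract."""
    acc = 0
    for shift, d in enumerate(digits):
        if d == 1:
            acc += x << shift
        elif d == -1:
            acc -= x << shift
    return acc
```

For instance, 7 is recoded as 8 - 1, so multiplying by 7 needs one subtraction instead of two additions. The stated throughput is also consistent with the architecture: 4 parallel data paths at 300 MHz give 1.2 Gsamples/s.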

18.
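FPGA implementation of the 2-D DCT based on the DA algorithm   Cited by: 2 (self-citations: 0, citations by others: 2)
Li Li (李莉), Ning Fan (宁帆), Wei Jusheng (魏巨升). 《现代电子技术》, 2006, 29(10): 44-46, 49
A method for implementing a very-high-performance two-dimensional discrete cosine transform (DCT) on a field-programmable gate array (FPGA) is studied. The DCT algorithm structure exploits the separability of the transform and its row-column decomposability: row-column decomposition converts the 2-D DCT into two 1-D DCTs executed in series. A multiply-accumulate structure based on distributed arithmetic (DA) is used as well, which greatly reduces the hardware resources required, increases computation speed, and substantially improves the real-time performance of image processing. The FPGA implementation and simulation results are also given.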
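Distributed arithmetic replaces the multipliers of a fixed-coefficient inner product with a lookup table addressed by one bit from each input per step. The sketch below is a generic bit-serial version for two's-complement inputs. The coefficient values, word length, and function names are illustrative assumptions, not details from the paper.

```python
def build_lut(coeffs):
    """Precompute, for every bit pattern, the sum of the selected coefficients."""
    return [sum(c for bit, c in enumerate(coeffs) if (addr >> bit) & 1)
            for addr in range(1 << len(coeffs))]

def da_inner_product(lut, samples, bits):
    """Compute sum(coeffs[i] * samples[i]) bit-serially with no multiplications.

    samples are signed integers in [-2**(bits-1), 2**(bits-1)).
    """
    mask = (1 << bits) - 1
    words = [s & mask for s in samples]  # two's-complement bit patterns
    acc = 0
    for b in range(bits):
        addr = 0
        for idx, w in enumerate(words):
            addr |= ((w >> b) & 1) << idx
        term = lut[addr] << b
        if b == bits - 1:
            acc -= term  # the sign bit carries weight -2**(bits-1)
        else:
            acc += term
    return acc
```

With integer coefficients `[3, -5, 7, 2]` and 8-bit samples `[10, -4, 0, -128]`, the bit-serial result matches the ordinary inner product, -206. For an 8-point DCT row the table would hold sums of quantized cosine coefficients. Splitting the table into smaller ones is a common way to limit its size.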
