Found 18 similar documents (search time: 515 ms)
1.
2.
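A new FPGA implementation structure for the two-dimensional DCT and IDCT is proposed. A fast row-column algorithm decomposes the two-dimensional transform into two one-dimensional transforms, each implemented as a parallel pipeline that processes 8 data samples per clock cycle, which greatly improves the data throughput and computation speed of the circuit. Simulation of the design with the ModelSim tool confirms that the algorithm is functionally correct; one 8×8 block two-dimensional DCT takes only 16 clock cycles, which meets the real-time requirements of image and video processing.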
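The row-column decomposition described above can be sketched in software. The following is a minimal pure-Python illustration, not the FPGA design itself: the function names and the orthonormal DCT-II scaling are assumptions chosen for the example. It checks that two 1-D passes (rows, then columns) give the same result as the 2-D double-sum definition.

```python
import math

def dct_1d(v):
    """Orthonormal 1-D DCT-II of a sequence (direct O(N^2) form, no fast algorithm)."""
    n = len(v)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        acc = sum(v[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                  for i in range(n))
        out.append(scale * acc)
    return out

def transpose(m):
    """Swap rows and columns of a square list-of-lists matrix."""
    return [list(col) for col in zip(*m)]

def dct_2d_row_column(block):
    """2-D DCT computed as two 1-D passes: every row, then every column."""
    rows_done = [dct_1d(row) for row in block]                  # first 1-D pass
    cols_done = [dct_1d(col) for col in transpose(rows_done)]   # second 1-D pass
    return transpose(cols_done)

def dct_2d_direct(block):
    """Reference: 2-D DCT-II evaluated straight from its double-sum definition."""
    n = len(block)

    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            acc = 0.0
            for x in range(n):
                for y in range(n):
                    acc += (block[x][y]
                            * math.cos(math.pi * (2 * x + 1) * u / (2 * n))
                            * math.cos(math.pi * (2 * y + 1) * v / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * acc
    return out

if __name__ == "__main__":
    # Arbitrary 8x8 test block (illustrative data, not from the paper).
    block = [[(x * 8 + y) % 13 for y in range(8)] for x in range(8)]
    a = dct_2d_row_column(block)
    b = dct_2d_direct(block)
    print(max(abs(a[u][v] - b[u][v]) for u in range(8) for v in range(8)))
```

One plausible reading of the 16-clock figure, stated here as an assumption rather than something the abstract spells out, is that an 8×8 block holds 64 samples, so at 8 samples per clock each 1-D pass takes 8 clocks and the two passes together take 16.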
3.
4.
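Two-dimensional DCT algorithm and its compact VLSI design  Total citations: 1, self-citations: 1, citations by others: 0
Using a fast algorithm together with matrix transformations, a fast implementation of the one-dimensional discrete cosine transform (DCT) is derived, and a compact very-large-scale integration (VLSI) design architecture is proposed on that basis. Techniques such as reuse of the one-dimensional DCT unit and a signed multiplier design are used to realize a compact VLSI design of the two-dimensional DCT algorithm. Experimental results show that the proposed two-dimensional DCT design is effective and yields a very compact circuit.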
5.
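This paper presents a design study of a high-speed, low-power two-dimensional DCT coprocessor that supports the MPEG2 compression standard and targets the ARM9 core. The coprocessor uses row-column decomposition and a parallel-optimized two-dimensional DCT data structure, which markedly increases the processing speed for 8×8 data blocks. In addition, the one-dimensional DCT is implemented with an improved CORDIC algorithm that replaces multiplications with shifts and optimizes the shift operations. Simulation results show that, while meeting MPEG2 precision and the ARM9 data transfer frequency, this hardware implementation of the one-dimensional DCT is 30% faster than that of reference [2] and occupies 50% less area. The coprocessor can be widely used in the codec modules of mobile multimedia devices.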
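To illustrate the idea of replacing multiplications with shifts, here is a minimal sketch of the textbook rotation-mode CORDIC iteration in fixed point. It is not the improved variant the abstract refers to, whose details are not given; the word length, iteration count, and all names below are assumptions made for the example.

```python
import math

FRAC_BITS = 16    # fixed-point fractional bits (illustrative choice)
ITERATIONS = 16   # number of CORDIC micro-rotations (illustrative choice)
ONE = 1 << FRAC_BITS

# Table of arctan(2^-i) in fixed point; in hardware this would be a small ROM.
ANGLES = [round(math.atan(2.0 ** -i) * ONE) for i in range(ITERATIONS)]

# The micro-rotations scale the vector by a constant gain; pre-divide by it once.
_gain = 1.0
for _i in range(ITERATIONS):
    _gain *= math.sqrt(1.0 + 2.0 ** (-2 * _i))
INV_GAIN = round(ONE / _gain)

def cordic_cos_sin(theta):
    """Approximate (cos, sin) of theta in radians, |theta| <= pi/2.

    The loop body uses only additions, subtractions and right shifts.
    """
    x, y = INV_GAIN, 0
    z = round(theta * ONE)           # residual angle still to be rotated
    for i in range(ITERATIONS):
        if z >= 0:                   # rotate counter-clockwise
            x, y, z = x - (y >> i), y + (x >> i), z - ANGLES[i]
        else:                        # rotate clockwise
            x, y, z = x + (y >> i), y - (x >> i), z + ANGLES[i]
    return x / ONE, y / ONE

if __name__ == "__main__":
    # Angles of the form k*pi/16 appear as 8-point DCT rotation angles.
    theta = 3 * math.pi / 16
    print(cordic_cos_sin(theta), (math.cos(theta), math.sin(theta)))
```

With these settings the result agrees with `math.cos` and `math.sin` to roughly three decimal places; more iterations and fractional bits trade hardware for precision.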
6.
7.
8.
9.
10.
11.
New matrix formulation for two-dimensional DCT/IDCT computation and its distributed-memory VLSI implementation  Total citations: 1, self-citations: 0, citations by others: 1
A direct method for the computation of 2-D DCT/IDCT on a linear-array architecture is presented. The 2-D DCT/IDCT is first converted into its corresponding 1-D DCT/IDCT problem through proper input/output index reordering. Then, a new coefficient matrix factorisation is derived, leading to a cascade of several basic computation blocks. Unlike other previously proposed high-speed 2-D N × N DCT/IDCT processors that usually require intermediate transpose memory and have computation complexity O(N^3), the proposed hardware-efficient architecture with distributed memory structure has computation complexity O(N^2 log_2 N) and requires only log_2 N multipliers. The new pipelinable and scalable 2-D DCT/IDCT processor uses storage elements local to the processing elements and thus does not require any address generation hardware or global memory-to-array routing.
12.
Shih-Chang Hsia, Chin-Feng Tsai, Szu-Hong Wang, King-Chu Hung 《Journal of Signal Processing Systems》2010,58(2):161-172
This paper presents a cost-effective 2D-DCT processor based on a fast row/column decomposition approach. With a particular
schedule, the processor does not require transpose memory for 2D-DCT computing. We re-arrange the cosine coefficients
of the first and second 1D-DCT transformations to keep the DC coefficient error free. The new architecture uses state machines
to generate cosine coefficients rather than a ROM table, to save the memory cells and the address generator. For 8 × 8 DCT realization,
the circuit only needs 36 adders without multipliers, and the whole chip uses about 19 k transistors. The chip area is about
4 mm² using the TSMC 0.35 μm CMOS process. The circuit complexity is only 1/3 to 1/5 that of conventional DCT chips.
13.
The algorithm and architecture of a 2-D systolic array processor for the DCT (discrete cosine transform) are proposed. It is based on the relationship between DCT and cosine DFT and sine DFT. Two systolic architectures of 1-D DCT data and control flow computation are discussed. By use of the main feature of the two systolic 1-D arrays for DCT, a full 2-D systolic DCT array is presented.
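One standard way to relate the DCT to cosine and sine DFT sums follows from the angle-addition identity; the abstract does not state which exact relationship the authors use, so the sketch below is an assumption offered only as an illustration. Writing cos(π(2n+1)k/2N) = cos(2πnk/2N)·cos(πk/2N) − sin(2πnk/2N)·sin(πk/2N) gives the unnormalized DCT-II as a combination of a cosine sum and a sine sum over a length-2N frequency grid.

```python
import math

def dct_direct(x):
    """Unnormalized DCT-II straight from its definition."""
    n = len(x)
    return [sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for i in range(n))
            for k in range(n)]

def dct_via_cos_sin_dft(x):
    """Unnormalized DCT-II built from a cosine DFT sum and a sine DFT sum.

    Both sums are evaluated on a length-2N frequency grid and then combined
    with one rotation by the angle pi*k/(2N) per output coefficient.
    """
    n = len(x)
    out = []
    for k in range(n):
        cos_dft = sum(x[i] * math.cos(2 * math.pi * i * k / (2 * n))
                      for i in range(n))
        sin_dft = sum(x[i] * math.sin(2 * math.pi * i * k / (2 * n))
                      for i in range(n))
        phi = math.pi * k / (2 * n)
        out.append(math.cos(phi) * cos_dft - math.sin(phi) * sin_dft)
    return out

if __name__ == "__main__":
    samples = [3.0, -1.0, 4.0, 1.0, -5.0, 9.0, 2.0, -6.0]  # illustrative data
    a = dct_direct(samples)
    b = dct_via_cos_sin_dft(samples)
    print(max(abs(p - q) for p, q in zip(a, b)))
```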
14.
This paper presents a 2-D DCT/IDCT processor chip for high data rate image processing and video coding. It uses a fully pipelined
row–column decomposition method based on two 1-D DCT processors and a transpose buffer based on D-type flip-flops with a double
serial input/output data-flow. The proposed architecture allows the main processing elements and arithmetic units to operate
in parallel at half the frequency of the data input rate. The main characteristics are: high throughput, parallel processing,
reduced internal storage, and maximum efficiency in computational elements. The processor has been implemented using standard
cell design methodology in 0.35 μm CMOS technology. It measures 6.25 mm² (the core is 3 mm²) and contains a total of 11.7 k gates. The maximum frequency is 300 MHz with a latency of 172 cycles for 2-D DCT and 178
cycles for 2-D IDCT. The computing time of a block is close to 580 ns. It has been designed to meet the demands of IEEE Std.
1180-1990 used in different video codecs. The good performance in computing speed and hardware cost indicates that this
processor is suitable for HDTV applications.
This work was supported by the Spanish Ministry of Science and Technology (TIC2000-1289).
15.
Kawahito S., Yoshida M., Sasaki M., Umehara K., Miyazaki D., Tadokoro Y., Murata K., Doushou S., Matsuzawa A. 《Solid-State Circuits, IEEE Journal of》1997,32(12):2030-2041
This paper presents a CMOS image sensor with on-chip compression using an analog two-dimensional discrete cosine transform (2-D DCT) processor and a variable quantization level analog-to-digital converter (ADC). The analog 2-D DCT processor is essentially suitable for the on-sensor image compression, since the analog image sensor signal can be directly processed. The small and low-power nature of the analog design allows us to achieve low-power, low-cost, one-chip digital video cameras. The 8×8-point analog 2-D DCT processor is designed with fully differential switched-capacitor circuits to obtain sufficient precision for video compression purposes. An imager array has a dedicated eight-channel parallel readout scheme for direct encoding with the analog 2-D DCT processor. The variable level quantization after the 2-D DCT can be performed by the ADC at the same time. A prototype CMOS image sensor integrating these core circuits for compression is implemented based on triple-metal double-polysilicon 0.35-μm CMOS technology. Image encoding using the implemented analog 2-D DCT processor to the image captured by the sensor is successfully performed. The maximum peak signal-to-noise ratio (PSNR) is 36.7 dB.
16.
17.
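For multiband orthogonal frequency-division multiplexing ultra-wideband (MB-OFDM UWB) systems, a high-throughput, mixed-wordlength, mixed-radix 128-point IFFT/FFT processor architecture with four parallel data paths is proposed. The processor uses a modified Booth fixed-width multiplier with error compensation and CSD constant multipliers, which effectively improves precision and reduces hardware complexity. Analysis shows that the proposed scheme uses 49% fewer multiplier resources than the mixed-radix multipath delay feedback (MRMDF) architecture and, at comparable hardware cost, needs 30% less memory and delivers 33% higher throughput than a dual-parallel-data-path architecture, giving the processor the best trade-off among precision, hardware cost, and speed. In a 0.18 μm CMOS process, the processor reaches a maximum operating frequency of 300 MHz and a throughput of 1.2 Gsamples/s, which meets the requirements of gigabit wireless personal area networks (WPAN).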
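The CSD (canonical signed digit) constant multipliers mentioned above rest on a simple recoding idea: a constant is rewritten with digits from {-1, 0, +1} so that no two adjacent digits are nonzero, and multiplication then reduces to a few shifted additions and subtractions. The sketch below is a generic illustration under that definition; the function names and the example constant are hypothetical and not taken from the paper.

```python
def to_csd(value):
    """Return the CSD digits of a non-negative integer, least significant first.

    Each digit is -1, 0 or +1, and no two adjacent digits are both nonzero.
    """
    digits = []
    while value != 0:
        if value & 1:
            digit = 2 - (value & 3)   # +1 when value % 4 == 1, -1 when value % 4 == 3
            value -= digit            # clears the low bit, possibly carrying upward
        else:
            digit = 0
        digits.append(digit)
        value >>= 1
    return digits

def csd_multiply(x, digits):
    """Multiply x by the constant encoded in `digits` using shifts, adds and subtracts."""
    acc = 0
    for shift, digit in enumerate(digits):
        if digit == 1:
            acc += x << shift
        elif digit == -1:
            acc -= x << shift
    return acc

if __name__ == "__main__":
    constant = 239                    # binary 11101111 has seven 1 bits
    digits = to_csd(constant)         # encodes 256 - 16 - 1, three nonzero digits
    print(digits, csd_multiply(13, digits), 13 * constant)
```

Fewer nonzero digits means fewer adders in a hardwired constant multiplier, which is the usual motivation for CSD recoding of fixed coefficients.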