Found 20 similar articles (search time: 46 ms)
1.
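A low-power 2-D DCT/IDCT processor is designed. To reduce power consumption, the design uses a row-column decomposition structure, adopts Loeffler's fast DCT/IDCT algorithm, and applies zero-input bypassing, clock gating, and truncation, lowering system power while still meeting the design requirements. The constant-coefficient multiplier is an important component of the processor. Based on a parallel multiplier structure, the paper designs a new low-power constant-coefficient multiplier that uses CSD encoding and the Wallace tree multiplication algorithm, combined with truncation and variable-correction optimizations, which considerably improves the overall performance of the 2-D DCT/IDCT processor. The design runs at a clock frequency of 100 MHz, which is sufficient for real-time MPEG2 MP@HL decoding. Synthesized in an SMIC 0.18 μm process, the 2-D DCT/IDCT processor occupies an area of 341 212 μm² and consumes 14.971 mW. Analysis and comparison with other 2-D DCT/IDCT processor designs show that it achieves lower power consumption while supporting real-time MPEG2 MP@HL decoding.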
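The CSD (canonical signed-digit) encoding mentioned in the abstract above can be illustrated with a short sketch. This is a generic textbook recoding, not the multiplier from the paper; the function names `to_csd` and `csd_multiply` are hypothetical, and the constant 181 is only an assumed example (roughly cos(π/4) scaled by 2^8).

```python
def to_csd(n):
    """Return the canonical signed-digit form of a non-negative integer.

    Digits are listed least-significant first and take values in {-1, 0, 1};
    no two adjacent digits are both non-zero.
    """
    digits = []
    while n != 0:
        if n & 1:
            # Pick +1 when n % 4 == 1 and -1 when n % 4 == 3, so that the
            # remaining value is divisible by 4 and the next digit is zero.
            d = 2 - (n & 3)
            n -= d
        else:
            d = 0
        digits.append(d)
        n >>= 1
    return digits


def csd_multiply(x, digits):
    """Multiply x by the constant encoded in `digits` using only shifts,
    additions, and subtractions (one operation per non-zero digit)."""
    acc = 0
    for shift, d in enumerate(digits):
        if d == 1:
            acc += x << shift
        elif d == -1:
            acc -= x << shift
    return acc


if __name__ == "__main__":
    coeff = 181  # assumed example constant, about cos(pi/4) * 2**8
    digits = to_csd(coeff)
    print(digits)
    print(csd_multiply(5, digits), 5 * coeff)
```

In hardware, each non-zero digit corresponds to one adder or subtractor, which is why a recoding with few non-zero digits is attractive for constant-coefficient multipliers.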
2.
This paper proposes a high-performance, low-cost inverse discrete cosine transform (IDCT) processor for high-definition television (HDTV) applications using cyclic convolution and hardwired multipliers. By properly arranging the input sequence, we formulate the one-dimensional (1-D) IDCT as a cyclic convolution that is regular and suitable for VLSI implementation. The hardwired multipliers that implement multiplication by the IDCT coefficients are first scaled and then optimized using common sub-expression techniques. Based on these techniques, the data path of the proposed two-dimensional (2-D) IDCT design costs 7504 gates plus 1024 bits of memory with a throughput of 100 M pixels/sec, according to a cost estimate based on the cell library of the COMPASS 0.6 μm SPDM CMOS technology. We have also verified that the precision of the proposed 2-D 8 × 8 IDCT meets the requirements of IEEE Std. 1180-1990. Owing to its computing speed and low hardware cost, the proposed design is compact and suitable for HDTV applications. The design methodology can be applied to the forward DCT as well as to other transforms such as the discrete sine transform (DST), discrete Fourier transform (DFT), and discrete Hartley transform (DHT).
3.
New matrix formulation for two-dimensional DCT/IDCT computation and its distributed-memory VLSI implementation   Cited by: 1 (self-citations: 0, citations by others: 1)
A direct method for the computation of 2-D DCT/IDCT on a linear-array architecture is presented. The 2-D DCT/IDCT is first converted into its corresponding 1-D DCT/IDCT problem through proper input/output index reordering. Then, a new coefficient matrix factorisation is derived, leading to a cascade of several basic computation blocks. Unlike other previously proposed high-speed 2-D N × N DCT/IDCT processors that usually require intermediate transpose memory and have computation complexity O(N³), the proposed hardware-efficient architecture with distributed memory structure has computation complexity O(N² log₂ N) and requires only log₂ N multipliers. The new pipelinable and scalable 2-D DCT/IDCT processor uses storage elements local to the processing elements and thus does not require any address generation hardware or global memory-to-array routing.
4.
5.
Several parallel, pipelined and folded architectures with different throughput rates are presented for computation of the DCT, one of the fundamental operations in image/video coding. This paper begins with a new decomposition algorithm for the 1-D DCT coefficient matrix. Then the 2-D DCT problem is converted into the corresponding 1-D counterpart through a regular index mapping technique. Afterward, depending on the trade-off between hardware complexity and speed performance, the derived decomposition algorithm is transformed into different parallel-pipelined and folded architectures that realize the butterfly operations and the post-processing operations. Compared with other DCT processors, our proposed parallel-pipelined architectures, without any intermediate transpose memory, have the features of modularity, regularity, locality, scalability, and pipelinability, with arithmetic hardware cost proportional to the logarithm of the transform length.
6.
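A feature extraction algorithm, 2DDM, based on the discrete cosine transform (DCT) and the two-dimensional maximum margin criterion (2DMMC) is proposed. It is proved that 2DMMC can be applied directly in the DCT domain and that classification with the Euclidean distance measure gives exactly the same results as in the spatial domain. The 2DMMC method can be applied directly to DCT-compressed images in JPEG format. Experiments on the ORL and Yale face databases show that, in the spatial domain, 2DMMC achieves a higher recognition rate than 2DPCA and 2DLDA, that 2DDM achieves a higher recognition rate than 2DMMC, and that 2DDM takes less time than 2DMMC.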
7.
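The discrete cosine transform (DCT) is an important mathematical tool in digital image processing and many other fields. This paper computes the DCT with a new Fourier analysis technique, the arithmetic Fourier transform (AFT). The AFT for even functions is improved. The improved AFT algorithm not only halves the number of sample points the AFT requires, and thus halves the number of additions, but more importantly establishes a direct connection between the AFT and the DCT, providing an AFT algorithm suited to computing the DCT. The paper derives the algorithm for computing the DCT with the improved AFT and analyzes it briefly. The algorithm needs only O(N) multiplications, has uniform formulas and a simple structure, is easy to parallelize, and is suitable for VLSI design, opening a new route to fast DCT computation.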
8.
9.
Mihai Sima Sorin Cotofaná Jos T. J. Van Eijndhoven Stamatis Vassiliadis Kees Vissers 《The Journal of VLSI Signal Processing》2005,39(3):195-212
This paper presents a TriMedia processor extended with an IDCT reconfigurable design, and assesses the performance gain such an extension has when performing MPEG-2 decoding. We first propose the skeleton of an extension of the TriMedia architecture, which consists of a Field-Programmable Gate Array (FPGA)-based Reconfigurable Functional Unit (RFU), a Configuration Unit managing the reconfiguration of the RFU, and their associated instructions. Then, we address the computation of the 8 × 8 (2-D) IDCT on such extended TriMedia and propose a scheme to implement the 1-D IDCT operation on the RFU. When mapped on an ACEX EP1K100 FPGA from Altera, the proposed 1-D IDCT exhibits a latency of 16 and a recovery of 2 TriMedia@200 MHz cycles, and occupies 45% of the logic cells of the device. By configuring the 1-D IDCT on the RFU at application launch-time, the IEEE-compliant 2-D IDCT can be computed with the throughput of 1/32 IDCT/cycle. This figure translates to an improvement over the standard TriMedia of more than 40% in terms of computing time when 2-D IDCT is carried out in the framework of MPEG-2 decoding. Finally, the proposed reconfigurable IDCT is compared to a number of existing designs.
Mihai Sima was born in Bucharest, Romania. He received the MS degree in Electrical Engineering from Politehnica University of Bucharest, and the Ph.D. degree in Electrical Engineering from Delft University of Technology, The Netherlands. He had been with the Microelectronics Company in Bucharest for 3 years, where he was involved in instrumentation electronics for integrated circuit testing. Subsequently, he joined the Telecommunications Department of Politehnica University of Bucharest, where he had been involved in digital signal processing and speech recognition for 6 years.
More recently, he had been with the Faculty of Electrical Engineering, Mathematics, and Computer Science, Delft University of Technology, where he worked on reconfigurable architectures for the media processing domain. He is currently an assistant professor with the Department of Electrical and Computer Engineering, University of Victoria, B.C., Canada. His research interests include computer architecture, reconfigurable computing, embedded systems, digital signal processing, and speech recognition.
Sorin D. Cotofana was born in Mizil, Romania. He received the MS degree in Computer Science from the Politehnica University of Bucharest, Romania, and the Ph.D. degree in Electrical Engineering from Delft University of Technology, The Netherlands. He had worked with the Research & Development Institute for Electronic Components (ICCE) in Bucharest for a decade, being involved in structured design of digital systems, design rule checking of IC layouts, logic and mixed-mode simulation of electronic circuits, testability analysis, and image processing. He is currently an associate professor with the Faculty of Electrical Engineering, Mathematics, and Computer Science, Delft University of Technology, The Netherlands. His research interests include computer arithmetic, parallel architectures, embedded systems, reconfigurable computing, nano-electronics, neural networks, computational geometry, and computer aided design.
Jos T.J. van Eijndhoven was born in Roosendaal, The Netherlands. He studied Electrical Engineering at the Eindhoven University of Technology, The Netherlands, obtaining the M.Sc. and Ph.D. degrees in 1981 and 1984, respectively, for a work on piecewise linear circuit simulation. Then, he became a senior research member in the design automation group of the Eindhoven University of Technology. In 1986 he spent a sabbatical period at the IBM Thomas J. Watson Research Laboratory, Yorktown Heights, New York, for research on high level synthesis.
In 1998 he joined Philips Research Laboratories in Eindhoven, The Netherlands, to work on the architectural design of programmable multimedia hardware and the associated mapping of media processing applications.
Stamatis Vassiliadis was born in Manolates, Samos, Greece. He is a professor with the Faculty of Electrical Engineering, Mathematics, and Computer Science, Delft University of Technology, The Netherlands. He has also served in the faculties of Cornell University, Ithaca, NY, and the State University of New York (S.U.N.Y.), Binghamton, NY. He had worked for a decade with IBM in the Advanced Workstations and Systems laboratory in Austin TX, the Mid-Hudson Valley Laboratory in Poughkeepsie, NY, and the Glendale Laboratory in Endicott, NY. In IBM he was involved in a number of projects regarding computer design, organizations, and architectures and the leadership of advanced research projects. A number of his design and implementation proposals have been implemented in commercially available systems and processors including the IBM 9370 model 60 computer system, the IBM POWER II, the IBM AS/400 Models 400, 500, and 510, Server Models 40S and 50S, the IBM AS/400 Advanced 36, and the IBM S/390 G4 and G5 computer systems. For his work, he received numerous awards including 23 levels of Publication Achievement Awards, 15 levels of Invention Achievement Awards and an Outstanding Innovation Award for Engineering/Scientific Hardware Design in 1989. In 1990 he was awarded the highest number of USA patents in IBM, six of his 70 USA patents being rated with the highest patent ranking in IBM.
Kees A. Vissers graduated from the Delft University of Technology, receiving his M.Sc. in 1980. He started directly with Philips Research Laboratories in Eindhoven where he was involved in high-level simulation and high-level synthesis.
He had been heading the research on hardware/software co-design and system level design for many years, and made a significant contribution to the TriMedia VLIW processor. From 1987 till 1988 he was a visiting researcher at Carnegie Mellon University, Pittsburgh, Pennsylvania, with the group of Don Thomas. He is currently a Research Fellow with University of California at Berkeley, Department of Electrical Engineering and Computer Sciences. His research interests include video processing, embedded media processing systems, and reconfigurable computing.
10.
11.
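An area-optimized 2-D DCT/IDCT implementation with dynamic precision matching   Cited by: 1 (self-citations: 0, citations by others: 1)
An ASIC implementation of a modified Loeffler algorithm for the 2-D DCT/IDCT recommended by the JPEG standard is presented. The design reuses hardware, sharing the same arithmetic circuit between the forward and inverse transforms, to optimize area. It also pre-checks the coefficients of the input data, which effectively raises processing speed and lowers power consumption for certain inputs. In addition, following the JPEG architecture, dynamic precision matching is established between the DCT and the quantizer, ensuring image quality and power efficiency at different compression ratios. The circuit is used in a JPEG image-processing ASIC for a 1.4-megapixel digital camera and has passed FPGA verification and tape-out testing.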
12.
Kawahito S. Yoshida M. Sasaki M. Umehara K. Miyazaki D. Tadokoro Y. Murata K. Doushou S. Matsuzawa A. 《Solid-State Circuits, IEEE Journal of》1997,32(12):2030-2041
This paper presents a CMOS image sensor with on-chip compression using an analog two-dimensional discrete cosine transform (2-D DCT) processor and a variable quantization level analog-to-digital converter (ADC). The analog 2-D DCT processor is well suited to on-sensor image compression, since the analog image sensor signal can be processed directly. The small and low-power nature of the analog design allows us to achieve low-power, low-cost, one-chip digital video cameras. The 8×8-point analog 2-D DCT processor is designed with fully differential switched-capacitor circuits to obtain sufficient precision for video compression purposes. An imager array has a dedicated eight-channel parallel readout scheme for direct encoding with the analog 2-D DCT processor. The variable level quantization after the 2-D DCT can be performed by the ADC at the same time. A prototype CMOS image sensor integrating these core circuits for compression is implemented in a triple-metal double-polysilicon 0.35-μm CMOS technology. Images captured by the sensor are successfully encoded using the implemented analog 2-D DCT processor. The maximum peak signal-to-noise ratio (PSNR) is 36.7 dB.
13.
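A new FPGA implementation structure for the 2-D DCT and IDCT is proposed. A fast row-column algorithm decomposes the 2-D transform into two 1-D transforms, each implemented as a parallel pipeline that processes 8 data items per clock cycle, which greatly increases the data throughput and computation speed of the circuit. Simulation with the ModelSim tool confirms the functional correctness of the algorithm. One 8×8 block 2-D DCT takes only 16 clock cycles, which meets the real-time requirements of image and video processing.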
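The row-column decomposition described in the abstract above can be sketched in software as follows. This is a plain reference model using the direct orthonormal DCT-II formula, not the pipelined FPGA structure from the paper; the function names are hypothetical. An 8×8 block needs 8 row transforms plus 8 column transforms, 16 one-dimensional transforms in total.

```python
import math


def dct_1d(v):
    """Orthonormal DCT-II of a sequence (direct O(N^2) form, for clarity)."""
    n = len(v)
    out = []
    for k in range(n):
        s = sum(v[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def transpose(m):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(col) for col in zip(*m)]


def dct_2d(block):
    """2-D DCT computed as 1-D DCTs over the rows, then over the columns."""
    rows = [dct_1d(r) for r in block]             # first pass: each row
    cols = [dct_1d(c) for c in transpose(rows)]   # second pass: each column
    return transpose(cols)                        # back to row-major order


if __name__ == "__main__":
    flat = [[1.0] * 8 for _ in range(8)]
    coeffs = dct_2d(flat)
    # A constant block puts all of its energy in the DC coefficient.
    print(round(coeffs[0][0], 6))
```

The transpose between the two passes is the software counterpart of the transpose memory that several of the other listed designs try to avoid.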
14.
A new algorithm to compute the DCT and its inverse   Cited by: 2 (self-citations: 0, citations by others: 2)
A novel algorithm to convert the discrete cosine transform (DCT) to skew-circular convolutions is presented. The motivation for developing such an algorithm is the fact that VLSI implementation of distributed arithmetic is very efficient for computing convolutions. It is also shown that the inverse DCT (IDCT) can be computed using the same building blocks which are used for computing the DCT. A DCT/IDCT processor can be designed to compute either the DCT or the IDCT depending on a 1-bit control signal.
15.
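The discrete cosine transform has become a standard technique in image compression. This paper presents the design of a two-dimensional inverse discrete cosine transform (2-D IDCT) system based on the DA (distributed arithmetic) algorithm. During the design, a modified algorithm was proposed in light of practical conditions; with this modified DA algorithm, the whole 2-D IDCT system becomes faster while its area is greatly reduced. The design follows a top-down methodology and is described in VHDL. The whole system was designed and simulated with Synopsys tools and finally synthesized to a gate-level circuit.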
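As a rough illustration of the distributed-arithmetic idea referred to above, the sketch below computes an inner product with a lookup table, shifts, and additions instead of multipliers. It is the basic textbook form of DA, not the modified algorithm of the paper (which the abstract does not describe); the function names, the 8-bit word length, and the integer coefficients are assumptions made for the example.

```python
def build_lut(coeffs):
    """Precompute every possible partial sum of the coefficients.

    Entry `addr` holds the sum of coeffs[k] for each bit k set in `addr`.
    """
    lut = []
    for addr in range(1 << len(coeffs)):
        lut.append(sum(c for k, c in enumerate(coeffs) if (addr >> k) & 1))
    return lut


def da_inner_product(coeffs, xs, bits=8):
    """Compute sum(c * x) for signed `bits`-bit integers xs, bit-serially."""
    lut = build_lut(coeffs)
    acc = 0
    for b in range(bits):
        # Gather bit b of every input word into one LUT address.
        addr = 0
        for k, x in enumerate(xs):
            addr |= ((x >> b) & 1) << k
        term = lut[addr] << b
        # In two's complement the most significant bit has negative weight.
        acc += -term if b == bits - 1 else term
    return acc


if __name__ == "__main__":
    coeffs = [3, -5, 7, 2]        # assumed example coefficients
    xs = [10, -20, 127, -128]     # signed 8-bit inputs
    print(da_inner_product(coeffs, xs), sum(c * x for c, x in zip(coeffs, xs)))
```

The table grows as 2^K for K coefficients, so practical designs usually split long inner products into several smaller tables.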
16.
To address the limitations of existing digital video watermarking algorithms, this paper proposes a new robust video watermarking algorithm that combines the discrete cosine transform (DCT) and discrete wavelet transform (DWT). First, video frames are randomly selected and the DCT is applied to the selected frames. Then the first column of the selected video frames is scrambled using the Arnold algorithm. Next, every column with 4 direct current (DC) coefficients is reshaped and transformed into four different sub-bands using the DWT, and the watermark is embedded into the approximation (LL) sub-band. The proposed algorithm is easy to carry out because it works on randomly chosen frames and places no special requirements on the video frames. The experimental results indicate that the algorithm can resist different kinds of watermarking attacks, such as the Gaussian filter attack and the sharpening attack, and that it performs better than some other watermarking algorithms.
17.
18.
In this paper, a new algorithm for the fast computation of a 2-D discrete cosine transform (DCT) is presented. It is shown that the N×N DCT, where N = 2^m, can be computed using only N 1-D DCTs and additions, instead of using 2N 1-D DCTs as in the conventional row-column approach. Hence the total number of multiplications for the proposed algorithm is only half of that required for the row-column approach, and is also less than that of most other fast algorithms, while the number of additions is almost comparable to that of others.
19.
Che-Hong Chen Bin-Da Liu Jar-Ferr Yang 《IEEE transactions on circuits and systems. I, Regular papers》2004,51(10):2017-2030
In this paper, new recursive structures for computing radix-r two-dimensional (2-D) discrete cosine transform (DCT) and 2-D inverse DCT (IDCT) are proposed. The 2-D DCT/IDCT are first decomposed into cosine-cosine and sine-sine transforms. Based on indexes of transform bases, the regular pre-addition preprocess is established and the recursive structures for 2-D DCT/IDCT, which can be realized in a second-order infinite-impulse response (IIR) filter, are derived without involving any transposition procedure. For computation of 2-D DCT/IDCT, the proposed structures need fewer recursive loops than one-dimensional DCT/IDCT recursive structures, which require data transposition to achieve the so-called row-column approach. With the advantages of fewer recursive loops and no transposition, the proposed recursive structures achieve more accurate results and lower power consumption than existing methods. The regular and modular properties are suitable for very large-scale integration (VLSI) implementation. By using similar procedures, the recursive structures for 2-D DST and 2-D IDST are also proposed.
20.
The high efficiency video coding (HEVC) transform algorithm for residual coding uses 2-dimensional (2D) 4×4 transforms with higher precision than H.264's 4×4 transforms, resulting in increased hardware complexity. In this paper, we present a shared architecture that can compute the 4×4 forward discrete cosine transform (DCT) and inverse discrete cosine transform (IDCT) of HEVC using a new mapping scheme in the video processor array structure. The architecture is implemented with only adders and shifts, yielding an area-efficient design. The proposed architecture is synthesized using ISE14.7 and implemented using the BEE4 platform with the Virtex-6 FF1759 LX550T field programmable gate array (FPGA). The results show that the video processor array structure achieves a maximum operating frequency of 165.2 MHz. The architecture and its implementation are presented in this paper to demonstrate its programmability and high performance.
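To make the "only adders and shifts" idea concrete, here is a minimal sketch of a 4-point HEVC forward core transform in which each constant multiplication (by 64, 83, and 36) is expanded into shifts and additions. It is not the mapping scheme or the processor array from the paper; the function names are hypothetical, and the intermediate rounding and scaling stages of a full HEVC transform are deliberately left out.

```python
# HEVC 4x4 core transform matrix (integer approximation of the DCT).
HEVC_DCT4 = [
    [64,  64,  64,  64],
    [83,  36, -36, -83],
    [64, -64, -64,  64],
    [36, -83,  83, -36],
]


def mul64(x):
    return x << 6                                # 64 = 2^6


def mul83(x):
    return (x << 6) + (x << 4) + (x << 1) + x    # 83 = 64 + 16 + 2 + 1


def mul36(x):
    return (x << 5) + (x << 2)                   # 36 = 32 + 4


def dct4_shift_add(x):
    """4-point forward transform using a butterfly plus shift-add constants."""
    e0, e1 = x[0] + x[3], x[1] + x[2]            # even part
    o0, o1 = x[0] - x[3], x[1] - x[2]            # odd part
    return [
        mul64(e0 + e1),
        mul83(o0) + mul36(o1),
        mul64(e0 - e1),
        mul36(o0) - mul83(o1),
    ]


def dct4_reference(x):
    """Same transform as a plain matrix-vector product, for checking."""
    return [sum(c * v for c, v in zip(row, x)) for row in HEVC_DCT4]


if __name__ == "__main__":
    sample = [12, -7, 3, 25]
    print(dct4_shift_add(sample) == dct4_reference(sample))
```

A 4×4 block would apply this routine to each row and then to each column, with the standard's scaling shifts inserted between and after the two passes.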