Similar Literature
 18 similar documents found (search time: 171 ms)
1.
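Hong Qi (洪琪), Cao Wei (曹伟), Tong Jiarong (童家榕). Acta Electronica Sinica (《电子学报》), 2011, 39(5): 1059-1063
A new dynamically reconfigurable architecture supporting the 4×4 integer transforms of the MPEG-4 AVC/H.264 standard is proposed. First, two new two-dimensional direct signal flow graphs are derived, one for the 4×4 forward transform and one for the inverse transform. A dynamically reconfigurable multi-transform architecture targeting HDTV applications is then designed. The architecture needs no transpose registers, and its computation unit requires only 16 adders (subtractors). The circuit is implemented in a 0.18 μm CMOS process. Results show that the maximum...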

2.
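In video encoding and decoding, the discrete cosine transform (DCT) is a crucial stage that determines the quality and efficiency of video compression. For the 8×8 two-dimensional DCT, this paper proposes a hardware circuit architecture based on a coarse-grained reconfigurable array (CGRA). Exploiting the reconfigurability of the CGRA, the design supports the 8×8 2-D DCT of multiple video compression coding standards on a single platform. Experimental results show that the architecture processes 8 pixels in parallel per clock cycle, with a peak throughput of 1.157×10^9 pixels/s. Compared with existing architectures, design efficiency and power efficiency improve by up to 4.33 times and 12.3 times, respectively, and the design can decode 4096×2048 video sequences in 4:2:0 format at up to 30 frames/s.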
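The figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is only a sanity check, not part of the cited work; it assumes that "pixel" means one sample entering the transform and that 4:2:0 video carries 1.5 samples per luma pixel.

```python
# Figures quoted in the abstract.
PEAK_THROUGHPUT = 1.157e9   # pixels per second
PIXELS_PER_CYCLE = 8        # pixels processed in parallel each clock

# Assumption: one "pixel" is one sample entering the transform,
# so the clock rate implied by the peak throughput is:
implied_clock_hz = PEAK_THROUGHPUT / PIXELS_PER_CYCLE

# Assumption: 4:2:0 video has 1.5 samples per luma pixel
# (one luma sample plus two chroma planes at quarter resolution).
WIDTH, HEIGHT, FPS = 4096, 2048, 30
samples_per_second = WIDTH * HEIGHT * 3 // 2 * FPS

print(f"implied clock: {implied_clock_hz / 1e6:.1f} MHz")
print(f"required rate: {samples_per_second / 1e6:.1f} Msamples/s")
print(f"peak / required: {PEAK_THROUGHPUT / samples_per_second:.2f}")
```

Under these assumptions the peak transform throughput exceeds the sample rate of the quoted video format, which is consistent with the decoding claim.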

3.
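In January 2013, HEVC (High Efficiency Video Coding) was formally established by ITU-T and ISO/IEC as the new-generation international video coding standard. To achieve higher compression efficiency, HEVC adopts a number of new techniques. For spatial-domain transforms, HEVC supports variable-size IDCTs from 4×4 to 32×32 and selects between the 4×4 IDCT and the 4×4 IDST according to the mode. An HEVC IDCT/IDST transform architecture is therefore proposed. A pipeline-based data-flow scheduling strategy and a coefficient-matrix optimization scheme improve hardware efficiency and interface bandwidth utilization. After synthesis with a 65 nm process library, the 1-D IDCT/IDST unit has an equivalent gate count of about 40K and a maximum operating frequency of 500 MHz; compared with existing designs, it reduces hardware resources by more than 30% and improves throughput efficiency by more than 60%. Simulation results show that the architecture can perform IDCT/IDST processing for 4k×2k@30 f/s video.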

4.
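A Fast and Efficient Hardware Implementation of the Two-Dimensional One-Level Wavelet Transform   Total citations: 2, self-citations: 1, citations by others: 1
A hardware platform for the two-dimensional one-level wavelet transform with the 9/7 wavelet filter is proposed. The overall architecture is pipelined and data are input in groups. The column transform uses multiple wavelet transform units, the row transform module is a reconfigurable hardware structure, and no on-chip memory is needed between the row and column transforms. Compared with existing architectures, the design achieves higher processing speed with fewer hardware resources.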

5.
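An FPGA implementation of a two-dimensional discrete wavelet transform (2D-DWT) unit for the JPEG2000 still-image coding system is presented. The characteristics of the 2D-DWT algorithm are analyzed, and a high-speed algorithm that performs the 2-D wavelet transform directly is proposed, which avoids the frequent memory accesses of the traditional 2-D wavelet transform algorithm. The hardware architecture also has high parallelism and throughput; pipelining further improves system performance, so that 4 wavelet coefficients are output per clock cycle. For an N×N image, the architecture needs only (N/2)^2 clock cycles. The design has been verified on an FPGA and can be used in real-time image compression systems.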

6.
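In H.264 video coding, the two-dimensional discrete cosine transform (DCT) converts the signal from the time domain to the frequency domain. An efficient pipelined H.264 DCT hardware circuit is designed on a field-programmable gate array (FPGA). First, the 2-D 4×4 DCT is converted into two successive 1-D DCTs; second, a two-port RAM is placed between the two DCT passes to transpose the data; finally, a finite-state machine at the top level controls the whole flow. The design achieves good functionality with few resources and yields reliable experimental results.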
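The row-column decomposition described in this abstract can be sketched in software as follows. This is a minimal illustration, not the cited FPGA design: it uses the 4×4 forward core transform matrix of H.264, the function names are hypothetical, and a list transpose stands in for the two-port transpose RAM.

```python
# H.264 4x4 forward core transform matrix (scaling is left to quantisation).
CF = [
    [1,  1,  1,  1],
    [2,  1, -1, -2],
    [1, -1, -1,  1],
    [1, -2,  2, -1],
]

def transform_1d(v):
    """1-D core transform of four samples using only adds and shifts."""
    s0, s1 = v[0] + v[3], v[1] + v[2]
    d0, d1 = v[0] - v[3], v[1] - v[2]
    return [s0 + s1, (d0 << 1) + d1, s0 - s1, d0 - (d1 << 1)]

def transpose(block):
    """Software stand-in for the transpose RAM between the two passes."""
    return [list(col) for col in zip(*block)]

def transform_2d(block):
    """1-D pass on rows, transpose, 1-D pass again, transpose back."""
    first_pass = [transform_1d(row) for row in block]
    second_pass = [transform_1d(row) for row in transpose(first_pass)]
    return transpose(second_pass)

def matmul(a, b):
    """Plain 4x4 matrix product, used only as a reference."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

# Arbitrary residual block for demonstration.
X = [[5, 11, 8, 10], [9, 8, 4, 12], [1, 10, 11, 4], [19, 6, 15, 7]]
Y = transform_2d(X)
```

The two-pass result equals the matrix form CF · X · CFᵀ, which is why a single 1-D unit plus a transpose buffer is enough to compute the 2-D transform.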

7.
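A low-power, small-area JPEG image compression chip is designed and implemented. The chip uses 4×4 blocks, and the 1-D DCT of each 4×4 block requires only one multiplication. The intermediate transpose structure of the 2-D DCT is implemented in a novel way that, compared with the conventional implementation, reduces delay by 37.5% and area by 51%. The circuit was taped out in the UMC18 process, with a chip area of 0.46 mm² and a power consumption of 0.9 mW. Test results show that the chip achieves a high compression ratio (greater than 80%) together with good image quality (PSNR greater than 30 dB).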

8.
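Lu Xiaofeng (陆晓凤), Liu Feng (刘锋), Tong Dong (佟冬), Wang Keyi (王克义). Acta Electronica Sinica (《电子学报》), 2011, 39(5): 1072-1076
For all the transforms added in H.264 Fidelity Range Extensions (FRExt, High Profile) decoding, this paper applies two-dimensional matrix decomposition and common-factor extraction based on matrix operations, and uses general-purpose arithmetic units to design an efficient reconfigurable VLSI architecture. The architecture not only saves area (the reconfigurable transform structure consumes only 4807 gates) but also delivers high performance (using TSM...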

9.
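A reconfigurable, high-efficiency multi-band power amplifier is realized using distributed PIN switches and a concurrent dual-band impedance transformation network. Compared with other reconfigurable amplifiers, it reduces the design complexity of the output matching circuit and the influence of the switches on the matching circuit, saves spectrum resources, and has a simple circuit structure. Dual radial open-circuit microstrip stubs broaden the bandwidth of the high-input-impedance bias circuit. Transistor parasitics are taken into account in the matching circuit design. Simulation results show that the amplifier has high output efficiency and good gain flatness, which verifies the feasibility of the approach.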

10.
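The H.264 integer DCT algorithm helps reduce computational complexity, raise coding speed, and further improve the compression efficiency of video or images. The fast algorithm for the H.264 integer DCT and its implementation principle are analyzed, and a structure that implements the DCT of a 4×4 block is proposed; VHDL source code for the internal modules of the 4×4 block DCT and simulation waveforms are also given. Simulation results show that the algorithm quickly computes the integer DCT of a 4×4 block. The proposed structure is practical and can be implemented entirely in hardware. The work is a practical attempt at an FPGA implementation of the H.264 integer DCT and is of practical value for understanding the transform and the concrete implementation of its algorithm.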

11.
The High Efficiency Video Coding (HEVC) transform algorithm for residual coding uses two-dimensional (2D) 4×4 transforms with higher precision than H.264's 4×4 transforms, resulting in increased hardware complexity. In this paper, we present a shared architecture that computes the 4×4 forward discrete cosine transform (DCT) and inverse discrete cosine transform (IDCT) of HEVC using a new mapping scheme in the video processor array structure. The architecture is implemented with only adders and shifters, giving an area-efficient design. It is synthesized using ISE14.7 and implemented on the BEE4 platform with the Virtex-6 FF1759 LX550T field-programmable gate array (FPGA). The results show that the video processor array structure achieves a maximum operating frequency of 165.2 MHz. The architecture and its implementation are presented to demonstrate its programmability and high performance.

12.
The video compression performance of High Efficiency Video Coding (HEVC) is about twice that of the H.264/AVC video compression standard. This improvement in coding efficiency comes at the cost of a considerable increase in computational load compared with H.264/AVC, which is itself very computationally intensive. One of the units in HEVC that has changed considerably relative to H.264/AVC is the Integer Discrete Cosine Transform (IDCT) unit. The IDCT in the HEVC standard includes 32 × 32, 16 × 16, 8 × 8 and 4 × 4 transforms. In this paper, a hardware solution for implementing all of the inverse IDCTs in an HEVC decoder is proposed. The proposed hardware has a resource-sharing pipelined architecture. As a result, the hardware resources and computation time needed for the inverse IDCTs in an HEVC decoder are reduced. Synthesis results using the NanGate OpenPDK 45 nm library indicate that the proposed hardware can achieve a 222 MHz clock rate and real-time decoding of 4096 × 3072 video sequences at 70 fps.

13.
In this paper, fast one-dimensional (1-D) algorithms and their hardware-sharing designs are proposed for the 1-D 2 × 2, 4 × 4, and 8 × 8 inverse transforms of H.264/AVC and the 1-D 8 × 8 inverse transform of AVS, with low hardware cost, especially for multiple decoding applications in China. The proposed 1-D hardware-sharing architecture is realized by adding offset computations and is implemented as a pipeline. Thus, the hardware cost of the proposed sharing architecture is smaller than that of individual, separate designs. With regular modularity, the proposed sharing architecture is suitable for VLSI implementation of H.264/AVC and AVS signal processing.

14.
A Highly Parallel Joint VLSI Architecture for Transforms in H.264/AVC   Total citations: 1, self-citations: 0, citations by others: 1
In H.264/AVC, adapting the transform size to the block size of the motion-compensated prediction residue has proven to be an important coding tool. This paper presents a highly parallel joint circuit architecture for the 8 × 8 and 4 × 4 adaptive block-size transforms in H.264/AVC. By decomposing the 8 × 8 transform into basic 4 × 4 transforms, a unified architecture is designed for both the 8 × 8 and 4 × 4 transforms, and the transform data-path can be efficiently reused for six kinds of transforms, i.e., the 8 × 8 forward, 8 × 8 inverse, 4 × 4 forward, 4 × 4 inverse, forward Hadamard, and inverse Hadamard transforms. Linear shift mapping is applied to the memory buffer to support parallel access in both row and column directions, which eliminates the need for a transpose circuit. For the reusable and configurable transform data-path, a multiple-stage pipeline is designed to reduce the critical path length and increase throughput. The design is implemented in UMC 0.18 μm technology at 200 MHz with 13.651 K logic gates and can support a 1,920 × 1,088, 30 fps H.264/AVC HDTV decoder.
Yu Li
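The abstract above does not spell out its linear shift mapping, so the sketch below shows one common skewed-storage scheme as an assumption: element (row, col) of an N×N block is placed in bank (row + col) mod N at address row. All names are hypothetical. With this placement, any full row and any full column touches every bank exactly once, so both can be read in parallel without a transpose circuit.

```python
N = 4  # block size and number of memory banks (assumed equal)

def bank_of(row, col):
    """Skewed mapping: each row is rotated by its row index across the banks."""
    return (row + col) % N

def address_of(row, col):
    """Within its bank, an element is stored at the row index."""
    return row

def banks_for_row(row):
    """Banks touched when a whole row is read in one cycle."""
    return [bank_of(row, c) for c in range(N)]

def banks_for_column(col):
    """Banks touched when a whole column is read in one cycle."""
    return [bank_of(r, col) for r in range(N)]

def conflict_free(banks):
    """A parallel access is conflict-free when every bank is hit exactly once."""
    return sorted(banks) == list(range(N))

for i in range(N):
    print(i, banks_for_row(i), banks_for_column(i))
```

A plain row-major layout (bank = col) would pass the row check but fail the column check, since every element of a column would land in the same bank.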

15.
Two-dimensional discrete cosine transforms are used as the core transforms in all profiles of the H.264/Advanced Video Coding (AVC) standard. This paper presents a resource-sharing implementation of high-throughput 4 × 4 and 8 × 8 forward and inverse integer transforms for high-definition H.264. It is shown that the 4 × 4 forward/inverse transform can be obtained from the 8 × 8 forward/inverse transform using selective data input and data arrangement at intermediate stages. The fast 8 × 8 forward and inverse transforms are implemented using matrix decomposition and matrix operations such as the Kronecker product and direct sum. The proposed implementation does not require any transpose memory and has a dual-clocked pipeline structure. Compared with existing designs, the gate count is reduced by 27.7%. The maximum operating frequency of the proposed system is approximately 1.3 GHz, while the throughput is 7 Gpixels/s and 18.7 Gpixels/s for the 4 × 4 and 8 × 8 forward integer transforms, respectively. The proposed design can be used for real-time H.264/AVC high-definition processing owing to its high throughput and low hardware cost.

16.
In this article, we present the implementation of a high-throughput two-dimensional (2-D) 8 × 8 forward and inverse integer DCT for H.264. Using matrix decomposition and matrix operations, such as the Kronecker product and direct sum, the forward and inverse integer transforms can be represented using simple addition operations. The dual-clocked pipelined structure of the proposed implementation uses non-floating-point adders and does not require any transpose memory. Hardware synthesis shows that the maximum operating frequency of the proposed pipelined architecture is 1.31 GHz, which achieves a throughput of 21.05 Gpixels/s at a hardware cost of 42932 gates. High throughput and low hardware cost make the proposed design useful for real-time H.264/AVC high-definition processing.

17.
In this paper, a novel two-dimensional (2-D) fast algorithm for realizing the 4 × 4 forward integer transform in H.264 is proposed. Based on matrix operations with the Kronecker product and direct sum, an efficient fast 2-D 4 × 4 forward integer transform can be derived from the proposed one-dimensional fast 4 × 4 forward integer transform through matrix decompositions. The proposed fast 2-D 4 × 4 forward integer transform design does not need transpose memory in a direct parallel pipelined architecture, and it has lower latency than state-of-the-art methods. With regular modularity, the proposed fast algorithm is suitable for VLSI implementation to achieve real-time H.264/advanced video coding (AVC) signal processing.
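The Kronecker-product view used in this abstract rests on a standard identity: for a row-major flattening, Y = C · X · Cᵀ is equivalent to vec(Y) = (C ⊗ C) · vec(X). The sketch below only checks that identity for the H.264 4×4 forward core matrix; it does not reproduce the paper's decomposition or its direct-sum factorization, and the helper names are hypothetical.

```python
# H.264 4x4 forward core transform matrix.
CF = [
    [1,  1,  1,  1],
    [2,  1, -1, -2],
    [1, -1, -1,  1],
    [1, -2,  2, -1],
]

def kron(a, b):
    """Kronecker product of two square matrices given as lists of lists."""
    n, m = len(a), len(b)
    return [[a[i // m][k // m] * b[i % m][k % m] for k in range(n * m)]
            for i in range(n * m)]

def flatten(block):
    """Row-major vectorisation of a 4x4 block."""
    return [x for row in block for x in row]

def transform_2d_direct(block):
    """Single 16x16 matrix-vector product: no intermediate transpose."""
    big = kron(CF, CF)
    x = flatten(block)
    y = [sum(big[i][k] * x[k] for k in range(16)) for i in range(16)]
    return [y[r * 4:(r + 1) * 4] for r in range(4)]

def transform_2d_separable(block):
    """Reference: Y = CF * X * CF^T as two matrix products."""
    tmp = [[sum(CF[i][k] * block[k][j] for k in range(4)) for j in range(4)]
           for i in range(4)]
    return [[sum(tmp[i][k] * CF[j][k] for k in range(4)) for j in range(4)]
            for i in range(4)]

# Arbitrary residual block for demonstration.
X = [[5, 11, 8, 10], [9, 8, 4, 12], [1, 10, 11, 4], [19, 6, 15, 7]]
assert transform_2d_direct(X) == transform_2d_separable(X)
```

Because the direct form maps all 16 inputs to all 16 outputs in one step, a hardware realization of it needs no transpose memory; a practical design would factor the 16×16 matrix into sparse add/shift stages instead of multiplying it out.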

18.
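A multi-standard VLSI architecture for macroblock prediction and boundary filtering strength calculation is proposed, supporting H.264 High Profile 4.1 and AVS JiZhun Profile 6.0. Based on the algorithmic characteristics of the decoder, the architecture implements the control-dominated intra mode prediction, inter motion vector prediction, and boundary filtering strength calculation algorithms of the H.264 and AVS standards, and can be used in current reconfigurable multimedia systems. Synthesized in a TSMC 65 nm process, the implementation reaches an operating frequency of 312 MHz. Decoding one H.264 or AVS macroblock takes at most 351 and 189 clock cycles, respectively, which meets the requirements of real-time high-definition (1080p) processing for both standards.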
