Similar literature
 15 similar documents found; search time: 15 ms
1.
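An efficient parallel H.264 encoding algorithm   Total citations: 3, self-citations: 1, citations by others: 3
Sun Shuwei (孙书为), Chen Shuming (陈书明). 《电子学报》, 2009, 37(2): 357-361
 CABAC is the entropy coding scheme used in the Main profile of the H.264/AVC video compression standard. Combined with RDO mode selection it can reduce the bit rate by 20%, but it also greatly increases the computational complexity of the encoder. Parallelizing the algorithm is an effective way to speed up encoding; however, because CABAC is adaptive and RDO mode selection relies on entropy coding, sequentially coded macroblocks have strict data dependencies, which limits the development of parallel encoding algorithms. This paper combines MBRP, a data-level parallel encoding mechanism based on macroblock region partitioning, with rate estimation to provide an efficient parallel encoding scheme for H.264 encoders that use CABAC: the H.264 encoding algorithm is split into mode selection and bitstream generation, which form a typical producer-consumer relationship; CABAC in RDO mode selection is replaced by rate estimation, which removes the strict data dependencies that CABAC introduces into mode selection; MBRP parallelism is applied to the mode selection part; and bitstream generation is handled by a separate processor, pipelined with mode selection. Experiments on a 4-processor system simulator show that the parallel algorithm reaches a speedup of 4.7 while keeping video compression performance almost unchanged.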
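The producer-consumer split described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `decide_mode`, `bitstream_writer`, `encode_frame`, the placeholder rate estimate, and the use of Python threads in place of separate processors are all assumptions made for the example.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor


def decide_mode(mb_index):
    """Hypothetical mode decision that uses a cheap rate estimate instead of CABAC."""
    estimated_bits = 8 + (mb_index % 3)  # placeholder estimate, not a real rate model
    return mb_index, f"mode{mb_index % 4}", estimated_bits


def bitstream_writer(decisions, out):
    """Consumer: bitstream generation stays sequential, so one thread drains the queue."""
    while True:
        item = decisions.get()
        if item is None:  # sentinel: no more macroblocks
            break
        out.append(item)  # a real encoder would run entropy coding here


def encode_frame(num_macroblocks, workers=3):
    decisions = queue.Queue()
    out = []
    consumer = threading.Thread(target=bitstream_writer, args=(decisions, out))
    consumer.start()
    # Producers: mode decisions run in parallel; pool.map yields results in input order,
    # so the consumer receives macroblocks in coding order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for result in pool.map(decide_mode, range(num_macroblocks)):
            decisions.put(result)
    decisions.put(None)
    consumer.join()
    return out
```

Calling `encode_frame(8)` returns one `(index, mode, estimated_bits)` tuple per macroblock in coding order, while the consumer thread overlaps with the mode decisions still in flight.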

2.
The hardware implementation of intra prediction described in this paper allows the H.264/AVC encoder to achieve optimal compression efficiency in real-time conditions. The architecture has some features that distinguish it from other solutions described in the literature. Firstly, the architecture supports all intra prediction modes defined in the High Profile of the H.264/AVC standard for all chroma formats. Secondly, the architecture can generate predictions for several quantization parameters. Thirdly, the hardware cost is reduced as the same resources are used to compute prediction samples for all the modes. Fourthly, the high sample-generation rate enables the encoder to achieve high throughputs. Fifthly, 4×4 block reordering and interleaving with other modes minimize the impact of the long-delay reconstruction loop on the encoder throughput. The architecture is verified against the JM.12 reference model and within the real-time FPGA hardware encoder. The synthesis results show that the design can operate at 100 MHz and 200 MHz for FPGA Arria II and 0.13 μm TSMC technology, respectively. These frequencies allow the encoder to support 720p and 1080p video at 30 fps.

3.
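The H.264 video coding standard introduces quarter-pixel-accuracy interpolation, which greatly improves compression efficiency but also increases computational complexity and memory bandwidth. To address these problems from the motion estimation side, a one-step interpolation method and data reuse are adopted, reducing bandwidth by 26% and processing cycles by 45%. A corresponding hardware architecture is designed: a 5-stage pipeline implements the one-step interpolation algorithm, and an input buffer unit provides reuse of the reference data; a ping-pong buffering structure handles the large volume of data produced during interpolation so that data is passed on in time. The architecture significantly reduces bandwidth, increases throughput, and is fully suitable for real-time encoders.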
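For context, the conventional H.264 luma interpolation that such designs accelerate can be sketched in one dimension as below. This shows the standard two-stage scheme (6-tap half-sample filter, then rounded averaging for quarter samples), not the paper's one-step method, whose details the abstract does not give. The function names are illustrative.

```python
def clip255(x):
    """Clip a filtered value to the 8-bit sample range."""
    return max(0, min(255, x))


def half_pel(row, i):
    """Half-sample between row[i] and row[i + 1] with the 6-tap filter (1, -5, 20, 20, -5, 1).

    Requires 2 <= i <= len(row) - 4 so that all six taps lie inside the row.
    """
    e, f, g, h, i_, j = row[i - 2:i + 4]
    return clip255((e - 5 * f + 20 * g + 20 * h - 5 * i_ + j + 16) >> 5)


def quarter_pel(row, i):
    """Quarter-sample between row[i] and the half-sample to its right (rounded average)."""
    return (row[i] + half_pel(row, i) + 1) >> 1
```

Each quarter sample depends on a half sample, which in turn reads six integer samples; that dependency is the reason reference-data reuse and buffering matter for bandwidth.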

4.
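Implementation of the 2D DCT for an MPEG-4 encoder on the TMS320CDSC21   Total citations: 1, self-citations: 0, citations by others: 1
The paper first introduces the 2D DCT algorithm, then describes the architectural features of TI's TMS320CDSC21 chip on which the algorithm is implemented, and finally presents a design method for the 2D DCT that is simple to program, clearly structured, and efficient in execution.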
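A common way to compute the 2D DCT is the row-column decomposition: apply a 1D DCT to every row, then to every column of the result. The sketch below assumes the orthonormal DCT-II and plain floating-point arithmetic; the abstract does not say which formulation or fixed-point scheme the paper uses, so this is only an illustration of the transform itself.

```python
import math


def dct_1d(v):
    """Orthonormal 1D DCT-II of a sequence of numbers."""
    n = len(v)
    out = []
    for k in range(n):
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        total = sum(v[x] * math.cos((2 * x + 1) * k * math.pi / (2 * n)) for x in range(n))
        out.append(scale * total)
    return out


def dct_2d(block):
    """2D DCT by separability: 1D DCT on each row, then on each column."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d(list(c)) for c in zip(*rows)]  # transpose, transform columns
    return [list(r) for r in zip(*cols)]          # transpose back
```

For a constant 8×8 block all energy lands in the DC coefficient, which makes a convenient sanity check for any implementation.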

5.
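Yuan Jiwei (袁基炜), Guo Hui (国辉). 《电视技术》, 2007, 31(8): 15-17
A new method for optimizing reference-frame memory is proposed: the luma reference picture of each reference frame holds only the integer pixels inside the picture and in the extended border. During motion vector search, the integer motion vector is found first using the luma reference picture, and the fractional motion vector is then searched starting from the integer motion vector. At each search step, the integer-pixel sub-block that corresponds to the fractional motion vector of the search position is located in the luma reference picture, interpolated to obtain the required reference sub-block, and processed further to obtain the corresponding cost. The best motion vector is finally selected by the minimum-cost criterion. Experiments show that this optimization greatly reduces memory while leaving picture quality and bit rate almost unchanged.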

6.
The H.264/AVC video coding standard features diverse computational hot spots that need to be accelerated to cope with the significantly increased complexity compared to previous standards. In this paper, we propose an optimized application structure (i.e. the arrangement of functional components of an application determining the data flow properties) for the H.264 encoder which is suitable for application-specific and reconfigurable hardware platforms. Our proposed application structural optimization for the computational reduction of the Motion Compensated Interpolation is independent of the actual hardware platform that is used for execution. For a MIPS processor we achieve an average speedup of approximately 60× for Motion Compensated Interpolation. Our proposed application structure reduces the overhead for Reconfigurable Platforms by distributing the actual hardware requirements amongst the functional blocks. This increases the amount of available reconfigurable hardware per Special Instruction (within a functional block) which leads to a 2.84× performance improvement of the complete encoder when compared to a Benchmark Application with standard optimizations. We evaluate our application structure by means of four different hardware platforms.

7.
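Intra 4x4 prediction in an H.264 encoder has severe data dependencies, so its hardware design is hard to pipeline; this leads to a long critical path and low hardware utilization, making it a bottleneck in H.264 encoder design. To address this without reducing the number of prediction modes or adding system resources, this paper proposes a new architecture that uses original pixels for mode decision and reconstructed pixels for intra prediction, so that intra prediction and the reconstruction loop can be fully pipelined. Hardware utilization reaches essentially 100% with no noticeable PSNR loss. The proposed architecture completes intra 4x4 prediction of one macroblock in 215 clock cycles. Synthesized with the SMIC 0.13 μm process library, it runs at up to 250 MHz with an area of about 116K gates and supports real-time encoding of 4096x2160@30fps video sequences.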
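The real-time claim can be checked with a quick cycle budget. The sketch below assumes 16×16 macroblocks and counts only the 215 cycles per macroblock quoted for intra 4x4 prediction; any other per-macroblock work is outside this estimate. The function name is illustrative.

```python
import math


def required_clock_hz(width, height, fps, cycles_per_mb, mb_size=16):
    """Clock rate needed to process every macroblock of every frame in real time."""
    mbs_per_frame = math.ceil(width / mb_size) * math.ceil(height / mb_size)
    return mbs_per_frame * fps * cycles_per_mb


# 4096x2160 has 256 * 135 = 34560 macroblocks per frame.
needed = required_clock_hz(4096, 2160, 30, 215)
print(needed)  # 222912000
```

About 222.9 MHz is required, which fits within the reported 250 MHz maximum frequency.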

8.
9.
In addition to coding efficiency, the scalable extension of H.264/AVC provides good functionality for video adaptation in heterogeneous environments. Fine grain scalability (FGS) is a technique to extract the video bitstream at the finest quality level under the given bandwidth. In this paper, an architecture of an FGS encoder with low external memory bandwidth and low hardware cost is proposed. Up to 99% bandwidth reduction can be attained by the proposed scan bucket algorithm, early context modeling with context reduction, and first scan pre-encoding. The area-efficient hardware architecture is implemented by layer-wise hardware reuse. In addition, three design strategies for the enhancement layer coder are explored so that a trade-off between external memory bandwidth and silicon area is possible. The proposed hardware architecture can encode HDTV 1920×1080 video with two FGS enhancement layers in real time at a 200 MHz working frequency, or HDTV 1280×720 video with three FGS enhancement layers at a 130 MHz working frequency.

10.
This paper describes an objective evaluation of the coding performance of an interframe encoder (NETEC-22H). Also described is the coding performance improvement obtained with an adaptive bit sharing multiplexer (ABS-MUX), in which the transmission bit rate is dynamically allocated to several channels. Measurements made for actual broadcast TV programs over a period of 36 h show that an unweighted SNR higher than 50 dB is obtained by this coding equipment for 99 percent of the time at a transmission bit rate of 30 Mbits/s and for 93 percent of the time at 20 Mbits/s. The residual 1 percent at 30 Mbits/s or 7 percent at 20 Mbits/s is transmitted with a slightly lower SNR. The picture quality difference between 20 and 30 Mbits/s transmission is about 6 dB in SNR on average. It is also shown that a three-channel ABS-MUX (20 Mbits/s per channel on average) reduces the probability of coarse quantization by a factor of 5-10 compared with fixed bit rate transmission at 20 Mbits/s.

11.
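Design and implementation of an 8b/10b encoder/decoder for PCI-Express   Total citations: 4, self-citations: 3, citations by others: 1
Based on a study of the 8b/10b coding principle, this paper designs and implements in an FPGA the 8b/10b encoder/decoder of a PCI-Express bus controller. 8b/10b is a byte-oriented binary transmission code that is particularly well suited to data transmission over high-speed serial buses. Its basic property is that it guarantees DC balance: with 8b/10b coding, the numbers of transmitted "0"s and "1"s stay roughly equal, and runs of consecutive "1"s or "0"s do not exceed 5 bits. An 8b/10b encoder can be built from a 5b/6b encoder and a 3b/4b encoder.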
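The two properties named in this abstract, DC balance and bounded run length, can be checked on any candidate 10-bit symbol as sketched below. This is not an encoder and contains no code tables; the checks are necessary conditions only, and the function names are illustrative.

```python
def disparity(bits):
    """Number of ones minus number of zeros in a bit string such as '0101010101'."""
    return bits.count("1") - bits.count("0")


def max_run(bits):
    """Length of the longest run of identical consecutive bits."""
    longest = run = 0
    prev = None
    for b in bits:
        run = run + 1 if b == prev else 1
        prev = b
        longest = max(longest, run)
    return longest


def passes_8b10b_checks(symbol):
    """Necessary (not sufficient) conditions for a 10-bit 8b/10b symbol:
    disparity of 0 or +/-2, and no run longer than 5 bits."""
    return len(symbol) == 10 and disparity(symbol) in (-2, 0, 2) and max_run(symbol) <= 5
```

A symbol such as `1111110000` fails because it contains a run of six ones, while `0101010101` passes both checks.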

12.
A common method for selecting the best prediction mode based on a block matching algorithm is to compare, for each source block, the associated distortions among the available prediction candidates. Human visual perception is sensitive to luminance contrast rather than absolute luminance values. In fact, the human eye's ability to detect the magnitude difference between an object and its background depends on the average luminance of the background. The Perceptually Weighted Distortion (PWD) is a new distortion measure that can produce better image quality. In this paper, we propose to add a new feature to the PWD by introducing another diagonal component that yields a significant quality improvement. According to the experimental results, the enhanced PWD metric outperforms the original PWD and the SAD metric, especially in reducing block artifacts. This contribution increases implementation complexity. Therefore, an optimized implementation of the Enhanced PWD that exploits the C64 DSP core is presented. Standard Assembly (SA) is used to implement the different Enhanced PWD functions in order to exploit the C64 internal architecture and resources efficiently. Experimental results show more than 85% improvement in cycle cost compared to C code.
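The baseline that this abstract compares against, choosing the candidate with the lowest SAD (sum of absolute differences), can be sketched as follows. The PWD and Enhanced PWD formulas are not given in the abstract, so only SAD is shown; the function names and the nested-list block layout are assumptions for illustration.

```python
def sad(block, prediction):
    """Sum of absolute differences between a source block and a prediction of the same size."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block, prediction)
        for a, b in zip(row_a, row_b)
    )


def best_candidate(block, candidates):
    """Index of the prediction candidate with the smallest distortion (first one on ties)."""
    return min(range(len(candidates)), key=lambda i: sad(block, candidates[i]))
```

Swapping `sad` for a perceptually weighted measure changes only the distortion function; the selection loop stays the same.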

13.
In this paper, we propose an architecture for fast H.264/AVC intra-prediction-mode decision making in high-resolution real-time applications. Intra-prediction-mode decision making accounts for many of the computations in H.264/AVC video coding and also requires extra time for mode generation. Hence, it is a bottleneck in the execution of high-resolution real-time applications. To improve the intra-prediction-mode decision, we use an algorithm that, based on the edge information of an object, reduces the number of mode prediction estimations by 66%, with negligible loss of video quality and a small increase in the bit rate of the video stream. We propose a low-cost architecture with gate count reduced by 50% compared with the former design. The total gate count is 86,671 and the maximum operating frequency is 250 MHz using TSMC 0.18 μm cell-based technology. The experimental results show that our design is competitive with most modern designs for high-resolution, real-time video processing.

14.
The increasing importance of safety and availability of power plant units demands more detailed and proven reliability data for plant components than operational experience has provided until now. In the Federal Republic of Germany an extensive collection of field data from a conventional power plant has been compiled since 1972. The authors give a survey of the information system that processes these field data in order to establish empirically tested reliability data for plant components.

15.
