Similar articles
18 similar articles found.
1.
王尧  汤心溢 《红外技术》2020,42(4):335-339,347
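Based on the H.265/HEVC video coding standard, this paper implements a hardware pipeline structure for the regular coding mode of the binary arithmetic coder in CABAC. The encoder's hardware architecture is designed and optimized around the characteristics of the algorithm: probability state data are stored in SRAM, and a lookup table is used to optimize the probability-estimation update; the data to be coded are packed to simplify the computation introduced by the probability-estimation update, which speeds up coding of the video data stream; and the binary arithmetic coder uses a multi-stage pipeline that supports four-way parallel coding. Simulation results show that the proposed hardware CABAC binary arithmetic coder encodes 4 bins per clock cycle on average, which meets the requirements of real-time coding of 1080p video at relatively high frame rates.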

2.
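A syntax-element splitting and merging strategy for the binarization modeler is proposed, together with the corresponding hardware structure. It makes the distribution of bin-string lengths more concentrated and flatter, which improves the throughput and hardware utilization of the context-based adaptive binary arithmetic coder (CABAC) in H.264/AVC. A producer/consumer model of the CABAC encoder is used for data collection and analysis. The analysis shows that merging specific syntax elements to increase the average bin-string length improves CABAC encoder throughput, while splitting specific syntax elements so that the peak bin-string length is smaller and the variation more uniform increases hardware utilization. Simulation results show that the design processes 1.89 bins per cycle on average; synthesized in a 0.13 μm CMOS process, it reaches an operating frequency of 303 MHz and occupies 22.26K gates.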

3.
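Architecture design of a parallel CAVLC encoder for H.264   Total citations: 1 (self-citations: 0, citations by others: 1)
For the latest video compression standard, H.264/AVC, a parallel CAVLC encoder architecture is proposed. The encoder processes the CAVLC syntax elements in parallel, which reduces the number of clock cycles needed to encode the quantized transform coefficients. Experiments in Altera's Quartus II FPGA development software show that the encoder can encode 1920×1080, 30 fps video in real time.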

4.
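For the DVB-C2 standard, a parallel BCH encoder is designed and implemented on an Altera EP3C55 FPGA. Experimental results show that the proposed parallel encoder is fast, offers high throughput, and has practical engineering value.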

5.
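Hardware design and implementation of a parallel Huffman encoder   Total citations: 5 (self-citations: 4, citations by others: 1)
This article designs a hardware Huffman encoder that uses pipelining and parallel coding so that one byte of data can be encoded per clock cycle, which significantly lowers the operating frequency required during encoding. Implementation schemes for the key parts are given and the experimental results are analyzed.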

6.
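This paper studies a general design method and optimized structures for parallel BCH and RS encoding circuits. When the message length is not divisible by the degree of parallelism, zeros are prepended to the message bits, which solves the problem without changing the structure of the parallel encoder.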
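The zero-prepending idea can be illustrated with a minimal Python sketch. It is not the paper's circuit: the generator polynomial (the BCH(15,7) generator x^8+x^7+x^6+x^4+1), the parallelism of 4, and all function names are assumptions chosen for illustration, and the "parallel" version simply emulates a p-bit-per-clock encoder by running p serial LFSR steps per block. Because the division register starts at zero, leading zero bits leave it at zero, so the parity bits are unchanged.

```python
def lfsr_step(state, bit, gen, deg):
    """Shift one message bit (MSB first) into a division LFSR for generator `gen`."""
    mask = (1 << deg) - 1
    feedback = ((state >> (deg - 1)) & 1) ^ bit
    state = (state << 1) & mask
    if feedback:
        state ^= gen & mask  # low `deg` coefficients of the generator
    return state


def parity_serial(bits, gen, deg):
    """Parity of a systematic cyclic code: (m(x) * x^deg) mod g(x), one bit per step."""
    state = 0
    for b in bits:
        state = lfsr_step(state, b, gen, deg)
    return state


def parity_parallel(bits, gen, deg, p):
    """Emulate a p-bit-per-clock encoder; zeros are prepended so len is a multiple of p."""
    pad = (-len(bits)) % p
    padded = [0] * pad + list(bits)
    state = 0
    for i in range(0, len(padded), p):
        # In hardware these p steps would be unrolled into one combinational block.
        for b in padded[i:i + p]:
            state = lfsr_step(state, b, gen, deg)
    return state


if __name__ == "__main__":
    GEN, DEG = 0x1D1, 8              # assumed example generator, degree 8
    message = [1, 0, 1, 1, 0, 0, 1]  # 7 bits, not a multiple of 4
    print(parity_serial(message, GEN, DEG) == parity_parallel(message, GEN, DEG, 4))
```

Appending zeros instead would multiply the message polynomial by a power of x and change the parity, which is why the padding goes in front.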

7.
An efficient parallel H.264 encoding algorithm   Total citations: 4 (self-citations: 1, citations by others: 3)
孙书为  陈书明 《电子学报》2009,37(2):357-361
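CABAC is the entropy coding mechanism adopted in the Main profile of the H.264/AVC video compression standard. Combined with RDO mode selection, it can reduce the bit rate by 20%, but it also greatly increases the encoder's computational complexity. Parallelizing the algorithm is an effective way to speed up encoding. However, because CABAC is adaptive and RDO mode selection invokes entropy coding, strict data dependencies exist between sequentially coded macroblocks, which limits the development of parallel encoding algorithms. Combining MBRP, a data-level parallel encoding mechanism based on macroblock region partitioning, with rate estimation, this paper provides an efficient parallel encoding scheme for H.264 encoders that use CABAC: the H.264 encoding algorithm is divided into a mode selection part and a bitstream generation part, which form a typical producer-consumer relationship; CABAC in RDO mode selection is replaced with rate estimation, which removes the strict data dependencies that CABAC causes during mode selection; the MBRP parallel mechanism is applied to the mode selection part; and bitstream generation is handled by a separate processor, pipelined in parallel with mode selection. Experiments on a 4-processor system simulator show that the parallel algorithm reaches a speedup of 4.7 while keeping video compression performance almost unchanged.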

8.
赵兴  沈海斌  阳晔 《电视技术》2005,(4):99-102
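A new sample-parallel, low-clock-count EBCOT encoder architecture for JPEG2000 is proposed that effectively reduces processing time. The architecture comprises one bit-plane coder that codes 4 samples per cycle and 3 context-adaptive binary arithmetic coders that each code 2 bits per cycle. Building on pass-parallel coding, the bit-plane coder further parallelizes the coding of the 4 samples in each column. The binary arithmetic coders are pipelined to match the high throughput of the bit-plane coder.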

9.
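Given the characteristics of the new video compression standard H.264, Intel's Hyper-Threading technology and OpenMP can be used to let a software encoder run thread-level parallel computations, which greatly increases encoding speed. This paper analyzes Hyper-Threading technology, OpenMP, and the different levels of parallelism in the encoding process.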

10.
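RS codes are linear block codes with strong error-correcting capability that can greatly improve communication system performance. A 32-bit parallel RS(255,239) encoder is designed and implemented for gigabit passive optical networks (GPON). The design is written in Verilog HDL, and its functional correctness is verified with Quartus II. The results show that the encoder is fast, uses few resources, and meets the high-speed data transmission requirements of GPON systems.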

11.
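Application study of CABAC in the H.264/AVC standard   Total citations: 3 (self-citations: 0, citations by others: 3)
H.264/AVC is a new-generation video coding standard jointly developed by the International Telecommunication Union (ITU) and the International Organization for Standardization (ISO). Its entropy coding scheme adopts context-based adaptive binary arithmetic coding (CABAC). CABAC is an efficient entropy coding method: it uses context modeling to reduce inter-symbol redundancy and adapts to the statistics of the bitstream, which yields high coding efficiency. The binarization, context modeling, and adaptive binary arithmetic coder in CABAC are studied in depth, and corresponding experiments are carried out. The experimental results show that, at the same image quality, CABAC saves 6% to 15% of the bit rate compared with CAVLC.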

12.
This paper uses joint algorithm and architecture design to enable high coding efficiency in conjunction with high processing speed and low area cost. Specifically, it presents several optimizations that can be performed on Context Adaptive Binary Arithmetic Coding (CABAC), a form of entropy coding used in H.264/AVC, to achieve the throughput necessary for real-time, low-power, high-definition video coding. The combination of syntax element partitions and interleaved entropy slices, referred to as Massively Parallel CABAC, increases the number of binary symbols that can be processed in a cycle. Subinterval reordering is used to reduce the cycle time required to process each binary symbol. Under common conditions using the JM12.0 software, Massively Parallel CABAC increases the bins per cycle by 2.7 to 32.8× at a cost of 0.25 to 6.84% coding loss compared with sequential single-slice H.264/AVC CABAC. It also provides a 2× reduction in area cost and reduces memory bandwidth. Subinterval reordering reduces the critical path delay by 14 to 22%, while modifications to context selection reduce the memory requirement by 67%. This work demonstrates that accounting for implementation cost during video coding algorithm design can enable higher processing speed and reduce hardware cost, while still delivering high coding efficiency in the next-generation video coding standard.

13.
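Implementation of a fast CRC-16 encoder   Total citations: 2 (self-citations: 0, citations by others: 2)
Starting from a cyclic error-detecting encoder with a serial shift structure, this paper derives, through analysis, a fast encoder with a parallel feedback shift register structure that processes 8 bits at a time. The connection matrix is computed, two lookup tables for software encoding and decoding are produced, and flowcharts for software encoding and decoding are given. The fast encoder described here has been used in a real system.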
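As a rough software counterpart to the 8-bits-at-a-time idea, the sketch below derives a 256-entry lookup table from a bit-serial CRC-16 and then consumes one byte per step. The abstract does not name the generator polynomial or initial value; the polynomial 0x1021 with a zero initial register (the variant commonly called CRC-16/XMODEM) is an assumption, as are the function names.

```python
POLY = 0x1021  # assumed generator x^16 + x^12 + x^5 + 1; the source does not specify one


def crc16_bitwise(data, crc=0):
    """Serial shift-register CRC: one bit per inner iteration, MSB first."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ POLY) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


# Each table entry is the register contents after shifting in one byte from a zero state.
TABLE = [crc16_bitwise(bytes([i])) for i in range(256)]


def crc16_table(data, crc=0):
    """Byte-parallel CRC: eight serial shifts collapsed into one table lookup."""
    for byte in data:
        crc = ((crc << 8) & 0xFFFF) ^ TABLE[(crc >> 8) ^ byte]
    return crc


if __name__ == "__main__":
    msg = b"123456789"
    print(hex(crc16_bitwise(msg)), hex(crc16_table(msg)))
```

Both functions return the same value for any input; for the standard check string `b"123456789"` this variant gives 0x31C3. The table plays the role that the connection matrix plays in a hardware realization: it encodes the effect of eight consecutive shifts as a single linear map.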

14.
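The CABAC entropy coding used in the H.264 Main profile improves the video compression ratio but severely increases the computational complexity of encoding and decoding. Because embedded systems must be low-cost and low-power, they need a dedicated hardware accelerator for CABAC encoding and decoding. A high-performance H.264 CABAC hardware accelerator is designed; it can be configured in encoding or decoding mode and performs CABAC encoding and decoding efficiently. In performance evaluation experiments at a 220 MHz clock frequency, the accelerator achieves an average encoding speed of 147 Mbps (1.5 cycles/bit) and a decoding speed of 220 Mbps (1 cycle/bit). Compared with a software implementation, the accelerator delivers a performance improvement of more than 50×.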

15.
Low-Area/Power Parallel FIR Digital Filter Implementations   Total citations: 4 (self-citations: 0, citations by others: 4)
This paper presents a novel approach for implementing area-efficient parallel (block) finite impulse response (FIR) filters that require less hardware than traditional block FIR filter implementations. Parallel processing is a powerful technique because it can be used to increase the throughput of an FIR filter or reduce its power consumption. However, a traditional block filter implementation causes a linear increase in the hardware cost (area) by a factor of L, the block size. In many design situations, this large hardware penalty cannot be tolerated. Therefore, it is important to design parallel FIR filter structures that require less area than traditional block FIR filtering structures. In this paper, we propose a method to design parallel FIR filter structures that require a less-than-linear increase in the hardware cost. A novel sub-structure sharing technique based on adjacent coefficient sharing is introduced and used to reduce the hardware cost of parallel FIR filters. A novel coefficient quantization technique, referred to as a scalable maximum absolute difference (MAD) quantization process, is introduced and used to produce quantized filters with good spectrum characteristics. By using a combination of fast FIR filtering algorithms, a novel coefficient quantization process, and area reduction techniques, we show that parallel FIR filters can be implemented with up to a 45% reduction in hardware compared to traditional parallel FIR filters.

16.
A simple and adaptive lossless compression algorithm is proposed for remote sensing image compression, which combines an integer wavelet transform with the Rice entropy coder. By analyzing the probability distribution of the integer wavelet transform coefficients and the characteristics of the Rice entropy coder, a divide-and-conquer approach is applied to the high-frequency sub-bands and the low-frequency sub-band. The high-frequency sub-bands are coded directly by the Rice entropy coder, while the low-frequency coefficients are predicted before coding. The role of the predictor is to map the low-frequency coefficients into symbols suitable for entropy coding. Experimental results show that the average Compression Ratio (CR) of the approach is about two, which is close to that of JPEG 2000. The algorithm is simple and easy to implement in hardware. Moreover, it is adaptive and produces independent data packets, so it is suited to lossless compression applications in space.
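To make the Rice coding step concrete, here is a minimal Python sketch of a Rice coder for signed integers. It is not the paper's coder: the zigzag mapping from signed coefficients to non-negative symbols, the fixed parameter `k`, and the bit-string output are all assumptions made for readability. Each symbol `u` is written as the quotient `u >> k` in unary (ones terminated by a zero) followed by the low `k` bits of `u`.

```python
def zigzag(v):
    """Map signed integers to non-negative ones: 0, -1, 1, -2, 2 -> 0, 1, 2, 3, 4."""
    return 2 * v if v >= 0 else -2 * v - 1


def unzigzag(u):
    """Inverse of zigzag()."""
    return u >> 1 if u % 2 == 0 else -((u + 1) >> 1)


def rice_encode(values, k):
    """Encode signed integers as a string of '0'/'1' with Rice parameter k."""
    out = []
    for v in values:
        u = zigzag(v)
        out.append("1" * (u >> k) + "0")                      # unary quotient
        if k:
            out.append(format(u & ((1 << k) - 1), f"0{k}b"))  # k-bit remainder
    return "".join(out)


def rice_decode(bits, k, count):
    """Decode `count` signed integers from a Rice-coded bit string."""
    values, pos = [], 0
    for _ in range(count):
        q = 0
        while bits[pos] == "1":
            q += 1
            pos += 1
        pos += 1                                              # skip the terminating '0'
        r = int(bits[pos:pos + k], 2) if k else 0
        pos += k
        values.append(unzigzag((q << k) | r))
    return values


if __name__ == "__main__":
    coeffs = [0, -1, 3, 2, -4, 0, 1]
    code = rice_encode(coeffs, k=2)
    print(code, rice_decode(code, 2, len(coeffs)) == coeffs)
```

Small magnitudes get short codewords, which is why such a coder suits high-frequency wavelet coefficients clustered around zero and why low-frequency coefficients would first need a predictor to turn them into small residuals.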

17.
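A general network-coding encoding and decoding algorithm implemented in hardware logic is proposed. The encoding algorithm applies random linear network coding to data packets, and the decoding algorithm uses Cramer's rule. The algorithms and structures of the encoder and decoder are designed in detail, and the design is implemented on a NetFPGA development board using a hardware description language. Test results show that, compared with conventional routing nodes, a network using the line-rate network codec can reach the throughput limit given by the max-flow min-cut theorem, and the end-to-end transmission delay stabilizes at a small constant.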
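The combination of random linear network coding and Cramer's rule decoding can be sketched in a few lines of Python. This is only an illustration under stated assumptions: arithmetic is done in the prime field GF(257) for simplicity (the abstract does not state the field, and hardware designs often use GF(2^8) instead), determinants are computed by cofactor expansion, which is only practical for small generation sizes, and all names are hypothetical.

```python
import random

P = 257  # assumed prime field size; the source does not specify the field


def det_mod(m):
    """Determinant modulo P by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0] % P
    total = 0
    for j in range(len(m)):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det_mod(minor)
    return total % P


def random_coeffs(n, rng):
    """Draw an n x n coefficient matrix, retrying until it is invertible mod P."""
    while True:
        c = [[rng.randrange(P) for _ in range(n)] for _ in range(n)]
        if det_mod(c) != 0:
            return c


def encode(packets, coeffs):
    """Each coded packet is a linear combination of the source packets."""
    length = len(packets[0])
    return [[sum(c * pkt[s] for c, pkt in zip(row, packets)) % P
             for s in range(length)] for row in coeffs]


def decode(coded, coeffs):
    """Recover the source packets with Cramer's rule: x_i = det(A_i) / det(A)."""
    d = det_mod(coeffs)
    if d == 0:
        raise ValueError("coefficient matrix is singular")
    inv = pow(d, P - 2, P)  # modular inverse via Fermat's little theorem
    n, length = len(coeffs), len(coded[0])
    out = []
    for i in range(n):
        pkt = []
        for s in range(length):
            # A_i: replace column i of the coefficient matrix by the received symbols.
            a_i = [row[:i] + [coded[r][s]] + row[i + 1:] for r, row in enumerate(coeffs)]
            pkt.append(det_mod(a_i) * inv % P)
        out.append(pkt)
    return out


if __name__ == "__main__":
    rng = random.Random(1)
    source = [[10, 20, 30], [40, 50, 60], [7, 8, 9]]
    coeffs = random_coeffs(3, rng)
    print(decode(encode(source, coeffs), coeffs) == source)
```

Cramer's rule is attractive in hardware because every unknown is an independent determinant ratio, so the determinants can be evaluated in parallel rather than through the sequential row operations of Gaussian elimination.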

18.
The embedded block coding with optimized truncation (EBCOT) algorithm is the heart of the JPEG 2000 image compression system. The MQ coder used in this algorithm restricts the throughput of EBCOT because the procedures it performs are highly interdependent. To overcome this obstacle, a high-throughput MQ coder architecture is presented in this paper. To this end, we studied the number of rotations performed and the rate of byte emission in an image. The study reveals that, on average, one shift occurs 75.03% of the time and two shifts occur 22.72% of the time. Similarly, two bytes are emitted concurrently about 5.5% of the time. Based on these facts, a new MQ coder architecture is proposed that is capable of consuming one symbol per clock cycle. The throughput of the coder is improved by operating the renormalization and byte-out stages concurrently. To reduce the hardware cost, synchronous shifters are used instead of hard shifters. The proposed architecture is implemented on a Stratix FPGA and is capable of operating at 145.9 MHz. Its memory requirement is reduced by at least 66% compared with other existing architectures. A relative figure of merit is computed to compare the overall efficiency of all architectures; it shows that the proposed architecture provides a good balance between throughput and hardware cost.
