Similar literature
 20 similar documents found (search time: 760 ms)
1.
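Design and implementation of a reconfigurable stream processor for video processing   Total citations: 1 (self-citations: 0, citations by others: 1)
This paper designs a new reconfigurable multimedia stream processor for multimedia processing. The processor uses a parallel processing mechanism; once algorithms have been mapped onto it, it can fully exploit the high parallelism of multimedia algorithms and process different multimedia algorithms concurrently in real time. The architecture was verified on a Xilinx Virtex4 chip and, together with an ARM9 processor, forms an embedded multimedia processing platform on which H.264 and AVS decoding were verified.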

2.
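Reconfigurable systems combine flexibility within an application domain with performance close to that of dedicated circuits, which makes them an excellent hardware solution for video decoding. Real-time high-definition decoding on a reconfigurable system is still difficult, however: about 80% of the computation is spent on compute-intensive tasks such as IDCT (inverse discrete cosine transform), MC (motion compensation), intra-prediction, and deblocking (deblocking filtering). Based on a coarse-grained reconfigurable processor, this paper proposes mapping schemes for these compute-intensive algorithms. Their performance exceeds that of the schemes published by M. Ganesan and D. Peng in 2007 and 2009, and meets the requirements of real-time H.264 high-definition decoding.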

3.
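This article introduces a new reconfigurable SoC circuit, describes its internal structure and features in some detail, and defines application schemes that reconfigure the SPI and DDS modules respectively in order to verify the circuit. In the application schemes, the SPI is connected to a VS1003: song data is controlled and sent to the VS1003 through the SPI interface, the VS1003 decodes the data, and the result finally drives a power amplifier to play the song. The DDS module is used to generate signal data, which after D/A conversion...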

4.
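A general macroblock-parallel decoding order for H.264 intra-frame decoding is designed that avoids data conflicts during decoding. On this basis, a macroblock-parallel intra-prediction decoding unit with reusable memory and computation units is designed, which raises decoding speed while keeping resource overhead as low as possible. Speed tests of the parallel decoder and DC synthesis results verify the effectiveness of the reusable macroblock-parallel intra decoder VLSI architecture; the average decoding time reaches 113 cycles per macroblock.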
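The abstract above does not spell out its decoding order. As a hedged illustration only, the sketch below uses the common 2D wavefront schedule, in which macroblock (x, y) is decoded at step x + 2y so that its left, top-left, top, and top-right neighbours are always finished first. The function names and the choice of the x + 2y rule are assumptions for illustration, not the paper's method.

```python
from collections import defaultdict


def wavefront_schedule(mb_cols, mb_rows):
    """Group macroblocks into waves; all macroblocks in one wave are independent."""
    waves = defaultdict(list)
    for y in range(mb_rows):
        for x in range(mb_cols):
            # Assumed rule: step index x + 2*y (classic intra wavefront).
            waves[x + 2 * y].append((x, y))
    return [waves[t] for t in sorted(waves)]


def respects_dependencies(schedule):
    """Check that every intra neighbour is decoded in a strictly earlier wave."""
    step = {mb: t for t, wave in enumerate(schedule) for mb in wave}
    for (x, y), t in step.items():
        # Neighbours used by intra prediction: left, top-left, top, top-right.
        for dx, dy in ((-1, 0), (-1, -1), (0, -1), (1, -1)):
            neighbour = (x + dx, y + dy)
            if neighbour in step and step[neighbour] >= t:
                return False
    return True


if __name__ == "__main__":
    for t, wave in enumerate(wavefront_schedule(4, 3)):
        print(t, wave)
```

With this rule a frame of W x H macroblocks (W at least 2) needs W + 2(H - 1) steps instead of W x H, which is where the parallel speed-up comes from.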

5.
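A reusable top-level architecture compatible with multi-standard video decoding is proposed to meet the low-cost design requirements of multi-standard video decoder chips. The top-level reuse mechanism is analyzed in terms of the top-level decoding structure, syntax element parsing, reference frame management, and bitstream buffer management, and a concrete implementation plan is given. Finally, the feasibility of the design is verified with a C model.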

6.
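This article analyzes the operation characteristics and processing structures of the major block cipher algorithms and, drawing on design methods for reconfigurable processing structures, proposes a reconfigurable cipher processing architecture. A verification prototype based on this architecture is designed and implemented. Analysis shows that the block cipher algorithms executed on the prototype all achieve high performance.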

7.
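Lu Xiaofeng (陆晓凤), Liu Feng (刘锋), Tong Dong (佟冬), Wang Keyi (王克义). Acta Electronica Sinica (《电子学报》), 2011, 39(5): 1072-1076
For all the transforms added in H.264 Fidelity Range Extensions (FRExt, High Profile) decoding, this paper applies two-dimensional matrix decomposition and common-factor extraction based on matrix operations, and designs an efficient reconfigurable VLSI architecture built from general-purpose arithmetic units. The architecture not only saves area (the reconfigurable transform structure uses only 4807 gates) but also delivers high performance (using TSM...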

8.
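For distributed video coding systems without a feedback channel, a robust reconstruction algorithm for Wyner-Ziv frames is proposed. To address the loss of video quality caused by bit-plane decoding errors, DC and AC coefficients are reconstructed with different methods. In particular, for DC coefficient quantization values that fail to decode, correlation information from the original image at the encoder is used to adaptively adjust the contributions of the side-information quantization value and the failed quantization value to the reconstruction. Experimental results show that, compared with the minimum mean square error reconstruction algorithm, the proposed algorithm effectively improves the average PSNR (peak signal-to-noise ratio) of the decoded video and clearly improves its subjective quality.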

9.
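This paper presents SystemC-based cycle-accurate modeling of a reconfigurable application-specific processor core. The model is modular and uses SystemC transaction-level modeling, separating computation from communication; modules communicate through function calls. The model provides a simulation and verification platform for the reconfigurable application-specific processor core and, compared with traditional RTL verification, greatly improves simulation and verification efficiency.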

10.
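Implementation of soft decoding of NR codes in DTMB   Total citations: 1 (self-citations: 1, citations by others: 0)
An effective soft-input soft-output NR decoding method is proposed for the digital terrestrial television broadcasting standard (DTMB). The algorithm is based on soft-input maximum-likelihood decoding and uses the most-correlated and second-most-correlated codewords to generate the output soft information. Simulation shows that with this NR decoding scheme, LDPC decoding performance at the P8 code rate improves by about 3.6 dB.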
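The abstract above gives no formula for the soft output. The sketch below shows one simple way the idea could work, under loudly stated assumptions: bits are mapped to BPSK symbols (0 to +1, 1 to -1), every codeword is correlated with the received soft values, and the gap between the best and second-best correlation is used as the reliability of every output bit. The 4-codeword `TOY_CODEBOOK` is hypothetical and is not the NR code used in DTMB.

```python
# Hypothetical toy codebook for illustration only (NOT the DTMB NR code).
TOY_CODEBOOK = [
    [0, 0, 0, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [1, 1, 1, 1],
]


def soft_decode(received, codebook):
    """Return (best codeword, soft outputs) from correlation-based ML decoding."""
    scored = []
    for codeword in codebook:
        symbols = [1.0 - 2.0 * bit for bit in codeword]  # bit 0 -> +1, bit 1 -> -1
        correlation = sum(r * s for r, s in zip(received, symbols))
        scored.append((correlation, codeword))
    scored.sort(key=lambda item: item[0], reverse=True)
    (best_corr, best_cw), (second_corr, _) = scored[0], scored[1]
    # Assumed reliability rule: margin between the two strongest candidates.
    margin = best_corr - second_corr
    soft = [(1.0 - 2.0 * bit) * margin for bit in best_cw]
    return best_cw, soft


if __name__ == "__main__":
    hard, soft = soft_decode([0.9, -1.1, 0.8, -0.2], TOY_CODEBOOK)
    print(hard, soft)
```

A small margin signals that two codewords explain the received values almost equally well, so a downstream decoder (LDPC in the abstract's setting) should trust those bits less.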

11.
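As one of the most computation-heavy modules, motion compensation accounts for about 70% of the bandwidth between the decoder and off-chip data memory, making it the bottleneck for ultra-high-definition video decoding. The proposed cache-based HEVC motion compensation module reduces bandwidth consumption by 80% while sustaining the data throughput needed for real-time decoding. First, an interpolation module built from reusable filters and a 2D cache are used to design a motion compensation module with parallel, pipelined data processing, meeting the high throughput required during computation. Second, an efficient internal RAM structure is designed, and an effective scheme for reducing on-chip cache power is proposed. Finally, exploiting the correlation of reference frame data, the interpolation order is rearranged, which reduces the cache hardware overhead by 87.5%. Experimental results on HEVC standard test sequences based on HM9.0 show that the design significantly reduces bandwidth consumption and hardware overhead.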

12.
Input vector monitoring concurrent on-line BIST based on multilevel decoding logic is an attractive approach to reducing hardware overhead. In this paper, a novel optimization scheme is proposed to further reduce the hardware overhead of the decoding structure; it comprises improved decoding, input reduction, and simulated-annealing input swapping. Furthermore, using similar multilevel decoding logic as the response verifier, a novel cost-efficient input vector monitoring concurrent on-line BIST scheme is presented. The proposed scheme is applicable to concurrent on-line testing of a CUT whose internal details cannot be obtained, such as hard IP cores. Experimental results indicate that the proposed optimization approaches significantly reduce the hardware overhead of the decoding structure, and that the proposed scheme requires less hardware than other existing schemes.

13.
In this paper, we propose a method for speeding up Digital Signal Processing applications by partitioning them between reconfigurable hardware blocks of different granularity and mapping the critical parts of applications onto coarse-grain reconfigurable hardware. The reconfigurable hardware blocks are embedded in a heterogeneous reconfigurable system architecture. The fine-grain part is implemented by an embedded FPGA unit, while the coarse-grain part uses a high-performance coarse-grain data-path that we developed. The design flow consists mainly of three steps: the analysis procedure, the mapping onto coarse-grain blocks, and the mapping onto the fine-grain hardware. The methodology is validated using five real-life applications: an OFDM transmitter, a medical imaging technique, a wavelet-based image compressor, a video compression scheme, and a JPEG encoder. The experimental results show that the speedup, relative to an all-FPGA solution, ranges from 1.55 to 4.17 for the considered applications.

14.
Reconfigurable Hardware Architectures for Sequential and Hybrid Decoding   Total citations: 1 (self-citations: 0, citations by others: 1)
A novel reconfigurable sequential decoder architecture based on the Fano algorithm is presented in which the constraint length, the threshold spacing, and the time-out threshold are all run-time reconfigurable. To maximize decoding performance, the maximum possible backward depth (a whole frame) is supported. This is achieved by using shift registers combined with memory to store the information of an entire visited path. A field-programmable gate array (FPGA) prototype of the decoder is built, and actual hardware decoding performance in terms of decoding speed, bit error rate (BER), and buffer overflow rate is measured and compared. To overcome the decoding delay that is inherent in sequential decoders, a hybrid scheme including simple block codes and a cyclic redundancy check is proposed to limit the number of backward search operations that the sequential decoder has to execute. As a result, a significant reduction in decoding delay and buffer overflow rate is achieved while maintaining comparable decoding performance in terms of BER.

15.
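A reconfiguration loading method based on a software-defined radio platform is introduced. By studying the hardware architecture of reconfigurable software radio, the principles and protocols of reconfiguration loading for FPGA executable devices, and the high-speed parallel external memory interface (EMIF) of Davinci-series processors, a DSP + FPGA reconfiguration loading scheme is proposed, and the FPGA device driver and reconfiguration loading software are implemented. Experimental results show that the scheme loads waveform files and switches seamlessly between communication modes quickly, accurately, and reliably.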

16.
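Turbo codes offer excellent performance approaching the Shannon capacity limit. This paper introduces the Turbo encoding scheme used in deep-space communication and the corresponding decoding algorithms, and gives both the software-simulated and the hardware-measured performance of the deep-space CCSDS-standard Turbo code with a modified Max-Log-MAP decoding algorithm. Computer simulation and hardware measurements show that a Turbo decoder using the modified Max-Log-MAP algorithm is easy to implement in hardware, and that simulated and measured performance agree, making it suitable for practical engineering use.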
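To make the hardware appeal of Max-Log-MAP concrete, the sketch below contrasts the exact Jacobian logarithm, ln(e^a + e^b) = max(a, b) + ln(1 + e^(-|a - b|)), with the Max-Log approximation that drops the correction term. The abstract does not say how its Max-Log-MAP is modified; the `scale_extrinsic` helper and its default factor of 0.75 show one widely used modification (scaling the extrinsic information) and are an assumption, not the paper's algorithm.

```python
import math


def max_star(a, b):
    """Exact Jacobian logarithm used by Log-MAP: ln(exp(a) + exp(b))."""
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))


def max_log(a, b):
    """Max-Log-MAP approximation: keep only the max, drop the correction term."""
    return max(a, b)


def scale_extrinsic(llrs, factor=0.75):
    """Assumed modification: damp extrinsic LLRs to offset Max-Log optimism."""
    return [factor * llr for llr in llrs]


if __name__ == "__main__":
    for a, b in [(0.0, 0.0), (2.0, -1.0), (5.0, 4.5)]:
        print(a, b, max_star(a, b) - max_log(a, b))
```

The dropped correction term lies between 0 and ln 2, so the approximation needs only a comparator per operation, with no lookup table or exponentials.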

17.
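Turbo product codes are a class of block codes well suited to hardware implementation of high-speed iterative decoding. The soft-input soft-output iterative decoding algorithm for Turbo product codes is analyzed. Combining Turbo product codes with QAM modulation, a simplified joint demodulation and decoding scheme that is easy to implement in hardware is proposed. Simulation results show that the simplification has very little effect on decoding performance.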

18.
This paper presents novel very large scale integration (VLSI) architectures in support of an efficient implementation of Leighton's well-known Columnsort. The designs take advantage of reconfigurable bus architectures enhanced with simple shift switches. Our first main contribution is to show that Columnsort can be partitioned into two components: a hardware scheme involving the task of sorting arrays of small size and a hardware or software scheme that involves simple data movement tasks. Our second main contribution is to demonstrate that the dynamically reconfigurable mesh architecture can be exploited to obtain a small and efficient hardware sorter. The resulting architectures feature high regularity of circuitry, simplicity of control structure, and adaptability. Both theoretical analyses and simulation tests have shown that the proposed VLSI architectures for sorting are superior to existing designs in the context of sorting small and moderate size arrays.
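The split the abstract describes, small column sorts plus simple data movement, is visible in a plain software version of Columnsort. The sketch below is a software illustration only, not the paper's VLSI design: it stores an r x s matrix in column-major order, uses Python's `sorted` as a stand-in for the small-array sorter, and assumes the conservative size conditions r even, s divides r, and r >= 2s². The function and helper names are hypothetical.

```python
def columnsort(values, r, s):
    """Sort r*s values with the eight Columnsort steps (software sketch)."""
    n = r * s
    # Conservative size conditions assumed here: r even, s | r, r >= 2*s*s.
    if len(values) != n or r % 2 or r % s or r < 2 * s * s:
        raise ValueError("need len(values) == r*s, r even, r % s == 0, r >= 2*s*s")

    a = list(values)  # column-major: column c occupies a[c*r:(c+1)*r]

    def sort_windows(start):
        # "Small sorter" component: sort each full run of r entries from `start`.
        for lo in range(start, n - r + 1, r):
            a[lo:lo + r] = sorted(a[lo:lo + r])

    def slot(k):
        # Column-major index of the k-th cell visited in row-major order.
        return (k % s) * r + (k // s)

    sort_windows(0)                        # step 1: sort columns
    spread = [None] * n
    for k in range(n):                     # step 2: read column-major, write row-major
        spread[slot(k)] = a[k]
    a[:] = spread
    sort_windows(0)                        # step 3: sort columns
    a[:] = [a[slot(k)] for k in range(n)]  # step 4: inverse of step 2
    sort_windows(0)                        # step 5: sort columns
    sort_windows(r // 2)                   # steps 6-8: shift by r/2, sort, unshift
    return a


if __name__ == "__main__":
    data = [(i * 37) % 54 for i in range(54)]
    print(columnsort(data, 18, 3) == sorted(data))
```

Steps 2 and 4 are fixed permutations and steps 6 to 8 reduce to sorting windows offset by half a column, so only `sort_windows` does any comparison work.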

19.
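Wang Hong (王虹), Chen Kai (陈锴). Information Technology (《信息技术》), 2005, 29(7): 29-31
The encoding and decoding principles of the ITU-T G.729 speech compression standard are introduced, a DSP-based software and hardware design is proposed, and several key techniques used in the implementation are discussed in detail.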

20.
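A scheme for designing and implementing an infrared decoding IP core in an SoPC system is proposed, focusing on the data encoding and transmission mechanism of the infrared system, the HDL design of the infrared decoding circuit, the creation of the IP core, and its use in an SoPC system. The infrared transmitter and receiver chips are the TC9012 and DS338S respectively, and the IP core is tested on a DE2 development board. The results show that the infrared decoding IP can be added to an SoPC system without difficulty and performs fast, stable, and correct infrared decoding, meeting the design goals.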
