1.
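Implementation of the DCT for H.263 coding on a fixed-point DSP  Total citations: 3, self-citations: 0, citations by others: 3
The paper briefly introduces the H.263 coding standard and the discrete cosine transform (DCT), the orthogonal transform coding that H.263 uses. It focuses on converting the DCT algorithm to fixed point and, based on the characteristics of the TMS320C6201 DSP, improves the IDCT algorithm. Finally, a fast DCT algorithm is implemented in DSP assembly language.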
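As a rough illustration of what converting a DCT to fixed point involves, the sketch below quantizes the 8-point DCT-II coefficients to Q15 integers and accumulates with integer arithmetic only. It is a direct (non-fast) form and is not the algorithm from the abstract above; the Q15 format, the names `COEF_Q15`, `dct8_fixed`, and `dct8_float`, and the orthonormal scaling are all assumptions made for this example.

```python
import math

N = 8   # transform length used by block-based video codecs
Q = 15  # assumed number of fractional bits for the coefficient table (Q15)

def _coef(k, n):
    """Orthonormal DCT-II basis value for output index k and input index n."""
    scale = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
    return scale * math.cos((2 * n + 1) * k * math.pi / (2 * N))

# Coefficients rounded to Q15; the largest magnitude is 0.5, so each fits in 16 bits.
COEF_Q15 = [[int(round(_coef(k, n) * (1 << Q))) for n in range(N)]
            for k in range(N)]

def dct8_float(x):
    """Floating-point reference: direct 8-point DCT-II."""
    return [sum(_coef(k, n) * x[n] for n in range(N)) for k in range(N)]

def dct8_fixed(x):
    """Integer-only DCT-II: multiply-accumulate in Q15, then round and shift back."""
    half = 1 << (Q - 1)  # rounding offset added before the right shift
    return [(sum(COEF_Q15[k][n] * x[n] for n in range(N)) + half) >> Q
            for k in range(N)]
```

For 8-bit pixel inputs the accumulator stays well within 32 bits, and the integer result differs from the floating-point reference by less than one unit, most of which is the final rounding.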
2.
3.
4.
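This article introduces the principles of the ITU-T H.264 coding algorithm and the TM1300 fixed-point DSP chip. Based on the chip's hardware architecture, it designs a feasible scheme for a real-time video capture, video encoding, and video output system running on the TM1300. It discusses the key techniques and difficulties of a fixed-point implementation of a real-time H.264 video encoder on the TM1300, and describes in detail the code optimization techniques applied to the H.264 coding algorithm.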
5.
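DSP implementation of a real-time H.263+ video encoder  Total citations: 3, self-citations: 0, citations by others: 3
An H.263+ video encoder was implemented quickly on the development platform of the TM-1300 multimedia processor. First, in light of the requirements of the H.263+ coding algorithm, the paper describes the features that make the TM-1300 suitable for video communication development, along with its development environment. The coding algorithm is then ported to the TM-1300 platform and heavily optimized, with particular attention to the optimization of displacement (motion) estimation. Experimental results show that, with the optimizations presented, an H.263+ encoder on the TM-1300 runs fast enough to meet the requirements of real-time video encoding and decoding, and it has been deployed in actual video communication products.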
6.
7.
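This paper describes a codec system that implements the low-bit-rate video compression coding standard H.263 on the TMS320C6201. It studies algorithms for implementing H.263 and, in view of the DSP's architecture, proposes several improvements to meet the requirements of real-time image encoding and decoding and of transmission over error-prone channels. The hardware design and the software optimization methods are also presented.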
8.
9.
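This article implements an optimized MPEG-4 pixel compression module on the TMS320C6204 fixed-point DSP chip. It focuses on the implementation of a fast DCT/IDCT algorithm on the DSP and applies software optimizations to the most time-consuming parts, the DCT/IDCT and quantization/inverse quantization, which effectively reduces the module's total clock cycle count. Experimental results show that both the algorithm and the optimizations perform well.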
10.
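The DCT in H.263 video compression coding is implemented with a biorthogonal DCT built from lifting steps that use only shift and add operations. The biorthogonal DCT provides a fast and efficient integer-to-integer transform; on the C6201 fixed-point DSP, its coding speed is considerably higher than that of the DCT.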
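The lifting idea mentioned in the abstract above can be sketched on a single plane rotation, the building block of fast DCT factorizations. A rotation by θ factors into three lifting steps with coefficients tan(θ/2), sin θ, tan(θ/2); replacing those with dyadic fractions makes each step a shift and an add, and rounding inside a step does not break invertibility. The angle π/8, the approximations 3/16 and 3/8, and the function names are assumptions chosen for illustration, not the coefficients used in the cited work.

```python
def rot_forward(x, y):
    """Approximate rotation by pi/8 using three shift-and-add lifting steps."""
    x -= ((y << 1) + y + 8) >> 4   # x -= round(3/16 * y); 3/16 approximates tan(pi/16)
    y += ((x << 1) + x + 4) >> 3   # y += round(3/8 * x);  3/8 approximates sin(pi/8)
    x -= ((y << 1) + y + 8) >> 4   # same lifting step as the first one
    return x, y

def rot_inverse(x, y):
    """Exact inverse: undo the same lifting steps in reverse order."""
    x += ((y << 1) + y + 8) >> 4
    y -= ((x << 1) + x + 4) >> 3
    x += ((y << 1) + y + 8) >> 4
    return x, y
```

Each step changes one variable by a function of the other, so the inverse recomputes the same quantity and subtracts it. That is why the mapping is integer-to-integer and perfectly reversible even though the coefficients are only approximations.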
11.
12.
Galli R. Tenca A.F. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2004,12(1):52-66
The use of online arithmetic has often been proposed for hardware implementations of complex digital signal processing (DSP) algorithms. However, several important issues in the design process of such algorithms using online arithmetic are rarely discussed in the literature. This paper presents these issues and provides a methodology to analyze the behavior of networks of online arithmetic modules performing serial computation over fixed-point numbers. The methodology is presented, applied in several examples, and finally used to design an efficient field-programmable gate array implementation of the Levinson-Durbin algorithm in an application of Yule-Walker power spectrum estimation. The methodology can be applied to other algorithms as well, and it simplifies the task of designing and verifying a network of online modules. The experimental results show the advantages of online arithmetic in the design of complex DSP algorithms.
13.
14.
The fast QR-decomposition based recursive least-squares (FQRD-RLS) algorithms offer RLS-like convergence and misadjustment at a lower computational cost, and therefore are desirable for implementation on a fixed-point digital signal processor (DSP). Furthermore, the FQRD-RLS algorithms are derived from QR-decomposition based RLS algorithms that are well known for their numerical stability in finite precision. Hence, these algorithms are often assumed numerically stable, although there is no rigorous analysis addressing the stability of such algorithms in finite precision. In this paper, we derive the conditions that guarantee stability of the FQRD-RLS algorithms, and also derive mathematical expressions for the mean-squared quantization error (MSQE) of internal variables of the FQRD-RLS algorithms at steady state. The objective is to quantify the propagation error due to quantization effects. The derived MSQE expressions have been verified by comparisons with fixed-point computer simulations.
15.
16.
A 32-b RISC/DSP microprocessor with reduced complexity  Total citations: 2, self-citations: 0, citations by others: 2
Dolle M. Jhand S. Lehner W. Muller O. Schlett M. 《Solid-State Circuits, IEEE Journal of》1997,32(7):1056-1066
This paper presents a new 32-b reduced instruction set computer/digital signal processor (RISC/DSP) architecture which can be used as a general purpose microprocessor and in parallel as a 16-/32-b fixed-point DSP. This has been achieved by using RISC design principles for the implementation of DSP functionality. A DSP unit operates in parallel to an arithmetic logic unit (ALU)/barrel shifter on the same register set. This architecture provides the fast loop processing, high data throughput, and deterministic program flow absolutely necessary in DSP applications. Besides offering a basis for general purpose and DSP processing, the RISC philosophy offers a higher degree of flexibility for the implementation of DSP algorithms and achieves higher clock frequencies compared to conventional DSP architectures. The integrated DSP unit provides instruction set support for highly specialized DSP algorithms. Subword processing optimized for DSP algorithms has been implemented to provide maximum performance for 16-b data types. While creating a unified base for both application areas, we also minimized transistor count and reduced complexity by using a short instruction pipeline. A parallelism concept based on a varying number of instruction latency cycles made superscalar instruction execution superfluous.
17.
18.
19.
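A statistical-analysis-based floating-point to fixed-point conversion method for SoC chips  Total citations: 2, self-citations: 0, citations by others: 2
Digital signal processing applications such as communications, speech, and image processing generally use floating-point algorithms. To reduce hardware cost and power consumption, implementing floating-point algorithms on a fixed-point hardware architecture has become an effective solution. In fixed-point SoC (System on Chip) designs, fixed-point approximation algorithms and hardware acceleration are commonly used to convert and optimize floating-point DSP algorithms so as to balance performance, cost, and power. Floating-point algorithms must therefore be converted into fixed-point approximations under constraints on manufacturing cost, power, and performance. This paper proposes a floating-point to fixed-point conversion method for fixed-point SoC chips. First, hardware acceleration module parameters and conversion parameters are introduced to carry out the conversion from the floating-point algorithm to a fixed-point algorithm. Then, using the proposed method of evaluating the fixed-point approximation by its signal-to-noise ratio (SNR), the optimal hardware acceleration module parameters and conversion parameters are computed subject to a given SNR constraint, yielding the optimal hardware-accelerated fixed-point algorithm. Building on this method, the paper further studies a prototype development strategy for single-core SoC chips with built-in hardware acceleration modules.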
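The SNR-constrained selection described in the abstract above can be illustrated with a much simpler stand-in: quantize a signal to a signed fixed-point format and pick the smallest number of fractional bits that still meets a target signal-to-quantization-noise ratio. The abstract does not specify its metric or its hardware acceleration parameters, so the 16-bit word length, the round-and-saturate rule, and the names `to_fixed`, `sqnr_db`, and `min_frac_bits` are all assumptions made for this sketch.

```python
import math

def to_fixed(x, frac_bits, word_bits=16):
    """Quantize x to a signed fixed-point integer, rounding to nearest and saturating."""
    lo = -(1 << (word_bits - 1))
    hi = (1 << (word_bits - 1)) - 1
    q = math.floor(x * (1 << frac_bits) + 0.5)
    return max(lo, min(hi, q))

def from_fixed(q, frac_bits):
    """Convert a fixed-point integer back to a float."""
    return q / (1 << frac_bits)

def sqnr_db(signal, frac_bits, word_bits=16):
    """Signal-to-quantization-noise ratio in dB for the given format."""
    sig_pow = sum(v * v for v in signal)
    err_pow = sum(
        (v - from_fixed(to_fixed(v, frac_bits, word_bits), frac_bits)) ** 2
        for v in signal
    )
    if err_pow == 0.0:
        return math.inf  # every sample is exactly representable
    return 10.0 * math.log10(sig_pow / err_pow)

def min_frac_bits(signal, target_db, word_bits=16):
    """Smallest fractional bit count meeting target_db, or None if none does."""
    for frac_bits in range(1, word_bits):
        if sqnr_db(signal, frac_bits, word_bits) >= target_db:
            return frac_bits
    return None
```

A call such as `min_frac_bits(signal, 60.0)` on a test tone returns the cheapest format that satisfies the constraint. A fuller search would also vary the word length and trade off saturation against rounding noise.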