1.
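Stream Machine (美国视讯科技有限公司) 《今日电子》2001,(7):30-32
1. Overview: The low-cost, high-performance, single-chip MPEG-2 audio/video codec from Stream Machine consists of a RISC (reduced instruction set computing) core, a 24-bit DSP (digital signal processor), video and audio interface units, and several dedicated processing units. Its programmable video interface unit provides strong support for multi-mode pre-processing and post-processing and for OSD (on-screen display). The CODEC (multimedia digital signal coder/decoder) is built with a standard cell library in a 0.18 μm CMOS process.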
2.
3.
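In practical high-performance fixed-point digital signal processor (DSP) design, a functionally complex multiply-accumulator (MAC) is often required. The MAC must not only handle multiply-add and multiply-subtract operations on both signed and unsigned numbers, but also support integer and fractional multiply-add, unbiased rounding, saturation, and related functions. In addition, to resolve data dependencies in the DSP, the MAC is usually required to complete all of these operations in a single cycle, which makes a high-speed, low-cost implementation hard to find. The paper first sets out the functional requirements of a MAC in a typical high-performance fixed-point DSP, then proposes and implements a 16-bit high-performance MAC that integrates all of the above functions and completes them in a single cycle, using only three levels of 4:2 compression and one carry-lookahead addition. The MAC is implemented in a 0.35 μm process, has been embedded in a digital signal processor, and has been applied successfully in real engineering projects.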
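The functions listed in this abstract (multiply-add and multiply-subtract, integer and fractional modes, unbiased rounding, saturation) can be illustrated with a small behavioural model. This is only a sketch: it does not reproduce the paper's 4:2 compressor hardware, and the function names, the Q15 fractional format, and the 32-bit accumulator width are assumptions made for illustration.

```python
def saturate(value, bits):
    """Clamp a signed integer to the range of a two's-complement word."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(lo, min(hi, value))


def mac(acc, a, b, fractional=False, subtract=False, acc_bits=32):
    """One multiply-accumulate step on signed 16-bit operands.

    With fractional=True the operands are treated as Q15, so the product
    is shifted left by one bit to align it as Q31 in the accumulator.
    The accumulator width (acc_bits) is an assumed parameter.
    """
    product = a * b
    if fractional:
        product <<= 1
    result = acc - product if subtract else acc + product
    return saturate(result, acc_bits)


def round_to_16(acc):
    """Round a 32-bit accumulator to its upper 16 bits, ties to even."""
    upper, lower = acc >> 16, acc & 0xFFFF
    # Round up above the halfway point, or exactly at it when upper is odd.
    if lower > 0x8000 or (lower == 0x8000 and (upper & 1)):
        upper += 1
    return saturate(upper, 16)
```

For example, `mac(0, 0x4000, 0x4000, fractional=True)` multiplies 0.5 by 0.5 in Q15 and leaves 0.25 in Q31 (`0x20000000`) in the accumulator, while the product of -1.0 and -1.0 overflows and is clamped to `0x7FFFFFFF`.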
4.
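As digital signal processing technology develops, the main direction is to raise computational speed further by developing and applying high-speed DSP chips. The paper describes the technical details of implementing a 1024-point radix-2 DIT FFT with a 24-bit fixed-point word length on the DSP56001 digital signal processor using a bit-reversal algorithm. Measurements show that the transform completes within 5.6 ms, a notable speed among comparable DSP processors.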
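The combination of bit reversal and radix-2 decimation in time (DIT) named in this abstract can be sketched as follows. This is a generic textbook formulation in floating-point Python, not the paper's 24-bit fixed-point DSP56001 code; the function names are hypothetical.

```python
import cmath


def bit_reverse_indices(n):
    """Return the bit-reversed permutation of range(n), n a power of two."""
    bits = n.bit_length() - 1
    return [int(format(i, f"0{bits}b")[::-1], 2) for i in range(n)]


def fft_dit(x):
    """Iterative radix-2 DIT FFT: reorder the input, then run butterflies."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    # Bit-reversed input order lets the butterflies work in place.
    a = [complex(x[j]) for j in bit_reverse_indices(n)]
    size = 2
    while size <= n:
        half = size // 2
        w_step = cmath.exp(-2j * cmath.pi / size)  # twiddle increment
        for start in range(0, n, size):
            w = 1 + 0j
            for k in range(half):
                u = a[start + k]
                t = w * a[start + k + half]
                a[start + k] = u + t
                a[start + k + half] = u - t
                w *= w_step
        size *= 2
    return a
```

A 1024-point transform, the size discussed in the abstract, is obtained by passing a list of 1024 samples to `fft_dit`; the permutation for that size is `bit_reverse_indices(1024)`.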
5.
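The 4th "Motorola" Cup Embedded Processor Design and Application Contest has concluded successfully (see the December 2002 issue A for details). In addition to the MCUs (microcontrollers) specified in the previous three contests, the range of CPUs for this contest newly included Motorola's 16-bit DSP56800 family of digital signal processors (DSPs), making the contest more contemporary, broader, and more challenging. DSP56F805 features: in the award-winning embedded system design projects of the DSP group, the processors were all one of two chips from Motorola's 16-bit DSP56800 family, the DSP56F805 and the DSP56F826. Motorola's DSP56800 family integrates a DSP with an MCU. The parallel instruction set of these chips controls three execution units in a three-stage pipeline: data AL…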
6.
7.
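Improving arithmetic precision in fixed-point DSPs  Cited by: 7 (self-citations: 0, citations by others: 7)
Using practical examples and the corresponding TMS320C54x assembly programs, the paper analyzes how to improve arithmetic precision in fixed-point digital signal processors (DSPs) and introduces the data formats used in fixed-point DSPs.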
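To make the idea of a fixed-point data format concrete, here is a minimal sketch of the common Q15 format and of one precision technique, rounding a product instead of truncating it. It is not taken from the paper, whose examples are in TMS320C54x assembly; the helper names are hypothetical and Q15 is assumed as the format.

```python
Q15_ONE = 1 << 15  # scale factor: 1.0 corresponds to 2**15


def float_to_q15(x):
    """Quantize a real number to a signed 16-bit Q15 integer, with clamping."""
    return max(-32768, min(32767, int(round(x * Q15_ONE))))


def q15_to_float(q):
    """Convert a Q15 integer back to a real number."""
    return q / Q15_ONE


def q15_mul_truncate(a, b):
    """Q15 x Q15 product, discarding the low 15 bits (truncation)."""
    return (a * b) >> 15


def q15_mul_round(a, b):
    """Same product with half an LSB added before the shift (rounding).

    Overflow for (-1.0) * (-1.0) is not handled in this sketch.
    """
    return (a * b + (1 << 14)) >> 15
```

With `a = 1` (one LSB) and `b = float_to_q15(0.75)`, truncation returns 0 while rounding returns 1, which shows how a rounding step reduces the error of small products.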
8.
9.
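The paper designs a seismic wave data acquisition instrument based on a digital signal processor (DSP). The instrument uses the TMS320VC5410A as its system processor, which is responsible for the mathematical analysis of the seismic signals. The paper briefly introduces the advantages of the VectorSeis digital three-component geophone, gives the implementation of the main DSP peripheral circuits (including the reset circuit, power supply module, and memory control circuit), and concentrates on USB-based communication between the DSP and the PC, specifically its soft...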
10.
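Low-cost, high-performance, high-density DC-AC inverters are key components of uninterruptible power supplies, fuel cells, and solar and wind power generation systems. Using a DSP (digital signal processor) can effectively reduce inverter cost. Powerful 16-bit fixed-point DSPs …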
11.
This paper presents an Application-Specific Signal Processor (ASSP) for Orthogonal Frequency Division Multiplexing (OFDM)
Communication Systems, called SPOCS. The instruction set and its architecture are specially designed for OFDM operations, such
as Fast Fourier Transform (FFT), scrambling/descrambling, puncturing, convolutional encoding, interleaving/deinterleaving,
etc. SPOCS employs the optimized Data Processing Unit (DPU) to support the proposed instructions and the FFT Address Generation
Unit (FAGU) to automatically calculate input/output data addresses. In addition, the proposed Bit Manipulation Unit (BMU)
supports efficient bit manipulation operations. SPOCS has been synthesized using the SEC 0.18 μm standard cell library and
has a much smaller area than commercial DSP chips. Compared with existing DSP chips, SPOCS can reduce the number of clock cycles
by 8%~53% for FFT and by about 48%~84% for scrambling, convolutional encoding, and interleaving. SPOCS can support various
OFDM communication standards, such as Wireless Local Area Network (WLAN), Digital Audio Broadcasting (DAB), Digital Video
Broadcasting-Terrestrial (DVB-T), etc.
Myung H. Sunwoo
12.
Galli R. Tenca A.F. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2004,12(1):52-66
The use of online arithmetic has often been proposed for hardware implementations of complex digital signal processing (DSP) algorithms. However, several important issues in the design process of such algorithms using online arithmetic are rarely discussed in the literature. This paper presents these issues and provides a methodology to analyze the behavior of networks of online arithmetic modules performing serial computation over fixed-point numbers. The methodology is presented, applied in several examples, and finally used to design an efficient field-programmable gate array implementation of the Levinson-Durbin algorithm in an application of the Yule-Walker power spectrum estimation. The methodology can be applied to other algorithms as well, and it simplifies the task of designing and verifying a network of online modules. The experimental results show the advantages of online arithmetic in the design of complex DSP algorithms.
13.
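The paper proposes a hard-core structure for a floating-point digital signal processor (DSP) that supports floating-point operations well while remaining compatible with fixed-point operations. At present, the mainstream field-programmable gate array (FPGA) vendors implement floating-point operations as soft cores, mapping the floating-point algorithms onto the chip and realizing them with logic resources and DSP blocks. Compared with this traditional approach, the proposed hard-core structure completes floating-point operations using DSP blocks alone, without occupying other logic resources in the FPGA. The design takes full account of load and delay effects and inserts a multi-stage pipeline, which significantly improves floating-point computational efficiency. The proposed floating-point DSP hard core was designed and completed in the 28 nm process of Semiconductor Manufacturing International (中芯国际, MCI). Simulation results show that the proposed hard core achieves 0.4 GFLOPS for a single floating-point addition or multiplication.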
14.
Nomura M. Yamashina M. Goto J. Inoue T. Suzuki K. Motomura M. Koseki Y. Shih B.S. Horiuchi T. Hamatake N. Kumagai K. Enomoto T. Yamada H. 《Solid-State Circuits, IEEE Journal of》1994,29(3):290-297
A 300-MHz 16-b fixed-point digital signal processor (DSP) core LSI has been developed for video signal processing. In order to achieve high performance, the DSP core LSI employs a parallel processing architecture, 300-MHz redundant binary arithmetic units, and a sophisticated high-performance electrical design. The DSP core LSI, which was fabricated with 0.5-μm BiCMOS and triple-level-metallization technology, has a 3.9 mm×4.6 mm area, and contains about 57K transistors. It consumes 2 W at a 300-MHz clock frequency with a 3.3-V power supply. Measured clock skew and critical path delay are less than 80 ps and 2.6 ns, respectively.
15.
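Based on a study of the architecture and design methodology of 32-bit fixed-point DSPs, the article describes the operating principles of the key logic blocks of the CPU in a 32-bit fixed-point DSP, including the ALU, MPY, ARAU, pipeline, instruction set, and bus interface, and analyzes the design approach and implementation of each block. Using a standard-cell-based forward design method, a fixed-point DSP circuit with a 32-bit instruction set was designed. The circuit adopts a Harvard bus architecture and can perform a 16×16-bit signed integer multiplication, a 32-bit accumulation, and arithmetic and logic operations on 32-bit data in a single cycle, giving high processing precision. The circuit was fabricated in a 0.5 μm 1P3M CMOS process, with an integration level of 70,000 gates, an operating frequency of up to 36 MHz, and a dynamic power consumption of 594 mW.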
16.
A 32-b RISC/DSP microprocessor with reduced complexity  Cited by: 2 (self-citations: 0, citations by others: 2)
Dolle M. Jhand S. Lehner W. Muller O. Schlett M. 《Solid-State Circuits, IEEE Journal of》1997,32(7):1056-1066
This paper presents a new 32-b reduced instruction set computer/digital signal processor (RISC/DSP) architecture which can be used as a general purpose microprocessor and in parallel as a 16-/32-b fixed-point DSP. This has been achieved by using RISC design principles for the implementation of DSP functionality. A DSP unit operates in parallel to an arithmetic logic unit (ALU)/barrel shifter on the same register set. This architecture provides the fast loop processing, high data throughput, and deterministic program flow absolutely necessary in DSP applications. Besides offering a basis for general purpose and DSP processing, the RISC philosophy offers a higher degree of flexibility for the implementation of DSP algorithms and achieves higher clock frequencies compared to conventional DSP architectures. The integrated DSP unit provides instruction set support for highly specialized DSP algorithms. Subword processing optimized for DSP algorithms has been implemented to provide maximum performance for 16-b data types. While creating a unified base for both application areas, we also minimized transistor count and we reduced complexity by using a short instruction pipeline. A parallelism concept based on a varying number of instruction latency cycles made superscalar instruction execution superfluous.
17.
18.
The computation of square roots is required in signal processing applications, such as adaptive filtering using transversal filters or lattice filters, spectral estimation, and many other fields of engineering sciences. Actually, all the existing digital signal processors (DSP) have a multiplier-accumulator. We present a simple binary algorithm for square-rooting using a processor with multiplier. Only shifts, additions, and multiplications are used and unlike the Newton-Raphson approach, divisions are not necessary. The method can also be interesting for the computation of divisions. The algorithm has been implemented in 16-bit fixed-point arithmetic on a TMS32010 DSP processor. The computational requirements are compared with the Newton-Raphson method. The fixed-point code of the algorithm written in TMS32010 Assembly language is also given.
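A division-free square root built only from shifts, additions, and multiplications can be sketched with the well-known bit-by-bit successive approximation below. Whether this matches the paper's exact algorithm is an assumption; the function names and the Q15 wrapper are hypothetical and are not the TMS32010 code the abstract mentions.

```python
def isqrt_binary(x, result_bits=16):
    """Return floor(sqrt(x)) for 0 <= x < 2**(2 * result_bits).

    Each result bit is tried from the most significant down; a bit is kept
    when the square of the trial value (one multiply) does not exceed x.
    """
    if x < 0:
        raise ValueError("x must be non-negative")
    root = 0
    bit = 1 << (result_bits - 1)
    while bit:
        trial = root + bit
        if trial * trial <= x:
            root = trial
        bit >>= 1
    return root


def q15_sqrt(x):
    """Square root of a non-negative Q15 value, result also in Q15."""
    # sqrt(x / 2**15) * 2**15 == sqrt(x * 2**15)
    return isqrt_binary(x << 15)
```

For instance, `q15_sqrt(8192)` takes the square root of 0.25 and returns 16384, the Q15 encoding of 0.5. The loop runs a fixed 16 iterations, one per result bit.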
19.
This article describes some of our recent work in the development of computer architectures for efficient execution of artificial neural network algorithms. Our earlier system, the Ring Array Processor (RAP), was a multiprocessor based on commercial DSPs with a low-latency ring interconnection scheme. We have used the RAP to simulate variable precision arithmetic to guide us in the design of arithmetic units for high performance neurocomputers to be implemented with custom VLSI. The RAP system played a critical role in this study, enabling us to experiment with much larger networks than would otherwise be possible. Our study shows that back-propagation training algorithms only require moderate precision. Specifically, 16b weight values and 8b output values are sufficient to achieve training and classification results comparable to 32b floating point. Although these results were gathered for frame classification in continuous speech, we expect that they will extend to many other connectionist calculations. We have used these results as part of the design of a programmable single chip microprocessor, SPERT. The reduced precision arithmetic permits the use of multiple arithmetic units per processor. Also, reduced precision operands make more efficient use of valuable processor-memory bandwidth. For our moderate-precision fixed-point arithmetic applications, SPERT represents more than an order of magnitude reduction in cost over systems with equivalent performance that use commercial DSP chips.