1.

A Compact DSP Core with Static Floating-Point Arithmetic

Tay-Jyi Lin Hung-Yueh Lin Chie-Min Chao Chih-Wei Liu Chein-Wei Jen 《The Journal of VLSI Signal Processing》2006,42(2):127-138

A multimedia system-on-a-chip (SoC) usually contains one or more programmable digital signal processors (DSPs) to accelerate data-intensive computations. Most of these DSP cores, however, were originally designed for standalone applications, so some of their components overlap with, and are redundant to, those of the host microprocessor. This paper presents a compact DSP for multi-core systems that is fully programmable and has been optimized to execute a set of signal processing kernels very efficiently. The DSP core was designed concurrently with its automatic software generator, which is based on high-level synthesis. Moreover, it performs a lightweight arithmetic, static floating-point (SFP), which approximates the quality of floating-point (FP) operations with hardware similar to that of integer arithmetic. In our simulations, the compact DSP and its auto-generated software achieve 3X the performance (estimated in cycles) of the DSP cores in dual-core baseband processors with similar computing resources. In addition, the 16-bit SFP has a signal to round-off noise ratio above 40 dB relative to IEEE single-precision FP, and it even outperforms hand-optimized programs based on 32-bit integer arithmetic. The 24-bit SFP has above 64 dB quality, and its maximum precision is identical to that of single-precision FP. Finally, the DSP core has been implemented and fabricated in the UMC 0.18 μm 1P6M CMOS technology. It can operate at 314.5 MHz while consuming 52 mW average power. The core size is only 1.5 mm × 1.5 mm, including the 16 KB on-chip memory and the AMBA AHB interface.

This work was supported by the National Science Council, Taiwan, under Grant NSC93-2220-E-009-017. The authors would also like to thank the National Chip Implementation Center (CIC) for chip fabrication.

Tay-Jyi Lin received the BS degree in electrical and control engineering from National Chiao Tung University, Taiwan, in 1998. He is working toward the PhD degree in the Department of Electronics Engineering and the Institute of Electronics, National Chiao Tung University. His current research includes heterogeneous computing platforms for embedded multimedia systems, complexity-aware architecture design, and high-performance/low-power digital signal processors.

Hung-Yueh Lin received the BS and MS degrees in electronics engineering from National Chiao Tung University, Taiwan, in 2002 and 2004, respectively. He is now with MediaTek, Inc., Hsinchu, Taiwan. His research interests include lightweight computer arithmetic and DSP architecture.

Chie-Min Chao received the BS degree in electronics engineering from National Chiao Tung University, Taiwan, in 2003, where he is currently pursuing his MS degree. His research includes system software development, VLSI system design, and DSP architecture.

Chih-Wei Liu received the BS and PhD degrees in electrical engineering from National Tsing Hua University, Taiwan, in 1991 and 1999, respectively. From 1999 to 2000, he was an integrated circuit design engineer at the Electronics Research and Service Organization (ERSO) of the Industrial Technology Research Institute (ITRI), Taiwan. Near the end of 2000, he joined the SoC Technology Center (STC) of ITRI as a project leader, and he left ITRI at the end of October 2003. He is currently an assistant professor with the Department of Electronics Engineering and the Institute of Electronics, National Chiao Tung University, Taiwan. His current research interests include SoC and VLSI system design, processor architecture, digital signal processing, digital communications, and coding theory.

Chein-Wei Jen received the BS degree from National Chiao Tung University, Taiwan, in 1970, the MS degree from Stanford University in 1977, and the PhD degree from National Chiao Tung University in 1983. From 1981 to 2004, he was with the Department of Electronics Engineering and the Institute of Electronics at National Chiao Tung University. Dr Jen was given the Outstanding Electrical Engineering Professor Award by the Chinese Institute of Electrical Engineering in 2002. He is currently the General Director of the SoC Technology Center at the Industrial Technology Research Institute, the Adviser of the National SoC Program, and the Managing Director of the Board of the Taiwan IC Design Society. His research interests include SoC design, VLSI architectures, multimedia processing, and design automation. He holds seven patents and has published over 50 journal and 100 conference papers in these areas.
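
The signal to round-off noise figures quoted in the abstract of entry 1 compare a reduced-precision result against a single-precision reference. The following is a minimal sketch of how such a figure is commonly computed, assuming the usual definition (reference signal energy divided by error energy, expressed in dB). The function name and sample values are hypothetical, and nothing here reproduces the authors' SFP arithmetic.

```python
import math

def round_off_snr_db(reference, approx):
    """Signal to round-off noise ratio in dB of `approx` relative to `reference`."""
    signal = sum(r * r for r in reference)
    noise = sum((r - a) ** 2 for r, a in zip(reference, approx))
    # A bit-exact result has no round-off noise at all.
    return float("inf") if noise == 0 else 10 * math.log10(signal / noise)
```

Under this definition, an error whose amplitude is one tenth of the signal amplitude corresponds to 20 dB.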

2.

Pipelined Scheduling of Functional HW/SW Modules for Platform‐Based SoC Design

Wonjong Kim June‐Young Chang Hanjin Cho 《ETRI Journal》2005,27(5):533-538

We developed a pipelined scheduling technique for functional hardware and software modules in platform‐based system‐on‐a‐chip (SoC) designs. It is based on a modified list scheduling algorithm. We used the pipelined scheduling technique for a performance analysis of an MPEG‐4 video encoder application. Then, we applied it to architecture exploration to achieve better performance. In our experiments, the modified SoC platform with 6 pipelines for the 32‐bit dual‐layer architecture shows a 118% improvement in performance compared to the given basic SoC platform with 4 pipelines for the 16‐bit single‐layer architecture.
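
The abstract above builds on list scheduling. As a rough illustration, here is a sketch of plain (unmodified) list scheduling on identical processing units, with critical-path length as the priority. The authors' modification and their pipelining step are not described in the abstract and are not reproduced; the function name, task names, and data layout are all assumptions.

```python
from collections import defaultdict

def list_schedule(durations, deps, num_units):
    """Greedy list scheduling. durations: {task: cycles};
    deps: {task: set of predecessors}. Returns {task: (start, unit)}."""
    succs = defaultdict(set)
    for task, preds in deps.items():
        for p in preds:
            succs[p].add(task)

    rank = {}  # longest path from a task to any sink, used as priority

    def path_len(task):
        if task not in rank:
            tail = max((path_len(s) for s in succs[task]), default=0)
            rank[task] = durations[task] + tail
        return rank[task]

    for task in durations:
        path_len(task)

    finish, schedule = {}, {}
    unit_free = [0] * num_units
    remaining = set(durations)
    while remaining:
        # A task is ready once all of its predecessors have been placed.
        ready = [t for t in remaining
                 if all(p in finish for p in deps.get(t, ()))]
        task = min(ready, key=lambda t: (-rank[t], t))
        unit = min(range(num_units), key=lambda u: unit_free[u])
        data_ready = max((finish[p] for p in deps.get(task, ())), default=0)
        start = max(unit_free[unit], data_ready)
        schedule[task] = (start, unit)
        finish[task] = start + durations[task]
        unit_free[unit] = finish[task]
        remaining.remove(task)
    return schedule
```

For example, `list_schedule({"a": 2, "b": 3, "c": 1, "d": 2}, {"c": {"a"}, "d": {"a", "b"}}, 2)` places `a` and `b` at cycle 0 on separate units and starts `d` only after `b` finishes.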

3.

Performance Analysis for MPEG‐4 Video Codec Based on On‐Chip Network

June‐Young Chang Won‐Jong Kim Young‐Hwan Bae Jin Ho Han Han‐Jin Cho Hee‐Bum Jung 《ETRI Journal》2005,27(5):497-503

In this paper, we present a performance analysis for an MPEG‐4 video codec based on the on‐chip network communication architecture. The existing on‐chip buses of system‐on‐a‐chip (SoC) designs limit data traffic bandwidth because a large number of silicon IPs share the bus. An on‐chip network, in which the concept of a computer network is applied to the communication architecture of the SoC, is introduced to solve this problem. We compared the performance of the MPEG‐4 video codec based on the on‐chip network with that based on the Advanced Micro‐controller Bus Architecture (AMBA) on‐chip bus. Experimental results show that the performance of the MPEG‐4 video codec based on the on‐chip network improves by more than 50% compared to the design based on a multi‐layer AMBA bus.

4.

Reconfigurable Filter Coprocessor Architecture for DSP Applications (total citations: 1; self-citations: 0; citations by others: 1)

S. Ramanathan S.K. Nandy V. Visvanathan 《The Journal of VLSI Signal Processing》2000,26(3):333-359

Digital Signal Processing (DSP) is widely used in high-performance media processing and communication systems. In the majority of these applications, critical DSP functions are realized as embedded cores to meet the low-power budget and high computational complexity. Usually these cores are ASICs that cannot be easily retargeted for other similar applications that share certain commonalities. This lengthens the design cycle, which affects time-to-market constraints. In this paper, we present a reconfigurable high-performance low-power filter coprocessor architecture for DSP applications. The coprocessor architecture, apart from having the performance and power advantage of its ASIC counterpart, can be reconfigured to support a wide variety of filtering computations. Since filtering computations abound in DSP applications, the implementation of this coprocessor architecture can serve as an important embedded hardware IP.

5.

Simulation Analysis of the Realizability of IIR Digital Filters on Fixed-Point DSPs

刘舒帆 张小虹 任姝婕 《现代电子技术》2008,31(19)

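In engineering practice, when a fixed-point DSP chip is used to implement an IIR digital filter to a given design specification, raising the sampling frequency can cause an IIR filter that is realizable in MATLAB-aided design to change its characteristics on the TMS320C54x, or even to become unusable. This paper presents a fairly in-depth simulation analysis of the problem, identifies the constraints on filter coefficient values when an IIR digital filter is implemented on a fixed-point DSP, and explores basic approaches to solving the problem.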
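
One well-known mechanism behind the effect described in the abstract above is coefficient quantization: as the sampling frequency rises relative to the band of interest, the poles crowd toward z = 1 and become more sensitive to rounding. The sketch below illustrates this for a single second-order section. It is not the paper's analysis; the function names, the pole radius of 0.99, and the choice to store a1/2 (so the value fits a fractional format) are assumptions.

```python
import cmath
import math

def quantize(x, frac_bits=15):
    """Round x to a signed fixed-point grid with `frac_bits` fractional bits."""
    scale = 1 << frac_bits
    return round(x * scale) / scale

def section_poles(a1, a2):
    """Poles of H(z) = 1 / (1 + a1*z^-1 + a2*z^-2)."""
    disc = cmath.sqrt(a1 * a1 - 4 * a2)
    return (-a1 + disc) / 2, (-a1 - disc) / 2

def pole_after_quantization(f0, fs, r=0.99, frac_bits=15):
    """Return (radius, frequency in Hz) of the first pole after rounding."""
    theta = 2 * math.pi * f0 / fs
    a1, a2 = -2 * r * math.cos(theta), r * r
    # a1 can approach -2, outside [-1, 1), so a1/2 is what gets stored.
    a1_q = 2 * quantize(a1 / 2, frac_bits)
    a2_q = quantize(a2, frac_bits)
    pole, _ = section_poles(a1_q, a2_q)
    return abs(pole), abs(cmath.phase(pole)) * fs / (2 * math.pi)
```

In this sketch, a 1 kHz pole at an 8 kHz sampling rate survives 15-bit rounding almost unchanged, whereas a 100 Hz pole at 48 kHz with only 7 fractional bits ends up outside the unit circle, so the section is no longer stable.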

6.

A Hybrid Regression Model for VBR MPEG Video Sources Considering the Correlation Between Different Frame Types

李骄阳 惠晓实 徐海峰 刘贤德 《通信学报》1999,20(1):1-7

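MPEG, an international standard for moving-picture compression, is becoming ever more widely used. To transmit and control MPEG variable bit rate (VBR) data streams effectively over communication networks, the first key problem is how to build a statistical model of them. Existing video source models do not consider the correlation between frames of different types and therefore cannot model VBR MPEG video sources well. This paper proposes, for the first time, a statistical model of VBR MPEG video sources that considers the correlation between different frame types, the hybrid regression (CR) model, together with its parameter estimation algorithm. Experimental results show that, compared with the traditional autoregressive (AR) model, …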

7.

A Low-Power Heterogeneous Multiprocessor Architecture for Audio Signal Processing

Özgün Paker Jens Sparsø Niels Haandbæk Mogens Isager Lars Skovby Nielsen 《The Journal of VLSI Signal Processing》2004,37(1):95-110

This paper describes a low-power programmable DSP architecture that targets audio signal processing. The architecture can be characterized as a heterogeneous multiprocessor consisting of small instruction set processors called mini-cores as well as standard DSP and CPU cores that communicate using message passing. The mini-cores are tailored for different classes of filtering algorithms (FIR, IIR, N-LMS etc.), and in a typical system the communication among processors occurs at the sampling rate only. The mini-cores are intended as soft-macros to be used in the implementation of system-on-chip solutions using a synthesis-based design flow targeting a standard-cell implementation. They are parameterized in word-size, memory-size, etc. and can be instantiated according to the needs of the application. To give an impression of the size of a mini-core, one of the FIR mini-cores in a prototype design has 16 instructions, a 32-word × 16-bit program memory, a 64-word × 16-bit data memory and a 25-word × 16-bit coefficient memory. Results obtained from the design of a prototype chip containing mini-cores for a hearing aid application demonstrate a power consumption that is only 1.5–1.6 times that of a hardwired ASIC and 6–21 times lower than that of current state-of-the-art low-power DSP processors. This is due to: (1) the small size of the processors and (2) a smaller instruction count for a given task.

8.

IIR Filter Design Based on DSP Builder and FPGA

杨世华 王秀敏 陈豪威 《通信技术》2010,43(12):184-186

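To address the long design cycle of traditional digital filter design based on field-programmable gate arrays (FPGAs), a filter design approach based on DSP Builder and FPGA is proposed that fully realizes a top-down design flow. On this basis, a four-section cascaded IIR filter is designed and implemented. Drawing on MATLAB's powerful computing capability, a co-simulation method using MATLAB and Quartus II is also proposed; it turns complex output data into waveforms, which makes the simulation results easy to observe and enhances the simulation capability of Quartus. The results show that the designed IIR filter fully meets the design requirements.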

9.

Research on Inter-Core Communication Technology for DaVinci Heterogeneous Multi-Core Processors

国常义 李超群 刘峰 《电视技术》2015,39(7)

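For the DaVinci heterogeneous multi-core processors in common use today, this paper studies the communication and cooperation mechanisms among the ARM core, the DSP core, and the video coprocessor. Building on an analysis of the principles of inter-core communication in multi-core processors, it studies the inter-core communication technology of the TMS320DM816x series of DaVinci heterogeneous multi-core processors and describes in detail the on-chip inter-core interconnect structure and the implementation of the inter-core communication software. Finally, a multi-channel high-definition audio and video application system is designed on top of the SysLink low-level communication module to verify inter-core communication. The system makes full use of the performance of each processing core and achieves efficient cooperation among the cores.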

10.

Optical half-band filters (total citations: 4; self-citations: 0; citations by others: 4)

Jinguji K. Oguma M. 《Lightwave Technology, Journal of》2000,18(2):252-259

This paper proposes two novel kinds of 2×2 circuit configuration for finite-impulse response (FIR) half-band filters. These configurations can be transformed into each other by a symmetric transformation and their power transmittance is identical. The configurations have only about half the elements of conventional FIR lattice-form filters. We derive a design algorithm for achieving desired power transmittance spectra. We also describe 2×2 circuit configurations for infinite-impulse response (IIR) half-band filters. These configurations are designed to realize arbitrary-order IIR half-band filter characteristics by extending the conventional half-band circuit configuration used in millimeter-wave devices. We discuss their filter characteristics and confirm that they have a power half-band property. We demonstrate design examples including FIR maximally flat half-band filters, an FIR Chebyshev half-band filter, and an IIR elliptic half-band filter.

11.

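Design and Implementation of a DSP-Based Low-Bit-Rate Real-Time Video Encoder (total citations: 1; self-citations: 0; citations by others: 1)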

刘少华 熊志辉 包卫东 张茂军 《电子与信息学报》2008,30(4):945-948

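This paper designs and implements a low-bit-rate real-time video encoder that conforms to the MPEG standard, with TI's TMS320DM642 DSP as the core processor. Its main features are: a fast motion search method that combines center three-step search with diamond search is proposed and implemented; a new method for early detection of all-zero blocks is proposed and implemented; and the computation-intensive parts of the encoding process are specially optimized for the DSP architecture and instruction characteristics. Experiments show that the proposed fast motion search algorithm outperforms the center three-step search algorithm. The all-zero-block early detection mechanism effectively reduces the amount of computation and lowers the bit rate while preserving image quality.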

12.

A Platform‐Based SoC Design of a 32‐Bit Smart Card

Wonjong Kim Seungchul Kim Younghwan Bae Sungik Jun Youngsoo Park Hanjin Cho 《ETRI Journal》2003,25(6):510-516

In this paper, we describe the development of a platform‐based SoC of a 32‐bit smart card. The smart card uses a 32‐bit microprocessor for high performance and two cryptographic processors for high security. It supports both contact and contactless interfaces, which comply with ISO/IEC 7816 and 14443 Type B. It has a Java Card OS to support multiple applications. We modeled smart card readers with a foreign language interface for efficient verification of the smart card SoC. The SoC was implemented using 0.25 µm technology. To reduce the power consumption of the smart card SoC, we applied power optimization techniques, including clock gating. Experimental results show that the power consumption of the RSA and ECC cryptographic processors can be reduced by 32% and 62%, respectively, without increasing the area.

13.

Optimized Design and DSP Implementation of IIR Digital Filters (total citations: 3; self-citations: 0; citations by others: 3)

张晓光 徐钊 《电子工程师》2006,32(3):37-39

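The paper first describes how a direct form II IIR (infinite impulse response) digital filter can overcome the input-data overflow that arises when an IIR digital filter is implemented on a fixed-point DSP. It then uses MATLAB to generate the filter's input data and coefficients, applies the corresponding data compression, and generates simulation waveforms. Finally, it gives the complete program that implements the direct form II IIR digital filter in DSP language, together with the simulation results, which are analyzed and compared.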
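
For readers unfamiliar with the structure named in the abstract above, the following is a minimal floating-point sketch of one second-order section in direct form II. It is not the paper's DSP program and does no fixed-point scaling; the function name and the coefficient layout `b = (b0, b1, b2)`, `a = (1, a1, a2)` are assumptions.

```python
def direct_form_2(b, a, x):
    """Filter the sequence x through one biquad in direct form II."""
    b0, b1, b2 = b
    _, a1, a2 = a          # a[0] is assumed to be 1
    w1 = w2 = 0.0          # the two shared delay elements
    y = []
    for xn in x:
        # Recursive (pole) part first, then the feed-forward (zero) part.
        w0 = xn - a1 * w1 - a2 * w2
        y.append(b0 * w0 + b1 * w1 + b2 * w2)
        w2, w1 = w1, w0
    return y
```

The structure needs only two delay elements per section because the pole and zero parts share the same state `w`.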

14.

Parallel Architecture Core (PAC)—the First Multicore Application Processor SoC in Taiwan Part II: Application Programming

Jia-Ming Chen Chun-Nan Liu Jen-Kuei Yang Shau-Yin Tseng Wei-Kuan Shih An-Yeu Wu 《Journal of Signal Processing Systems》2011,62(3):383-402

Two representative multimedia applications, AAC and H.264/AVC decoders, on the parallel architecture core (PAC) SoC are introduced in the second part of the two introductory papers. The applications have been programmed on the PACDSP core and the PAC SoC to demonstrate the high-performance, low-power DSP computations and the effectiveness of the dynamic voltage and frequency scaling (DVFS) capability on the heterogeneous multicore SoC. First, techniques to exploit the data- and instruction-level parallelism in the application kernels are described for performance optimization on the clustered VLIW architecture of PACDSP with the distributed register organization. Next, two variants of the asymmetric programming model are introduced using the decoders as examples. Then, the energy efficiency of the programmable multimedia SoC is demonstrated using an innovative power-aware H.264/AVC decoder. Finally, a DVFS-aware framework for soft real-time video playback is provided by extending the power-aware decoding scheme. The work provides practical references for realizing multimedia applications on the PAC SoC, which is suitable for feature-rich, resource-constrained portable devices.

15.

FPGA-based digit-serial CSD FIR filter for image signal format conversion

《Microelectronics Journal》2002,33(5-6):501-508

This paper proposes the FPGA implementation of digit-serial Canonical Signed-Digit (CSD) coefficient FIR filters, which can be used as format conversion filters in place of the ones employed for the MPEG2 TM 5 (test model 5). Canonical signed-digit (CSD) representation is a method used to reduce cost by representing a signed number using the fewest non-zero digits, thereby reducing the number of multiply operations. As Field Programmable Gate Arrays (FPGAs) have grown in capacity, improved in performance, and decreased in cost, they are becoming a viable solution for performing computationally intensive tasks, with the ability to tackle applications formerly reserved for custom chips and programmable digital signal processing (DSP) devices. A digit-serial CSD FIR filter design is realized and practical design guidelines are provided using FPGAs. A performance comparison of bit-serial, serial distributed arithmetic, and digit-serial CSD FIR filters on a Xilinx XC4000XL-series FPGA is described. The results show that the proposed digit-serial CSD FIR filter is a compact and efficient implementation for real-time DSP applications on FPGAs.
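
The CSD representation mentioned in the abstract above can be illustrated with a short conversion routine. This is a generic textbook-style sketch, not code from the paper; the function name and the least-significant-first digit order are assumptions.

```python
def to_csd(n):
    """Canonical signed-digit form of integer n.

    Returns digits in {-1, 0, 1}, least significant first, with no two
    adjacent non-zero digits.
    """
    digits = []
    while n != 0:
        if n & 1:
            # +1 when n mod 4 == 1, -1 when n mod 4 == 3; the choice
            # guarantees that the next digit is zero.
            d = 2 - (n & 3)
            n -= d
        else:
            d = 0
        digits.append(d)
        n >>= 1
    return digits
```

For instance, `to_csd(7)` gives `[-1, 0, 0, 1]`, that is, 8 - 1, which uses two non-zero digits where plain binary `111` uses three. In a shift-and-add multiplier, each non-zero digit costs one add or subtract.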

16.

Reconfigurable Computing for Digital Signal Processing: A Survey (total citations: 6; self-citations: 0; citations by others: 6)

Russell Tessier Wayne Burleson 《The Journal of VLSI Signal Processing》2001,28(1-2):7-27

Steady advances in VLSI technology and design tools have extensively expanded the application domain of digital signal processing over the past decade. While application-specific integrated circuits (ASICs) and programmable digital signal processors (PDSPs) remain the implementation mechanisms of choice for many DSP applications, increasingly new system implementations based on reconfigurable computing are being considered. These flexible platforms, which offer the functional efficiency of hardware and the programmability of software, are quickly maturing as the logic capacity of programmable devices follows Moore's Law and advanced automated design techniques become available. As initial reconfigurable technologies have emerged, new academic and commercial efforts have been initiated to support power optimization, cost reduction, and enhanced run-time performance. This paper presents a survey of academic research and commercial development in reconfigurable computing for DSP systems over the past fifteen years. This work is placed in the context of other available DSP implementation media including ASICs and PDSPs to fully document the range of design choices available to system engineers. It is shown that while contemporary reconfigurable computing can be applied to a variety of DSP applications including video, audio, speech, and control, much work remains to realize its full potential. While individual implementations of PDSP, ASIC, and reconfigurable resources each offer distinct advantages, it is likely that integrated combinations of these technologies will provide more complete solutions.

17.

A Knapsack Model for Bandwidth Management of Prerecorded Multiple MPEG Video Sources

Erten Y. Murat Güllü Refik Süral Haldun Neftċi Sınan 《Telecommunication Systems》2005,28(1):101-116

In this article we provide a framework for controlling the bit rate of multiple prerecorded MPEG video sequences by choosing the quantization factors assigned to individual sources in a way that the total mean square error at the output of the encoder is minimized. We propose and test a knapsack model for the selection of the quantization factors. Our computations based on a set of relatively diverse video sequences reveal that the proposed model achieves a high utilization of the available bandwidth and acceptable distortion levels without any data loss.
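
The selection problem described in the abstract above (one quantization factor per source, minimum total distortion, bounded total rate) has the shape of a multiple-choice knapsack. The sketch below solves a small instance by dynamic programming over the used bandwidth. It illustrates that problem shape only; the authors' actual model and solver are not given in the abstract, and the function name, integer rates, and data layout are assumptions.

```python
def choose_quantizers(options, budget):
    """options[i] lists (rate, distortion) per candidate quantizer of source i.

    Returns (total_distortion, [chosen index per source]), or None when no
    combination fits within the bandwidth budget.
    """
    best = {0: (0.0, [])}  # used rate -> (lowest distortion, choices so far)
    for source_options in options:
        nxt = {}
        for used, (dist, picks) in best.items():
            for k, (rate, d) in enumerate(source_options):
                total = used + rate
                if total > budget:
                    continue
                cand = (dist + d, picks + [k])
                if total not in nxt or cand[0] < nxt[total][0]:
                    nxt[total] = cand
        best = nxt
    if not best:
        return None
    return min(best.values(), key=lambda v: v[0])
```

With two sources offering `[(4, 1.0), (2, 3.0)]` and `[(5, 0.5), (3, 2.0), (1, 6.0)]` and a budget of 7, the sketch picks the first option for source 0 and the second for source 1, for a total distortion of 3.0.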

18.

DSP Implementation of an IIR Filter

谢海霞 孙志雄 《电子器件》2013,36(2):194-196

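An IIR filter achieves high frequency selectivity with a relatively low order. Based on the filter's technical specifications, FDATool is used to design a minimum-order filter and export its tap coefficients. The IIR algorithm is then implemented in DSP assembly language. A composite input signal is generated with MATLAB and fed into the filtering system. Finally, CCS is used to observe the waveforms before and after filtering. The waveform simulation results agree with the theoretical values.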

19.

A Compiler-Friendly RISC-Based Digital Signal Processor Synthesis and Performance Evaluation

Jiyang Kang Jongbok Lee Wonyong Sung 《The Journal of VLSI Signal Processing》2001,27(3):297-312

As DSP (Digital Signal Processing) applications become more complex, there is also a growing need for new architectures supporting efficient high-level language compilers. We try to synthesize a new DSP processor architecture by adding several DSP-processor-specific features to a RISC core that has a compiler-friendly structure, such as many general-purpose registers and orthogonal instructions. The synthesized digital signal processor supports single-cycle MAC (Multiply-and-ACcumulate), direct memory access, automatic address generation, and hardware looping capabilities in addition to ordinary RISC instructions. The compiler for the new architecture is quickly implemented by developing a code converter that modifies the assembly code generated by the RISC compiler. The performance effects of adding each of these features, as well as all of them combined, are evaluated using seven DSP-kernel benchmarks, a QCELP vocoder, and an MPEG video decoder. The effects of CPU clock frequency change due to the addition of these features are also considered. Finally, we compare the performance with that of several existing DSP processors, such as the TMS320C3x, TMS320C54x, and TMS320C5x.

20.

FIR filters with field-programmable gate arrays

Les Mintzer 《The Journal of VLSI Signal Processing》1993,6(2):119-127

Distributed arithmetic techniques are the key to efficient implementation of DSP algorithms in FPGAs. The distributed arithmetic process is briefly described. A representative DSP design application in the form of an 8 tap FIR filter is offered for the Xilinx XC3042 field programmable gate array (FPGA). The design is presented in sufficient detail, from filter specifications through filter design software to the detailed logic of salient data and control functions, to obtain a realistic placing and routing of configurable logic block (CLB) and input/output block (IOB) components for simulation verification and performance evaluation vis-a-vis commercially available dedicated 8 tap FIR filter chips.
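
The distributed arithmetic idea referred to in the abstract above replaces the multipliers of an FIR inner product with a lookup table that is addressed by one bit from every input sample, plus shift-and-accumulate steps. The following is a bit-level software sketch under stated assumptions: integer coefficients, two's-complement samples of a fixed width, and hypothetical function names. It does not model the XC3042 design itself.

```python
def build_lut(coeffs):
    """Table of all partial sums: entry `addr` adds coeffs[k] for each set bit k."""
    n = len(coeffs)
    return [sum(c for k, c in enumerate(coeffs) if (addr >> k) & 1)
            for addr in range(1 << n)]

def da_dot(lut, samples, bits):
    """Inner product of the table's coefficients with `bits`-bit
    two's-complement samples, computed one bit plane at a time."""
    acc = 0
    for b in range(bits):
        addr = 0
        for k, x in enumerate(samples):
            addr |= ((x >> b) & 1) << k   # bit b of sample k
        term = lut[addr] << b
        # The most significant bit plane carries negative weight.
        acc += -term if b == bits - 1 else term
    return acc
```

For an 8 tap filter the table has 256 entries, and an output needs one table lookup per sample bit instead of eight multiplications.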