Found 20 similar articles (search took 10 ms)
PLX is a concise instruction set architecture (ISA) that combines the most useful features from previous generations of multimedia instruction sets with newer ISA features for high-performance, low-cost multimedia information processing. Unlike previous multimedia instruction sets, PLX is not added onto a base processor ISA, but designed from the beginning as a standalone processor architecture optimized for media processing. Its design goals are high-performance multimedia processing, general-purpose programmability to support an ever-growing range of applications, simplicity for constrained environments where low power and low cost are paramount, and scalability for higher performance in less constrained multimedia systems. Another design goal of PLX is to facilitate exploration and evaluation of novel techniques in instruction set architecture, microarchitecture, arithmetic, VLSI implementations, compiler optimizations, and parallel algorithm design for new computing paradigms. Key characteristics of PLX are a fully subword-parallel architecture with novel features like wordsize scalability from 32-bit to 128-bit words, a new definition of predication, and an innovative set of subword permutation instructions. We demonstrate the use and high performance of PLX on some frequently used code kernels selected from image, video, and graphics processing applications: discrete cosine transform, pixel padding, clip test, and median filter. Our results show that a 64-bit PLX processor achieves significant speedups over a basic 64-bit RISC processor and over IA-32 processors with MMX and SSE multimedia extensions. Using PLX's wordsize scalability feature, PLX-128 often provides an additional 2× speedup over PLX-64 in a cost-effective way. Superscalar or VLIW (Very Long Instruction Word) PLX implementations can also add performance through inter-instruction, rather than intra-instruction, parallelism.
We also describe the PLX testbed and its software tools for architecture and related research.
Ruby B. Lee is the Forrest G. Hamrick Professor of Engineering and Professor of Electrical Engineering at Princeton University, with an affiliated appointment in the Computer Science department. She is the founder and director of the Princeton Architecture Laboratory for Multimedia and Security (PALMS). Her current research is in rethinking computer architecture for high-performance but low-cost security and multimedia processing. Prior to joining the Princeton faculty in 1998, Dr. Lee served as chief architect at Hewlett-Packard, responsible at different times for processor architecture, multimedia architecture, and security architecture for e-commerce and extended enterprises. She was a key architect in the initial definition and the evolution of the PA-RISC processor architecture used in HP servers and workstations. As chief architect for HP's multimedia architecture team, Dr. Lee led an interdisciplinary team focused on architecture to facilitate pervasive multimedia information processing using general-purpose computers. She introduced innovative multimedia instruction set architectures (MAX and MAX-2) in microprocessors, resulting in the industry's first real-time, high-fidelity MPEG video and audio player implemented in software on low-end desktop computers. Dr. Lee also co-led an HP-Intel multimedia architecture team for IA-64, released in Intel's Itanium microprocessors. Concurrent with full-time employment at HP, Dr. Lee also served as Consulting Professor of Electrical Engineering at Stanford University. Dr. Lee has a Ph.D. in Electrical Engineering and an M.S. in Computer Science, both from Stanford University, and an A.B. from Cornell University, where she was a College Scholar. She is a Fellow of ACM, a Fellow of IEEE, and a member of IS&T, Phi Beta Kappa, and Alpha Lambda Delta. She has been granted 115 U.S. and international patents, with several patent applications pending.
A. Murat Fiskiran is a Ph.D. student in the Department of Electrical Engineering at Princeton University. He is a member of the Princeton Architecture Laboratory for Multimedia and Security (PALMS) and a Kodak Fellow. His research interests include computer architecture and computer security.
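The subword parallelism and wordsize scalability mentioned in the PLX abstract can be illustrated with a small software model. This is only a sketch of the general idea of a packed (subword-parallel) add with wraparound; the function name `subword_add` and its parameters are hypothetical and are not taken from the PLX instruction set.

```python
def subword_add(a, b, word_bits=64, sub_bits=16):
    """Add two words lane by lane, treating each as packed subwords.

    Each sub_bits-wide lane is added independently and wraps around
    (modular arithmetic), so a carry never crosses into the next lane.
    This is an illustrative model, not PLX instruction semantics.
    """
    mask = (1 << sub_bits) - 1
    result = 0
    for shift in range(0, word_bits, sub_bits):
        lane_sum = ((a >> shift) & mask) + ((b >> shift) & mask)
        result |= (lane_sum & mask) << shift
    return result


# Four 16-bit lanes in a 64-bit word; the second lane from the top wraps to 0.
packed = subword_add(0x0001_FFFF_0002_0003, 0x0001_0001_0002_0003)
```

Calling the same function with `word_bits=128` processes eight 16-bit lanes per operation instead of four, which is the intuition behind a wider word giving up to twice the throughput on data-parallel kernels.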

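MEMS acceleration sensors substantially improve the performance specifications of new-generation geophones. A mechanical model of the silicon cantilever beam was built with the finite-element software ANSYS, and simulation of its mechanical behavior yielded optimized structural dimensions: beam length L = 150 μm, beam width b = 40 μm, beam thickness h = 4 μm, and a gap d0 = 1 μm between the movable and fixed electrodes. A simulation model of the complete closed-loop sensor system was also built with circuit simulation software. The simulated step and sinusoidal responses largely agree with the theoretical analysis; the sensor resolution reaches 0.001 m/s², and the bandwidth reaches 500 MHz. This new geophone based on a MEMS acceleration sensor has broad application prospects in seismic exploration.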

Architecture for Dynamically Reconfigurable Embedded Systems (ADRES) is a templatized coarse-grained reconfigurable processor architecture. It targets embedded applications that demand high performance, low power, and high-level language programmability. Compared with a typical very long instruction word-based digital signal processor, ADRES can exploit higher parallelism by using more scalable hardware with the support of novel compilation techniques. We developed a complete tool-chain, including a compiler, a simulator, and an HDL generator. This paper describes the design case of a media processor targeting H.264 decoding and other video tasks, based on the ADRES template. The whole processor design, hardware implementation, and application mapping were done in a relatively short period. Yet we obtain C-programmed real-time H.264/AVC CIF decoding at 50 MHz. The die size, clock speed, and power consumption are also very competitive compared with other processors.
S. Dupont

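岳梦云, 白冰. 《电子学报》 (Acta Electronica Sinica), 2000, 48(10): 2041-2046
This paper presents a microarchitecture definition for a digital signal processing system suited to motor vector-control algorithms, covering its instruction set definition, memory model, and mode of interaction with the host CPU. The design has two notable advantages: fixing some of the operands of multi-operand instructions effectively shortens the instruction encoding and improves code density, and executing multi-cycle instructions in the background improves ALU parallel efficiency. The paper reports the instruction cycle counts of a typical FOC (field-oriented control) algorithm implemented on the DSP (Digital Signal Processor) instruction set, along with the circuit implementation of the architecture. Finally, using the ARM Cortex-M0 and several mainstream DSPs as baselines, measured experimental data demonstrate the architecture's high energy efficiency: at a fairly limited cost in circuit area, it greatly improves the runtime efficiency of embedded systems with an integrated DSP.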

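Introduces the basic features and functions of the TMS320C6205 DSP, and describes in detail how to combine the TMS320C6205 DSP with other devices to build a multimedia signal acquisition and processing system.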

Jaesung Lee. ETRI Journal, 2010, 32(4): 540-547
One of the critical issues in on‐chip serial communications is increased power consumption. In general, serial communications tend to dissipate more energy than parallel communications due to bit multiplexing. This paper proposes a low‐power bus serialization method. It encodes bus signals prior to serialization so that they are converted into signals whose transition frequency does not greatly increase when serialized. It significantly reduces the transition frequency by making the best use of the word‐to‐word and bit‐by‐bit correlations present in the original parallel signals. The method is applied to the revision of an MPEG‐4 processor, and the simulation results show that the proposed method surpasses the existing one. In addition, it is cost‐effective when implemented as a hardware circuit since its algorithm is very simple.
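Why bit multiplexing tends to raise switching activity can be seen by counting signal transitions on a parallel bus versus a single serial wire. The sketch below only illustrates that effect; it does not implement the encoding proposed in the paper, and the function names and the MSB-first bit order are assumptions made for the example.

```python
def parallel_transitions(words, width):
    """Total wire toggles when each of the `width` bits has its own wire."""
    mask = (1 << width) - 1
    return sum(
        bin((prev ^ cur) & mask).count("1")
        for prev, cur in zip(words, words[1:])
    )


def serial_transitions(words, width):
    """Toggles on one wire when the words are sent bit by bit, MSB first."""
    bits = [(w >> i) & 1 for w in words for i in range(width - 1, -1, -1)]
    return sum(a != b for a, b in zip(bits, bits[1:]))


# A constant word: no toggles in parallel, a toggle on every bit in serial.
steady = [0b0101, 0b0101, 0b0101]
parallel_count = parallel_transitions(steady, 4)
serial_count = serial_transitions(steady, 4)
```

With a highly correlated stream such as the constant word above, the parallel wires never switch while the serial wire switches on almost every bit, which is the kind of word-to-word correlation a pre-serialization encoder can try to preserve.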

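A fast design-space search method for reconfigurable architectures   Total citations: 1, self-citations: 0, citations by others: 1
Building on an evaluation model for reconfigurable architectures, this paper studies how to estimate the area, performance, and power consumption of a reconfigurable architecture at the algorithm level. Based on area, performance, and power, the design space is searched in two steps. First, for each architecture in the architecture domain, the minimum cost of implementing all the algorithms is found. Second, the optimal architecture is searched for across the architecture design space. The method does not depend on any specific architecture, evaluates reconfigurable architectures comprehensively, and quickly obtains a globally optimal search result. An application example shows that the method can effectively guide design in the early stage of reconfigurable architecture design.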
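The two-step search described in this abstract can be sketched as follows. The abstract does not give the cost model, so the weighted sum of area, delay, and power, the data layout, and all names here are assumptions chosen for illustration rather than the paper's method.

```python
def cost(estimate, weights=(1.0, 1.0, 1.0)):
    """Hypothetical cost: weighted sum of (area, delay, power) estimates."""
    area, delay, power = estimate
    w_area, w_delay, w_power = weights
    return w_area * area + w_delay * delay + w_power * power


def min_cost_per_architecture(candidates):
    """Step 1: for each architecture, sum the cheapest mapping of every algorithm.

    candidates maps architecture -> algorithm -> list of (area, delay, power)
    estimates, one per candidate mapping of that algorithm.
    """
    return {
        arch: sum(min(cost(e) for e in estimates) for estimates in algos.values())
        for arch, algos in candidates.items()
    }


def best_architecture(candidates):
    """Step 2: pick the architecture with the lowest total cost."""
    totals = min_cost_per_architecture(candidates)
    best = min(totals, key=totals.get)
    return best, totals[best]


# Made-up estimates for two architectures and two algorithms.
candidates = {
    "A": {"fft": [(1, 2, 3), (2, 2, 1)], "fir": [(1, 1, 1)]},
    "B": {"fft": [(1, 1, 1)], "fir": [(4, 4, 4)]},
}
choice = best_architecture(candidates)
```

Separating the per-architecture minimization from the final comparison keeps the second step a simple scan over precomputed totals, which is one reason a search of this shape can run quickly.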

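As feature sizes keep shrinking and integration density keeps rising, integrated circuits have entered the nanometer system-on-chip (SOC) era, and the continuation of Moore's law through device scaling faces many challenges. This paper analyzes the key effects that limit performance and yield in nanometer SOCs, together with the corresponding countermeasures. Tracing the evolution of the semiconductor industry chain, it argues that design for manufacturability (DFM) is the solution for improving manufacturability and yield in the nanometer SOC era. Resolution enhancement techniques (RET), which concern lithography performance, were the first wave driving DFM; the next generation of DFM will focus more on yield-limiting analysis and the joint optimization of design rules. The paper reviews the history and current state of DFM and discusses its outlook.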

The exploration of the design space for heterogeneous Systems on Chip (SoC) becomes more and more important. As modern SoCs include a variety of different architecture blocks ensuring flexibility as well as highest performance, it is mandatory to prune the design space in an early stage of the design process in order to achieve short innovation cycles for new products. Thus, the goal of this work is to provide estimations of implementation-specific parameters like throughput rate, power dissipation, and silicon area by means of cost functions featuring reasonable accuracy at low modeling effort. A model-based exploration strategy supporting the design flow for heterogeneous SoCs is presented. In order to demonstrate the feasibility of this exploration strategy, in a first step implementation cost parameters are provided for a variety of basic operations frequently required in digital signal processing, which were implemented on discrete components like DSPs, FPGAs, or dedicated ASICs. These implementation parameters serve as a basis for deriving cost models for the design space exploration concept.
Holger Blume received his Dipl.-Ing. degree in electrical engineering from the University of Dortmund, Germany, in 1992. From 1993 to 1998 he worked as a research assistant with the Working Group on Circuits and Systems for Information Processing of Prof. Dr. H. Schröder in Dortmund. There he finished his PhD on nonlinear fault-tolerant interpolation of intermediate images in 1997. In 1998 he joined the Chair of Electrical Engineering and Computer Systems of Prof. Dr. T. G. Noll at the University of Technology RWTH Aachen as a senior engineer. His main research interests are in the field of heterogeneous reconfigurable Systems on Chip for multimedia applications. Dr. Blume is chairman of the German chapter of the IEEE Solid State Circuits Society.
Hendrik T. Feldkaemper received the Dipl.-Ing. degree from the University of Technology RWTH Aachen, Germany, in 1997. After half a year of employment in an industrial project at Infineon Technologies in Munich, he joined the Chair of Electrical Engineering and Computer Systems (Prof. Dr. T. G. Noll), University of Technology RWTH Aachen, as a research assistant. His current research interests include design space exploration for digital signal processing in ultrasound, heterogeneous reconfigurable Systems-on-Chip, and VLSI CMOS design.
Tobias G. Noll received the Ing. (grad.) degree in Electrical Engineering from the Fachhochschule Koblenz, Germany, in 1974, the Dipl.-Ing. degree in Electrical Engineering from the Technical University of Munich in 1982, and the Dr.-Ing. degree from the Ruhr-University of Bochum in 1989. From 1974 to 1976, he was with the Max-Planck-Institute of Radio Astronomy, Bonn, Germany, being active in the development of microwave waveguide and antenna components. From 1976 to 1982, he was with the MOS Integrated Circuits Department and, from 1982 to 1984, the MOS-Design Team trainee program of Siemens AG, Munich. In 1984, he joined the Corporate Research and Development Department of Siemens, and since 1987, he has headed a group of laboratories concerned with the design of algorithm-specific integrated CMOS circuits for high-speed digital signal processing. Since 1992, he has been a Professor for Electrical Engineering and Computer Systems with the University of Technology (RWTH), Aachen, Germany. In addition to teaching, he is involved in research activities on VLSI architectural strategies for high-speed digital signal processing, circuit concepts, and design methodologies, as well as on digital signal processing for medical electronics.

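CMOS image sensor technology and its research progress*   Total citations: 10, self-citations: 0, citations by others: 10
Briefly introduces the technical principles of image sensors and compares the technical characteristics of CCD and CMOS image sensors. Based on the system architecture, functions, and device types of single-chip CMOS image sensors, the paper analyzes their performance requirements and technical difficulties, and summarizes the key problems that need further study in order to improve performance.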

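Techniques for creating Proteus schematic simulation models   Total citations: 3, self-citations: 1, citations by others: 2
Proteus is a design and simulation platform for microcontroller application systems, and simulation models are the foundation of Proteus design and simulation. In practice, users need to create simulation models that are not yet in the Proteus library, which is also an important advanced use of Proteus. Taking the creation of schematic models of a 6-bit D/A converter and the TTL 7458 as examples, the paper discusses the approach and method for creating Proteus schematic simulation models, how to store a model in the library and call it from the library, and how to verify a created model. Verification shows that both the models built and the modeling method are correct.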

Wavelength division multiplexing (WDM) is emerging as a viable solution to reduce the electronic processing bottleneck in very high-speed optical networks. A set of parallel and independent channels are created on a single fiber using this technique. Parallel communication utilizing the WDM channels may be accomplished in two ways: (i) bit serial, where each source-destination pair communicates using one wavelength and data are sent serially on this wavelength; and (ii) bit parallel, where each source-destination pair communicates using a subset of channels and data are sent in multiple-bit words. Three architectures are studied in the paper: single-hop bit-serial star, single-hop bit-parallel star, and multi-hop bit-parallel shufflenet. The objective of this paper is to evaluate these architectures with respect to average packet delay, network utilization, and link throughput. It is shown that the shufflenet offers the lowest latency but suffers from high cost and low link throughput. The star topology with bit-parallel access offers lower latency than the bit-serial star, but is more expensive to implement.

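As living standards keep rising with the times, China had about 300 million motor vehicles by the end of 2013 according to incomplete statistics, which poses a huge challenge to the country's transportation system. Government departments have tried to ease the pressure on traffic by widening roads, building more roads, and applying various high-tech measures, but these approaches bring only temporary relief and are not a long-term solution. The author argues that controlling traffic signals is the lasting and effective way to relieve the pressure that motor vehicles currently place on the transportation system.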

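As communication countermeasure equipment grows more complex and the demands for it to deliver combat capability keep rising, the problem of equipment training becomes increasingly prominent. The paper describes the levels of requirements and the approach for embedding simulation into real equipment, and proposes a corresponding implementation path and overall framework.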

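周少东, 茚邦琴. 《电子器件》 (Chinese Journal of Electron Devices), 1999, 22(3): 171-176
As communication systems grow more complex, traditional design methods can no longer keep pace with development, which motivates research on communication system simulation. Developing an efficient simulation environment for communication systems has become an urgent need.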

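To meet the complex simulation needs of integrated electronic information systems, a simulation test platform for such systems is proposed. By analyzing their requirements for an integrated simulation environment, an open and extensible simulation environment is put forward. The architecture and functional composition of the simulation system are described. A development process based on reusable simulation models is adopted to classify, reuse, and combine system simulation models, and a simulation example within the environment is given. The environment is highly extensible and general, which eases updating, extending, and porting it, and it can be used to verify the operational performance of integrated electronic information systems and to justify their operational requirements.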

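Current status and analysis of electro-optical detection system simulation technology   Total citations: 1, self-citations: 0, citations by others: 1
Electro-optical detection system simulation is an applied simulation technology that countries around the world have prioritized for development and research in recent years. This paper briefly describes the general composition and working principle of an electro-optical detection simulation system, the main technical requirements on simulation equipment, and the key technologies, and it reviews and analyzes the application and development status of electro-optical detection system simulation technology in China and abroad.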

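Some thoughts on building a cyber operations equipment and technology system   Total citations: 1, self-citations: 1, citations by others: 0
The paper outlines the current state of cyber operations equipment and technology worldwide, which leads to the importance of building a cyber operations equipment and technology system. On that basis, it introduces the prerequisites for establishing such a system. Finally, it presents thoughts on system construction from the perspectives of the top-level system, the operational system, and the technical system.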

We explain a systematic way of interfacing data-flow hardware accelerators (IP) for their integration in a system on chip. We abstract the communication behaviour of the data-flow IP so as to provide a basis for an interface generator. Then we measure the throughput obtained for different architectures of the interface mechanism by a cycle-accurate, bit-accurate simulation of a SoC integrating a data-flow IP. We show in which configuration the optimal communication scheme can be reached.
Tanguy Risset (Corresponding author)

The compiler is generally regarded as the most important software component that supports a processor design to achieve success. This paper describes our application of the open research compiler infrastructure to a novel VLIW DSP (known as the PAC DSP core) and the specific design of code generation for its register file architecture. The PAC DSP utilizes port-restricted, distributed, and partitioned register file structures in addition to a heterogeneous clustered data-path architecture to attain low power consumption and a smaller die. As part of an effort to overcome the new challenges of code generation for the PAC DSP, we have developed a new register allocation scheme and other retargeting optimization phases that allow the effective generation of high quality code. Our preliminary experimental results indicate that our developed compiler can efficiently utilize the features of the specific register file architectures in the PAC DSP. Our experiences in designing compiler support for the PAC VLIW DSP with irregular resource constraints may also be of interest to those involved in developing compilers for similar architectures.
Jenq-Kuen Lee (Corresponding author)
