Similar literature
 20 similar documents found (search time: 15 ms)
1.
Architectural concepts aimed at future multimedia processing schemes are presented. Starting from an analysis of current and future multimedia applications, specific computational requirements are derived. It is shown that multimedia applications benefit from an exhaustive and flexible exploitation of parallelism. Three architectural concepts (reconfigurable computing, simultaneous multithreading, and associative controlling) are presented, and their potential to further increase performance on future multimedia applications is investigated.

2.
A methodological framework for performance estimation of multimedia signal processing applications on different implementation platforms is presented. The methodology derives a complexity profile that is characteristic of an application but completely platform-independent. By correlating the complexity profile with platform-specific data, performance estimation results for different platforms are obtained. The methodology is based on a reference software implementation of the targeted application but is, in contrast to instruction-level profiling-based approaches, fully independent of its degree of optimization. The proposed methodology is demonstrated using an MPEG-4 Advanced Simple Profile (ASP) video decoder as an example. Performance estimation results are presented for two different platforms, a specialized VLIW media processor and an embedded general-purpose RISC processor, showing the high accuracy of the methodology. The approach can be employed to assist in design decisions in the specification phase of new architectures, in the selection of a suitable target platform for a multimedia application, or in the optimization stage of a software implementation on a specific platform.
Hans-Joachim Stolberg received the Dipl.-Ing. degree in electrical engineering from the University of Hannover, Germany, in 1995. From 1995 to 1996, he worked at the NEC Information Technology Research Laboratories, Kawasaki, Japan, on efficient implementation of video compression algorithms. Since 1996, he has been with the Institute of Microelectronic Systems at the University of Hannover as a Research Assistant. During summer 2001, he was a Monbukagakusho Research Fellow at the Tokyo Institute of Technology, Japan. His current research interests include VLSI architectures for video signal processing, performance estimation of multimedia schemes, and profile-guided memory organization approaches for signal processing and multimedia applications.
Mladen Bereković received the Dipl.-Ing. degree in electrical engineering from the University of Hannover, Germany, in 1995. Since then he has been a Research Assistant with the Institute of Microelectronic Systems of the University of Hannover. His current research interests include VLSI architectures for video signal processing, MPEG-4, System-on-Chip (SOC) designs, and simultaneously multi-threaded (SMT) processor architectures.
Peter Pirsch received the Ing. grad. degree from the engineering college in Hannover, Germany, in 1966, and the Dipl.-Ing. and Dr.-Ing. degrees from the University of Hannover, in 1973 and 1979, respectively, all in electrical engineering. From 1966 to 1973 he was employed by Telefunken, Hannover, working in the Television Department. He became a Research Assistant at the Department of Electrical Engineering, University of Hannover, in 1973, and a Senior Engineer in 1978. From 1979 to 1981 he was on leave, working in the Visual Communications Research Department, Bell Laboratories, Holmdel, NJ. From 1983 to 1986 he was Department Head for Digital Signal Processing at the SEL Research Center, Stuttgart, Germany. Since 1987 he has been Professor in the Department of Electrical and Computer Engineering at the University of Hannover. He served as Vice President Research of the University of Hannover from 1998 to 2002. His present research includes architectures and VLSI implementations for image processing applications, rapid prototyping, and design automation for DSP applications. He is the author or coauthor of more than 200 technical papers. He has edited a book on VLSI Implementations for Image Communications (Elsevier 1993) and is author of the book Architectures for Digital Signal Processing (John Wiley 1998). Dr. Pirsch is a member of the IEEE, the German Institute of Information Technology Engineers (ITG) and the German Association of Engineers (VDI). He has received several awards: the NTG paper prize award (1982), IEEE Fellow (1997), and the IEEE Circuits and Systems Golden Jubilee Medal (1999). He has been a member or chair of several technical program committees of international conferences and an organizer of special sessions and preconference courses. He has held several administrative and technical positions with the IEEE Circuits and Systems Society and other professional organizations. Dr. Pirsch currently serves as Vice President Publications of the IEEE Circuits and Systems Society. Since 2000 he has been chairman of the Accreditation Commission for Engineering and Informatics of the Accreditation Agency for Study Programs in Engineering, Informatics, Natural Science and Mathematics (ASIIN). Dr. Pirsch is chair of the VDI committee on Engineering Education.

3.
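This work extends Click, the classic modeling tool for network applications, and proposes DClick, which targets multicore platforms based on a Network-on-Chip (NoC). It defines how Elements and packets are deployed in a distributed system, trims the connection types between Elements to reflect the fact that each node has its own independent buffer resources, defines a dedicated mechanism by which packets access nodes, and gives a message-driven Element scheduling flow. One implementation of DClick is described in detail, together with a method for coupling the original Click library with an NoC simulator, demonstrating the effectiveness and ease of use of DClick.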

4.
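Research and implementation of a fast and efficient hardware architecture for MPEG-4 motion estimation   Cited by: 6 (self-citations: 0, citations by others: 6)
A highly parallel, multi-pipelined hardware architecture is proposed that implements the full-search block-matching motion estimation algorithm for the video part of MPEG-4. Using full-search block matching, the architecture finds the best-matching motion vector for every pixel block in real time. Its advantages include high computation speed, accurate motion vectors, modest on-chip memory requirements, a low operating clock, and low power consumption, so it can meet the needs of both mobile video and high-definition video services. The architecture is implemented using Fujitsu's CE66 library.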
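As a rough software illustration of the full-search block-matching algorithm that this entry implements in hardware, the sketch below evaluates every candidate displacement in a square window and keeps the one with the smallest sum of absolute differences (SAD). The function names, the list-of-lists frame layout, and the default block size and search range are assumptions made for the example; they are not taken from the paper, whose design is a parallel, pipelined circuit rather than a nested loop.

```python
def sad(cur, ref, cx, cy, rx, ry, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (cx, cy) and the n x n block of `ref` at (rx, ry)."""
    return sum(
        abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
        for j in range(n)
        for i in range(n)
    )


def full_search(cur, ref, bx, by, n=16, search=15):
    """Exhaustively test every displacement (dx, dy) in [-search, +search]
    and return (dx, dy, cost) for the lowest-SAD candidate that lies fully
    inside the reference frame. Frames are lists of rows of integers
    (hypothetical layout chosen for this sketch)."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = bx + dx, by + dy
            # Skip candidates that would read outside the reference frame.
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue
            cost = sad(cur, ref, bx, by, rx, ry, n)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best
```

Calling `full_search(cur, ref, bx, by)` for each block position yields one motion vector per block. Every candidate's SAD is independent of the others, which is the property that makes the algorithm a natural fit for the kind of parallel hardware described above.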

5.
Affine transformation is widely used in image processing, and it has recently been recommended by MPEG-4 for video motion compensation. This paper presents a novel low-power parallel architecture for texture warping using affine transformation (AT). The architecture uses a novel multiplication-free algorithm that exploits the algebraic properties of the AT. Low power has been achieved at different levels of the design. At the algorithmic level, replacing multiplication operations with bit shifting saves the power and delay of a multiplier. At the architecture level, low power is achieved by using parallel computational units, where the latency constraints and/or the operating latency can be reduced. At the circuit level, using low-power building blocks (such as low-power adders) contributes to the power savings. The proposed architecture is used as a computational kernel in video object coders. It is compatible with the MPEG-4 and VRML standards. The architecture has been prototyped in 0.6 μm CMOS technology with three layers of metal. The performance of the proposed architecture shows that it can be used in mobile and handheld applications.
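To make the idea of a multiplication-free affine mapping concrete, here is a minimal sketch of one common technique, incremental (forward-differencing) evaluation. This is not the shift-based algorithm of the paper, whose details the abstract does not give. It relies only on the linearity of the affine map u = a·x + b·y + c, v = d·x + e·y + f: stepping x by one adds a to u and d to v, and stepping y by one adds b and e. The function name and the use of integer (for example fixed-point) coefficients are assumptions for the example.

```python
def affine_coords_incremental(a, b, c, d, e, f, width, height):
    """Return [(x, y, u, v), ...] where u = a*x + b*y + c and
    v = d*x + e*y + f, computed with additions only.
    Coefficients are assumed to be integers, e.g. fixed-point values."""
    coords = []
    row_u, row_v = c, f            # values of u, v at (x=0, y=0)
    for y in range(height):
        u, v = row_u, row_v        # start of the current scanline
        for x in range(width):
            coords.append((x, y, u, v))
            u += a                 # moving one pixel right adds a to u
            v += d                 # ... and d to v
        row_u += b                 # moving one row down adds b to u
        row_v += e                 # ... and e to v
    return coords
```

With integer coefficients the incremental results match direct multiplication exactly. With rounded fixed-point fractions, error accumulates along each row, which is one reason a hardware design may prefer other multiplication-free formulations.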

6.
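VLSI architecture design of the pixel compression module of an MPEG-4 video encoder   Cited by: 1 (self-citations: 0, citations by others: 1)
This paper designs a VLSI architecture for the pixel compression module of an MPEG-4 based video compression encoder. The design uses NEDA, a distributed arithmetic architecture, as the core technique for the DCT; a LUT-based structure keeps the design of the quantization/inverse quantization module simple and clear; and a new storage strategy for the AC/DC prediction module greatly reduces the use of scarce FPGA memory. While meeting the speed and precision requirements, the pixel compression module is realized with a relatively small number of transistors and a simple structure.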

7.
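This paper proposes an efficient VLSI architecture, which we call DDBME, that implements the motion estimation algorithm for binary shapes in the MPEG-4 video coding standard. It consists mainly of a data dispatcher based on a one-dimensional systolic array and a 16 × 32-bit search-area buffer. DDBME uses bit-parallel data processing to compute the sum of absolute differences (SAD) in the block-matching algorithm.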
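For one-bit (binary shape) pixels, the absolute difference of two pixels is simply their XOR, so the SAD of two blocks equals the number of differing bits. The sketch below illustrates that observation with rows packed into Python integers. The helper names and the packing order are assumptions made for the example and do not describe the DDBME datapath itself.

```python
def pack_row(bits):
    """Pack a sequence of 0/1 pixels into one integer, first pixel in the
    most significant position (packing order is an arbitrary choice)."""
    word = 0
    for bit in bits:
        word = (word << 1) | (bit & 1)
    return word


def binary_sad(block_a, block_b):
    """SAD of two binary blocks given as lists of packed rows.
    For 1-bit pixels |a - b| == a XOR b, so the SAD is the population
    count of the XOR of corresponding rows."""
    return sum(bin(row_a ^ row_b).count("1")
               for row_a, row_b in zip(block_a, block_b))
```

A whole row is compared in a single XOR, which is the software analogue of processing all bits of a row in parallel.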

8.
PLX is a concise instruction set architecture (ISA) that combines the most useful features from previous generations of multimedia instruction sets with newer ISA features for high-performance, low-cost multimedia information processing. Unlike previous multimedia instruction sets, PLX is not added onto a base processor ISA, but designed from the beginning as a standalone processor architecture optimized for media processing. Its design goals are high-performance multimedia processing, general-purpose programmability to support an ever-growing range of applications, simplicity for constrained environments where low power and low cost are paramount, and scalability for higher performance in less constrained multimedia systems. Another design goal of PLX is to facilitate exploration and evaluation of novel techniques in instruction set architecture, microarchitecture, arithmetic, VLSI implementations, compiler optimizations, and parallel algorithm design for new computing paradigms.
Key characteristics of PLX are a fully subword-parallel architecture with novel features like wordsize scalability from 32-bit to 128-bit words, a new definition of predication, and an innovative set of subword permutation instructions. We demonstrate the use and high performance of PLX on some frequently used code kernels selected from image, video, and graphics processing applications: discrete cosine transform, pixel padding, clip test, and median filter. Our results show that a 64-bit PLX processor achieves significant speedups over a basic 64-bit RISC processor and over IA-32 processors with MMX and SSE multimedia extensions. Using PLX's wordsize scalability feature, PLX-128 often provides an additional 2× speedup over PLX-64 in a cost-effective way. Superscalar or VLIW (Very Long Instruction Word) PLX implementations can add further performance through inter-instruction, rather than intra-instruction, parallelism. We also describe the PLX testbed and its software tools for architecture and related research.
Ruby B. Lee is the Forrest G. Hamrick Professor of Engineering and Professor of Electrical Engineering at Princeton University, with an affiliated appointment in the Computer Science department. She is the founder and director of the Princeton Architecture Laboratory for Multimedia and Security (PALMS). Her current research is in rethinking computer architecture for high-performance but low-cost security and multimedia processing. Prior to joining the Princeton faculty in 1998, Dr. Lee served as chief architect at Hewlett-Packard, responsible at different times for processor architecture, multimedia architecture, and security architecture for e-commerce and extended enterprises. She was a key architect in the initial definition and the evolution of the PA-RISC processor architecture used in HP servers and workstations. As chief architect for HP's multimedia architecture team, Dr. Lee led an interdisciplinary team focused on architecture to facilitate pervasive multimedia information processing using general-purpose computers. She introduced innovative multimedia instruction set architecture (MAX and MAX-2) in microprocessors, resulting in the industry's first real-time, high-fidelity MPEG video and audio player implemented in software on low-end desktop computers. Dr. Lee also co-led an HP-Intel multimedia architecture team for IA-64, released in Intel's Itanium microprocessors. Concurrent with full-time employment at HP, Dr. Lee also served as Consulting Professor of Electrical Engineering at Stanford University. Dr. Lee has a Ph.D. in Electrical Engineering and an M.S. in Computer Science, both from Stanford University, and an A.B. from Cornell University, where she was a College Scholar. She is a Fellow of ACM, a Fellow of IEEE, and a member of IS&T, Phi Beta Kappa, and Alpha Lambda Delta. She has been granted 115 U.S. and international patents, with several patent applications pending.
A. Murat Fiskiran is a Ph.D. student at the Department of Electrical Engineering at Princeton University. He is a member of the Princeton Architecture Laboratory for Multimedia and Security (PALMS) and a Kodak Fellow. His research interests include computer architecture and computer security.

9.
This paper presents the architectural design of a multicomputer interconnection network based on the use of optical technology. The performance of the system is evaluated on a set of signal processing applications. The interconnect uses Vertical Cavity Surface Emitting Lasers (VCSELs) and flexible fiber image guides to implement a physical ring topology that is logically configured as a multiring. Processors in the multicomputer are nodes on the ring, and extremely high communication bandwidth is possible. Using the Laser Channel Allocation (LCA) algorithm and the Deficit Round Robin (DRR) media access protocol, the bandwidth available in the optical interconnect can be reconfigured to make efficient use of the interconnect resources. A discrete-event simulation model of the interconnect is used to examine performance issues such as throughput, latency, fairness, and the impact of reconfigurability.
Roger D. Chamberlain completed the degrees BSCS and BSEE in 1983, MSCS in 1985, and DSc (computer science) in 1989, all from Washington University in St. Louis, Missouri. He is currently an Associate Professor of Computer Science and Engineering at Washington University, where he is Director of the Computer Engineering Program. Dr. Chamberlain teaches and conducts research in the areas of computer architecture, parallel computing, embedded systems, and digital design.
Mark A. Franklin received his BA, BSEE and MSEE from Columbia University, and his Ph.D. in EE from Carnegie-Mellon University. He is currently a Professor in the Department of Computer Science and Engineering at Washington University in St. Louis, Missouri, and holds the Hugo F. and Ina Champ Urbauer Chair in Engineering. He founded and is former Director of the Computer and Communications Research Center. Dr. Franklin is a Fellow of the IEEE and a member of the ACM. He has been Chair of the IEEE TCCA (Technical Committee on Computer Architecture), and Vice-Chair of the ACM SIGARCH (Special Interest Group on Computer Architecture). His research areas include computer and systems architecture, ASIC and embedded processor design, parallel and distributed systems, and systems performance evaluation.
Praveen Krishnamurthy received the Bachelor of Engineering degree from University of Madras (India) in 2000 and the MS degree in Computer Engineering from Washington University in St. Louis, Missouri, in 2002. He is currently a doctoral student at Washington University in St. Louis.
Abhijit Mahajan received his B.E. (Electronics) degree from University of Mumbai in 1998. He received his MSEE from Washington University in 2000. He is presently working with Broadcom Corporation in India. His main area of work is signal integrity and systems engineering.
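The Deficit Round Robin (DRR) discipline named in this abstract can be sketched in its generic packet-scheduling form: each queue earns a fixed quantum of credit per round and may send packets as long as the head packet fits within its accumulated credit. The sketch below is that textbook form only. How the paper adapts DRR to media access on the optical ring is not described in the abstract, and the function name, queue representation, and parameters here are assumptions for illustration.

```python
from collections import deque


def drr_schedule(queues, quantum, rounds):
    """Generic Deficit Round Robin over `queues` (a list of deques of
    packet sizes). Returns the send order as (queue_index, size) pairs.
    Each non-empty queue gains `quantum` credit per round and sends
    packets while the head packet fits in its deficit counter."""
    deficits = [0] * len(queues)
    sent = []
    for _ in range(rounds):
        for i, queue in enumerate(queues):
            if not queue:
                deficits[i] = 0    # idle queues do not accumulate credit
                continue
            deficits[i] += quantum
            while queue and queue[0] <= deficits[i]:
                size = queue.popleft()
                deficits[i] -= size
                sent.append((i, size))
            if not queue:
                deficits[i] = 0    # drop leftover credit once drained
    return sent
```

Because unused credit carries over to the next round, a queue holding large packets receives the same long-run share of bandwidth as a queue holding small ones, which is the fairness property usually cited for DRR.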

10.
11.
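Wang Zhanhui, Liu Daming. 《电子器件》, 2007, 30(6): 2112-2118
The VLSI hardware architecture designed for binary motion estimation (BME) is described. The basic principle of BME in MPEG-4 shape coding is introduced first, together with DDBME, an architecture from earlier work. The paper then analyzes DDBME's shortcomings in exploiting data redundancy and presents an improved motion estimation architecture which, by enlarging the PE array and rearranging the data distribution mechanism, reaches a data utilization of 75%. A comparison of hardware simulation results with software results confirms the correctness of the design, and synthesis results show that it supports real-time MPEG-4 Core Profile & Level 2 encoding at a clock frequency as low as 5.93 MHz, making it suitable for VLSI implementation of MPEG-4.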

12.
A novel fully integrated dynamic thermal management circuit for system-on-chip design is proposed. Instead of the worst-case thermal management used in conventional systems, this design continually monitors thermal activity and reacts to specified conditions. With this system, we are able to incorporate on-chip power/speed modulation and integrated multi-stage fan controllers, which allow us to achieve nominal power dissipation and ensure operation within specification. Both architecture and circuitry are optimized for modern system-on-chip designs. This design yields intricate control and optimal management with little system overhead and minimum hardware requirements, and provides the flexibility to support different thermal management algorithms.

13.
The extent of pixel-parallel focal plane image processing is limited by pixel area and imager fill factor. In this paper, we describe a novel multi-chip neuromorphic VLSI visual motion processing system which combines analog circuitry with an asynchronous digital interchip communications protocol to allow more complex pixel-parallel motion processing than is possible in the focal plane. This multi-chip system retains the primary advantages of focal plane neuromorphic image processors: low-power consumption, continuous-time operation, and small size. The two basic VLSI building blocks are a photosensitive sender chip which incorporates a 2D imager array and transmits the position of moving spatial edges, and a receiver chip which computes a 2D optical flow vector field from the edge information. The elementary two-chip motion processing system consisting of a single sender and receiver is first characterized. Subsequently, two three-chip motion processing systems are described. The first three-chip system uses two sender chips to compute the presence of motion only at a particular stereoscopic depth from the imagers. The second three-chip system uses two receivers to simultaneously compute a linear and polar topographic mapping of the image plane, resulting in information about image translation, rotation, and expansion. These three-chip systems demonstrate the modularity and flexibility of the multi-chip neuromorphic approach.

14.
In this paper, we propose a reduced-complexity and power-efficient System-on-Chip (SoC) architecture for adaptive interference suppression in CDMA systems. The adaptive Parallel-Residue-Compensation architecture leads to significant performance gain over the conventional interference cancellation algorithms. The multi-code commonality is explored to avoid direct Interference Cancellation (IC), which reduces the IC complexity from […] to […]. The physical meaning of complete versus weighted IC is applied to clip the weights above a certain threshold so as to reduce the VLSI circuit activity rate. Novel scalable SoC architectures based on simple combinational logic are proposed to eliminate dedicated multipliers, with at least […] saving in hardware resources. A Catapult C High Level Synthesis methodology is applied to explore the VLSI design space extensively and achieve at least […] speedup. Multi-stage Convergence-Masking-Vector combined with clock gating is proposed to reduce the VLSI dynamic power consumption by up to […]. This paper was presented in part at IEEE ISCAS in Vancouver, Canada, May 2004.

15.
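Shi Chao, Gao Gugang, Yang Jun, Lin Bo. 《电子工程师》, 2006, 32(1): 15-17, 44
The inverse discrete cosine transform (IDCT) is widely used in video compression and decompression applications such as MPEG-4. In embedded systems, the efficiency of the IDCT directly affects real-time MPEG-4 decoding performance. Based on the characteristics of embedded systems, this paper proposes a new hardware implementation of the IDCT and uses a new verification approach to verify it comprehensively. The scheme has been applied in the hardware MMA (multimedia accelerator unit) of an SoC chip.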

16.
This paper presents a Computational Memory architecture for MPEG-4 applications on mobile devices. The proposed architecture is used for real-time block-based motion estimation, which is the most computationally intensive task in the video encoder. It uses the exhaustive block-matching algorithm (EBMA) for motion estimation. The proposed architecture consists of embedded SRAMs and a number of block-matching units working in parallel to process video data while it is stored in the memory. The block-matching units access the embedded SRAMs simultaneously, which increases the speed of the architecture. The architecture processes CIF format video sequences (i.e., the frame size is 352 × 288 pixels) with a block size of 16 × 16 pixels and a ±15 pixel search range. The proposed architecture has been designed, prototyped, and simulated for 0.18 μm TSMC CMOS technology. The simulation shows that the proposed architecture processes up to 126 CIF frames per second at a clock frequency of 100 MHz. The synthesized prototype of the proposed architecture includes 200 KB of memory, has an area of 33.75 mm², and consumes 986.96 mW at 100 MHz.
Mohammed Sayed received his B.Sc. degree from Zagazig University, Zagazig, Egypt, in 1997 and a postgraduate diploma in VLSI design from the Information Technology Institute (ITI), Cairo, Egypt, in 1998. In 2003 he received his M.Sc. degree from University of Calgary, Calgary, Canada. From 1998 to 2001 he was a research and teaching assistant at the Electronics & Communications Engineering Department, Zagazig University, Egypt. In 2001 he became a research assistant at the Department of Electrical and Computer Engineering, University of Calgary, Canada. His current research interests are System-on-Chip, Embedded Memories, and Digital Video Processing. Mr. Sayed received a number of scholarships and awards such as the iCORE Scholarship from 2003 to 2005, the SMC Industrial Collaboration Award in June 2003, and the Micronet Annual Workshop Best Paper Award in April 2002. He has a number of journal and conference publications and a number of contributions to the MPEG-4 standard (ISO/IEC JTC1/SC29/WG11 MPEG2002/M8562 and M8563).
Wael Badawy is an associate professor in the Department of Electrical and Computer Engineering. He holds an adjunct professorship in the Department of Mechanical Engineering, University of Alberta. Dr. Badawy's research interests are in the areas of microelectronics, VLSI architectures for video applications with low-bit-rate applications, digital video processing, low-power design methodologies, and VLSI prototyping. His research involves designing new models, techniques, algorithms, architectures, and low-power prototypes for novel systems and consumer products. Dr. Badawy has authored and co-authored more than 100 peer-reviewed journal and conference papers and about 30 technical reports. He is the Guest Editor for the special issue on System on Chip for Real-Time Applications in the Canadian Journal on Electrical and Computer Engineering, the Technical Chair for the 2002 International Workshop on SoC for real-time applications, and a technical reviewer for several IEEE journals and conferences. He is currently a member of the IEEE-CAS Technical Committee on Communication. Dr. Badawy was honored with the “2002 Petro Canada Young Innovator Award”, the “2001 Micralyne Microsystems Design Award”, and the 1998 Upsilon Pi Epsilon Honor Society and IEEE Computer Society Award for Academic Excellence in Computer Disciplines. He is currently the Chairman of the Canadian Advisory Committee (CAC) and Head of the Canadian Delegation on ISO/IEC/JTC1/SC6 “Telecommunications and Information Exchange Between Systems”. He is a member of the Canadian Advisory Committee for the Standards Council of Canada, Subcommittee 29: Coding of Audio, Picture, Multimedia and Hypermedia Information, and a Canadian delegate to the ISO/IEC MPEG standard committee. He is a voting member of the VSI Alliance. He is also the Chair of the IEEE-Southern Alberta Society-Computer Chapter.
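A back-of-the-envelope check of the figures quoted in this entry's abstract shows how much parallelism exhaustive block matching demands. The calculation assumes that a ±15 pixel range means 31 × 31 candidate positions per block and that every candidate is evaluated in full, ignoring clipping at frame borders; both are assumptions for the estimate, not statements from the paper.

```latex
\begin{align*}
N_{\text{blocks}} &= \frac{352}{16} \times \frac{288}{16} = 22 \times 18 = 396 \\
N_{\text{cand}}   &= (2 \cdot 15 + 1)^2 = 31^2 = 961 \\
N_{\text{diff/frame}} &= 396 \times 961 \times (16 \times 16) = 97\,422\,336 \\
N_{\text{diff/s}} &= 97\,422\,336 \times 126 \approx 1.23 \times 10^{10} \\
N_{\text{diff/cycle}} &= \frac{1.23 \times 10^{10}}{100 \times 10^{6}} \approx 123
\end{align*}
```

Under these assumptions, sustaining 126 CIF frames per second at 100 MHz requires on the order of 120 absolute-difference operations per clock cycle, which is consistent with the abstract's emphasis on many block-matching units accessing the embedded SRAMs simultaneously.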

17.
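Yang Liang, Yu Zongguang, Wei Jinghe, Gui Jianghua, Pan Miao. 《微电子学》, 2018, 48(5): 648-651, 656
A reconfigurable heterogeneous SoC for multichannel array signal processing was designed and implemented. The SoC integrates the hardware accelerator modules needed for multichannel array signal processing, effectively raising the computing capability of such systems. Software reconfigures the input/output routing of each algorithm module, making the multichannel array signal processing algorithms reconfigurable and widening the SoC's range of application. Designed in a 55 nm process, the layout measures 6.2 mm × 4.5 mm and contains about 10 million gates. Post-tape-out test results verify the effectiveness of the multichannel array signal processing algorithms and confirm the correctness of the SoC design.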

18.
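Building on a two-level storage scheme with external memory and an internal cache, a hardware architecture for a texture-image-based MPEG-4 ASP@L5 motion compensation circuit is proposed, and its VLSI design is completed. For the motion vector prediction algorithm, the circuit's internal buffer LM2 is optimized while still meeting real-time decoding requirements. For the overlapped block motion compensation algorithm, an effective double-circular replacement buffer structure is proposed. The motion compensation circuit is implemented in VLSI using the TSMC 0.25 μm 1P5M CMOS process; the core area is 1.31 mm × 1.31 mm and the maximum operating frequency is 150 MHz. System simulation shows that, at 120 MHz, the circuit performs real-time motion compensation on texture video streams in ITU-R 601 format that conform to the ASProfile standard.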

19.
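Small-signal model analysis is one of the key and most difficult topics in the course "Analog Electronic Technology". The author developed multimedia courseware using PowerPoint. Through case studies, animations, and EWB simulation, among other means, some attempts were made at improving students' abilities and developing their competence, and good teaching results were achieved. This paper mainly introduces the courseware's system modules, design ideas, and use in practice.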

20.
This paper describes a low-power programmable DSP architecture that targets audio signal processing. The architecture can be characterized as a heterogeneous multiprocessor consisting of small instruction set processors called mini-cores as well as standard DSP and CPU cores that communicate using message passing. The mini-cores are tailored for different classes of filtering algorithms (FIR, IIR, N-LMS, etc.), and in a typical system the communication among processors occurs at the sampling rate only.
The mini-cores are intended as soft macros to be used in the implementation of system-on-chip solutions using a synthesis-based design flow targeting a standard-cell implementation. They are parameterized in word size, memory size, etc. and can be instantiated according to the needs of the application. To give an impression of the size of a mini-core, one of the FIR mini-cores in a prototype design has 16 instructions, a 32-word × 16-bit program memory, a 64-word × 16-bit data memory, and a 25-word × 16-bit coefficient memory.
Results obtained from the design of a prototype chip containing mini-cores for a hearing aid application demonstrate a power consumption that is only 1.5–1.6 times larger than that of a hardwired ASIC and 6–21 times lower than that of current state-of-the-art low-power DSP processors. This is due to (1) the small size of the processors and (2) a smaller instruction count for a given task.

